Restore
The template can restore a previous run directly from the configuration.
The relevant configuration block is in conf/train/default.yml.
ckpt_or_run_path
The ckpt_or_run_path can be either a path to a Lightning checkpoint or a run identifier as understood by the logger.
With W&B as the logger, this identifier is called run_path and has the form entity/project/run_id.
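As an illustration of the entity/project/run_id form, the sketch below splits such a string into its three components. The names `RunPath` and `parse_run_path` are hypothetical helpers introduced here for the example and are not part of the template, and the sample identifier is made up.

```python
from typing import NamedTuple


class RunPath(NamedTuple):
    """Components of a W&B-style run path (hypothetical helper type)."""

    entity: str
    project: str
    run_id: str


def parse_run_path(run_path: str) -> RunPath:
    """Split a string of the form 'entity/project/run_id' into its parts.

    Raises ValueError when the string does not have exactly three
    non-empty components separated by '/'.
    """
    parts = run_path.strip("/").split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Expected 'entity/project/run_id', got {run_path!r}")
    return RunPath(*parts)


# Made-up identifier, used only to show the expected shape.
example = parse_run_path("my-team/my-project/1a2b3c4d")
print(example.entity, example.project, example.run_id)
```

A check like this only validates the shape of the string; it does not tell a run path apart from a relative checkpoint path that happens to have three components.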
Warning
If ckpt_or_run_path points to a checkpoint, that checkpoint must have been saved with
this template, because additional information is attached to the checkpoint to guarantee
a correct restore. This includes the run_path itself and the whole configuration used.
mode
We support 4 different modes for restoring an experiment:
null: no restore happens, and ckpt_or_run_path is ignored.
Use Case
This is the default option; it lets the user train the model from scratch, logging into a new run.
finetune: only the model weights are restored; neither the Trainer state nor the logger run
is restored.
Use Case
As the name suggests, one of the most common use cases is fine-tuning a trained model with a new training regimen while logging into a new run.
hotstart: training continues from the checkpoint with the Trainer state restored, but the logging run is not restored.
A new run is created on the logger dashboard.
Use Case
Run different tests in separate logging runs, all branching from the same trained model.
Restore summary
| | null | finetune | hotstart | continue |
|---|---|---|---|---|
| Model weights | no | yes | yes | yes |
| Trainer state | no | no | yes | yes |
| Logging run | no | no | no | yes |
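The restore behaviour of each mode can also be written down as a small lookup, which may help when reasoning about what a given setting will do. This is only a sketch: `RestorePlan`, `RESTORE_PLANS`, and `restore_plan` are hypothetical names, not the template's implementation. The entries for null, finetune, and hotstart follow the mode descriptions above, while the entry for continue is an assumption inferred from its name (everything restored).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RestorePlan:
    """What gets restored for a given mode (hypothetical helper type)."""

    model_weights: bool
    trainer_state: bool
    logging_run: bool


# Keys are the values of `mode`; None stands for a null mode in the config.
RESTORE_PLANS: Dict[Optional[str], RestorePlan] = {
    None: RestorePlan(model_weights=False, trainer_state=False, logging_run=False),
    "finetune": RestorePlan(model_weights=True, trainer_state=False, logging_run=False),
    "hotstart": RestorePlan(model_weights=True, trainer_state=True, logging_run=False),
    # Assumption: `continue` restores everything, including the logging run.
    "continue": RestorePlan(model_weights=True, trainer_state=True, logging_run=True),
}


def restore_plan(mode: Optional[str]) -> RestorePlan:
    """Return the restore plan for `mode`, rejecting unknown values."""
    if mode not in RESTORE_PLANS:
        known = ", ".join(repr(key) for key in RESTORE_PLANS)
        raise ValueError(f"Unknown restore mode {mode!r}; expected one of: {known}")
    return RESTORE_PLANS[mode]


plan = restore_plan("hotstart")
print(plan.model_weights, plan.trainer_state, plan.logging_run)
```

Keeping the mapping in one place makes it easy to validate a `mode` value early, before any checkpoint is loaded.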