142: Load best model instead of last one #151

Merged
thibo73800 merged 2 commits into master from loading_best on Feb 21, 2022

Johansmm
Contributor

This PR changes how training checkpoints are saved and read. To use it:

  1. Launch the training. New flags complement checkpoint saving:
    a. save_top_k: Number of best checkpoints to keep. Use --save_top_k -1 to keep all checkpoints.
    b. monitor: Metric written into the checkpoint file names. Default: val_loss.
    c. New file name format: "{epoch}-{step}-{" + monitor + ":.4f}".

  2. Select the checkpoint to read with the new checkpoint flag (only defined in mode=eval). Three values are supported:
    a. --checkpoint last (default): Reads the last checkpoint.
    b. --checkpoint *.ckpt: Reads a specific checkpoint.
    c. --checkpoint best: Tries to read the checkpoint with the best monitor value, in two steps:

    i. Read the monitor value directly from the checkpoint file names in the folder.
    ii. Load every checkpoint (torch.load(ckpt)) and extract the monitor value from ckpt['callbacks'][ModelCheckpoint]['current_score'].

    If neither step succeeds, an error is raised.
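The two-step lookup for --checkpoint best can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR. The function names `score_from_name` and `find_best` are hypothetical, the `name=value` file name layout is an assumption about how the framework renders the template, lower values are assumed to be better (as for val_loss), and `load_score` is a hypothetical stand-in for the torch.load read described in step ii.

```python
import re
from typing import Callable, Dict, Iterable, Optional


def score_from_name(name: str, monitor: str = "val_loss") -> Optional[float]:
    """Step i: look for '<monitor>=<number>' in the checkpoint file name."""
    match = re.search(re.escape(monitor) + r"=(-?\d+(?:\.\d+)?)", name)
    return float(match.group(1)) if match else None


def find_best(
    names: Iterable[str],
    monitor: str = "val_loss",
    load_score: Optional[Callable[[str], Optional[float]]] = None,
) -> str:
    """Return the checkpoint name with the lowest monitored value."""
    names = list(names)

    # Step i: try to read the score from every file name.
    scores: Dict[str, float] = {}
    for name in names:
        score = score_from_name(name, monitor)
        if score is not None:
            scores[name] = score

    # Step ii: only if no file name carried a score, ask the
    # (hypothetical) loader for the value stored in each checkpoint.
    if not scores and load_score is not None:
        for name in names:
            score = load_score(name)
            if score is not None:
                scores[name] = score

    if not scores:
        raise FileNotFoundError(f"no checkpoint exposes a '{monitor}' value")

    # Assumption: lower is better, as for a loss.
    return min(scores, key=scores.get)


checkpoints = [
    "epoch=1-step=100-val_loss=0.5000.ckpt",
    "epoch=2-step=200-val_loss=0.2500.ckpt",
]
best = find_best(checkpoints)
```

Passing the loader as a callable keeps the sketch free of a torch dependency. In a real setup it would wrap torch.load and read the score from the callbacks entry quoted in step ii.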

Note:
With the file name format from 1.c, step 2.c.ii is not strictly necessary. It was kept for compatibility, so that checkpoints from earlier projects can still be read during the transition to this feature.
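The note depends on the monitored value ending up in the file name. A small sketch of how the template from 1.c is assembled may make that concrete. The helper name `checkpoint_template` is hypothetical, and plain `str.format` is used only to show where the value lands. The training framework may render the template differently, for example by prefixing each field with its name.

```python
def checkpoint_template(monitor: str = "val_loss") -> str:
    # Same concatenation as in item 1.c of the description.
    return "{epoch}-{step}-{" + monitor + ":.4f}"


template = checkpoint_template("val_loss")

# Illustration only: fill the template with plain str.format.
# The monitored value is rendered with four decimals.
example = template.format(epoch=3, step=1200, val_loss=0.5)
```

Because the score is embedded with a fixed precision, it can be recovered from the file name without loading the checkpoint itself.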

Closes #142

Johansmm added the enhancement and alonet labels on Feb 18, 2022
thibo73800 self-requested a review on February 21, 2022 at 14:55
thibo73800 merged commit 3e967fd into master on Feb 21, 2022
thibo73800 deleted the loading_best branch on March 10, 2022 at 09:53