After completion of Trainer.hyperparameter_search() attribute trainer.state.best_model_checkpoint references the last trained model instead of the best one
#23150
Closed
2 of 4 tasks
fantauzzi opened this issue on May 4, 2023 · 2 comments
Call Trainer.hyperparameter_search(). When it completes, Trainer.state.best_model_checkpoint and the other Trainer.state attributes reference the last model trained in the sequence of models trained by Trainer.hyperparameter_search(), not the best one.
Note: to speed up reproduction of the issue, I have limited the training dataset size in the provided code (line #49); that is why the evaluation metrics at the end of the hyperparameter search are poor.
Expected behavior
After Trainer.hyperparameter_search() completes, Trainer.state.best_model_checkpoint should contain the filename of the checkpoint holding the best model among all the models trained during the hyperparameter search, not the last one; that is, the model trained during the run indicated by the BestRun instance that hyperparameter_search() returns.
Likewise, the other Trainer.state attributes should relate to the same model, e.g. Trainer.state.best_metric, Trainer.state.epoch, and Trainer.state.global_step.
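Until Trainer.state behaves this way, one possible workaround is to read the state of the best run back from disk. The sketch below is not part of transformers. It assumes that each trial writes its checkpoints under `<output_dir>/run-<run_id>/checkpoint-<step>/`, that every checkpoint contains a `trainer_state.json` file, and that checkpoint saving was enabled and the checkpoints were not deleted. The function name `load_best_run_state` is made up for illustration.

```python
import json
import re
from pathlib import Path


def load_best_run_state(output_dir, run_id):
    """Return selected TrainerState fields saved by the run `run_id`.

    Assumption: checkpoints live in <output_dir>/run-<run_id>/checkpoint-<step>/
    and each one holds a trainer_state.json file.
    """
    run_dir = Path(output_dir) / f"run-{run_id}"

    # Collect (step, path) pairs for every checkpoint that has a saved state.
    checkpoints = []
    for path in run_dir.glob("checkpoint-*"):
        match = re.fullmatch(r"checkpoint-(\d+)", path.name)
        if match and (path / "trainer_state.json").is_file():
            checkpoints.append((int(match.group(1)), path))
    if not checkpoints:
        raise FileNotFoundError(f"no checkpoint with a saved state in {run_dir}")

    # The checkpoint with the highest step carries the final state of the run.
    _, latest = max(checkpoints, key=lambda item: item[0])
    with open(latest / "trainer_state.json", encoding="utf-8") as handle:
        state = json.load(handle)

    # Keep only the attributes discussed in this issue.
    wanted = ("best_model_checkpoint", "best_metric", "epoch", "global_step")
    return {key: state.get(key) for key in wanted}
```

With `best_run` being the BestRun returned by hyperparameter_search(), a call such as `load_best_run_state(trainer.args.output_dir, best_run.run_id)` would then return the values recorded for that run, provided the directory layout assumed above matches your setup.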
Hyperparameter search does not play well with the best model indeed. That's not something in our roadmap for fixing, but we are happy to look at any PR!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
transformers version: 4.28.1
Who can help?
@sgugger
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
https://colab.research.google.com/drive/1Ht14ntTQy96-_zO-iVlwvAgkyY8t6vKc?usp=sharing