Add backwards-compatibility logic for model progress tracker #2468
Conversation
Tests
for stat in ("improvement", "increase_batch_size", "learning_rate_reduction"):
    assert f"last_{stat}_epoch" not in new_model_progress
    assert f"last_{stat}_steps" in new_model_progress
    assert (
        new_model_progress[f"last_{stat}_steps"]
        == old_model_progress[f"last_{stat}_epoch"] * old_model_progress["batch_size"]
    )

assert "tune_checkpoint_num" in new_model_progress

assert "vali_metrics" not in new_model_progress
assert "validation_metrics" in new_model_progress

metric = new_model_progress["validation_metrics"]["combined"]["loss"][0]
assert len(metric) == 3
assert metric[-1] == 0.59

# Verify that we don't make changes to already-valid model progress dicts.
# To do so, we modify the batch size value and re-run the upgrade on the
# otherwise-valid `new_model_progress` dict.
new_model_progress["batch_size"] = 1
unchanged_model_progress = upgrade_model_progress(new_model_progress)

for stat in ("improvement", "increase_batch_size", "learning_rate_reduction"):
    assert unchanged_model_progress[f"last_{stat}_steps"] == new_model_progress[f"last_{stat}_steps"]

unchanged_metric = unchanged_model_progress["validation_metrics"]["combined"]["loss"][0]
new_metric = new_model_progress["validation_metrics"]["combined"]["loss"][0]
assert unchanged_metric == new_metric
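Read together, these assertions imply the rough shape of the upgrade: epoch-indexed counters are renamed to step counters (estimated as epochs times the recorded batch size), a `tune_checkpoint_num` field is added, `vali_metrics` is renamed to `validation_metrics`, and re-running the upgrade on an already-upgraded dict is a no-op. A minimal sketch of that logic, assuming a plain-dict progress tracker and using only the field names the test exercises; this is not the actual implementation in this PR, and the metric-tuple conversion is sketched separately below:

def upgrade_model_progress(progress: dict) -> dict:
    """Sketch only: upgrade a legacy epoch-based progress dict in a backwards-compatible way."""
    progress = dict(progress)  # avoid mutating the caller's dict

    batch_size = progress.get("batch_size", 1)
    for stat in ("improvement", "increase_batch_size", "learning_rate_reduction"):
        old_key, new_key = f"last_{stat}_epoch", f"last_{stat}_steps"
        if old_key in progress:
            # Estimate steps from epochs, mirroring the arithmetic the test asserts.
            progress[new_key] = progress.pop(old_key) * batch_size

    # Already present on new-style trackers; default it for old ones.
    progress.setdefault("tune_checkpoint_num", 0)

    if "vali_metrics" in progress:
        progress["validation_metrics"] = progress.pop("vali_metrics")

    return progress

Already-valid dicts fall through every branch untouched, which is what the second half of the test checks.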
I find that it's easier to read a single check for equality on the entire dictionary, wholesale. What do you think?
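For reference, a wholesale comparison might look roughly like this (a sketch with illustrative values only; the real test would need to spell out every field of the expected fixture):

expected_model_progress = {
    "batch_size": 128,  # illustrative values, not the actual fixture
    "tune_checkpoint_num": 0,
    "last_improvement_steps": 25600,
    "last_increase_batch_size_steps": 0,
    "last_learning_rate_reduction_steps": 0,
    "validation_metrics": {"combined": {"loss": [(1, 25600, 0.59)]}},
}

assert new_model_progress == expected_model_progress

On failure, pytest's dict diff then shows exactly which keys diverge, instead of stopping at the first per-key assert.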
Agreed, just need to specify an expected config dict.
Done
Validation metrics are converted to `TrainerMetric` tuples, also with estimated epoch and step values.
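The three-element metric check in the test is consistent with that: each raw metric value appears to be wrapped in a tuple whose last field is the value. A rough sketch of such a conversion, assuming an (epoch, step, value) layout; the field names and the step estimation below are assumptions, not confirmed by this diff:

from typing import List, NamedTuple


class TrainerMetric(NamedTuple):
    # Assumed layout: the test only guarantees three fields with the value last.
    epoch: int
    step: int
    value: float


def upgrade_metric_history(values: List[float], batch_size: int) -> List[TrainerMetric]:
    # Wrap raw floats into TrainerMetric tuples with estimated epoch/step indices,
    # using the same epochs * batch_size step estimate as the counters above.
    return [
        TrainerMetric(epoch=i + 1, step=(i + 1) * batch_size, value=v)
        for i, v in enumerate(values)
    ]

For example, upgrade_metric_history([0.59], batch_size=128)[0][-1] == 0.59, matching the assertion in the test.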