Add backwards-compatibility logic for model progress tracker #2468
Conversation
Tests
for stat in ("improvement", "increase_batch_size", "learning_rate_reduction"):
    assert f"last_{stat}_epoch" not in new_model_progress
    assert f"last_{stat}_steps" in new_model_progress
    assert (
        new_model_progress[f"last_{stat}_steps"]
        == old_model_progress[f"last_{stat}_epoch"] * old_model_progress["batch_size"]
    )

assert "tune_checkpoint_num" in new_model_progress

assert "vali_metrics" not in new_model_progress
assert "validation_metrics" in new_model_progress

metric = new_model_progress["validation_metrics"]["combined"]["loss"][0]
assert len(metric) == 3
assert metric[-1] == 0.59

# Verify that we don't make changes to already-valid model progress dicts.
# To do so, we modify the batch size value and re-run the upgrade on the
# otherwise-valid `new_model_progress` dict.
new_model_progress["batch_size"] = 1
unchanged_model_progress = upgrade_model_progress(new_model_progress)

for stat in ("improvement", "increase_batch_size", "learning_rate_reduction"):
    assert unchanged_model_progress[f"last_{stat}_steps"] == new_model_progress[f"last_{stat}_steps"]

unchanged_metric = unchanged_model_progress["validation_metrics"]["combined"]["loss"][0]
new_metric = new_model_progress["validation_metrics"]["combined"]["loss"][0]
assert unchanged_metric == new_metric
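Read together, these assertions imply the rough shape of the upgrade: epoch-indexed counters are renamed to step counters (estimated as epochs times the recorded batch size), a `tune_checkpoint_num` field is added, `vali_metrics` is renamed to `validation_metrics`, and re-running the upgrade on an already-upgraded dict is a no-op. A minimal sketch of that logic, assuming a plain-dict progress tracker and using only the field names the test exercises; this is not the actual implementation in this PR, and the metric-tuple conversion is sketched separately below:

def upgrade_model_progress(progress: dict) -> dict:
    """Sketch only: upgrade a legacy epoch-based progress dict in a backwards-compatible way."""
    progress = dict(progress)  # avoid mutating the caller's dict

    batch_size = progress.get("batch_size", 1)
    for stat in ("improvement", "increase_batch_size", "learning_rate_reduction"):
        old_key, new_key = f"last_{stat}_epoch", f"last_{stat}_steps"
        if old_key in progress:
            # Estimate steps from epochs, mirroring the arithmetic the test asserts.
            progress[new_key] = progress.pop(old_key) * batch_size

    # Already present on new-style trackers; default it for old ones.
    progress.setdefault("tune_checkpoint_num", 0)

    if "vali_metrics" in progress:
        progress["validation_metrics"] = progress.pop("vali_metrics")

    return progress

Already-valid dicts fall through every branch untouched, which is what the second half of the test checks.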
I find that it's easier to read a single check for equality on the entire dictionary, wholesale. What do you think?
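For reference, a wholesale comparison might look roughly like this (a sketch with illustrative values only; the real test would need to spell out every field of the expected fixture):

expected_model_progress = {
    "batch_size": 128,  # illustrative values, not the actual fixture
    "tune_checkpoint_num": 0,
    "last_improvement_steps": 25600,
    "last_increase_batch_size_steps": 0,
    "last_learning_rate_reduction_steps": 0,
    "validation_metrics": {"combined": {"loss": [(1, 25600, 0.59)]}},
}

assert new_model_progress == expected_model_progress

On failure, pytest's dict diff then shows exactly which keys diverge, instead of stopping at the first per-key assert.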
Agreed, just need to specify an expected config dict.
Done
Validation metrics are converted to `TrainerMetric` tuples, also with estimated epoch and step values.
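The three-element metric check in the test is consistent with that: each raw metric value appears to be wrapped in a tuple whose last field is the value. A rough sketch of such a conversion, assuming an (epoch, step, value) layout; the field names and the step estimation below are assumptions, not confirmed by this diff:

from typing import List, NamedTuple


class TrainerMetric(NamedTuple):
    # Assumed layout: the test only guarantees three fields with the value last.
    epoch: int
    step: int
    value: float


def upgrade_metric_history(values: List[float], batch_size: int) -> List[TrainerMetric]:
    # Wrap raw floats into TrainerMetric tuples with estimated epoch/step indices,
    # using the same epochs * batch_size step estimate as the counters above.
    return [
        TrainerMetric(epoch=i + 1, step=(i + 1) * batch_size, value=v)
        for i, v in enumerate(values)
    ]

For example, upgrade_metric_history([0.59], batch_size=128)[0][-1] == 0.59, matching the assertion in the test.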