GBM DART boosting type incompatible with early stopping #2964

Merged
merged 2 commits into from
Jan 19, 2023

Contributor

@jeffkinnison jeffkinnison commented Jan 19, 2023

Setting boosting_type: dart leads to the following error iff an epoch ends with no model improvement:

File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/trainers/trainer_lightgbm.py", line 456, in check_progress_on_validation
    last_improvement_in_steps = progress_tracker.steps - progress_tracker.best_eval_metric_steps
TypeError: unsupported operand type(s) for -: 'int' and 'NoneType'

This error is caused by an inconsistency in how progress_tracker.best_eval_metric_steps is updated, coupled with DART's incompatibility with the LGBM early stopping callback. LGBM models record the best observed step in model.best_iteration_, but only when the early stopping callback is used; otherwise, best_iteration_ is always None. Since DART does not use early stopping, the latter case seems to apply. The progress_tracker update can then set the step tracker to None and, if the model did not improve, never replace it with Ludwig-level tracking.
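The failure mode can be reproduced in isolation. The following is a minimal sketch, not Ludwig's actual code: `TrackerSketch`, `buggy_update`, and `steps_since_improvement` are hypothetical stand-ins, and the update ordering is an assumption based on the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrackerSketch:
    """Hypothetical stand-in for the trainer's progress tracker."""
    steps: int = 0
    best_eval_metric_steps: Optional[int] = 0


def buggy_update(tracker: TrackerSketch, best_iteration: Optional[int], improved: bool) -> None:
    # Assumed pre-fix ordering: the LGBM-side value is copied unconditionally,
    # so a None from a model trained without early stopping overwrites the step.
    tracker.best_eval_metric_steps = best_iteration
    if improved:
        # Trainer-level tracking only runs when the metric improved.
        tracker.best_eval_metric_steps = tracker.steps


def steps_since_improvement(tracker: TrackerSketch) -> int:
    # Mirrors the subtraction shown in the traceback.
    return tracker.steps - tracker.best_eval_metric_steps


tracker = TrackerSketch(steps=100)

# Evaluation with improvement: the None is replaced, so the check works.
buggy_update(tracker, best_iteration=None, improved=True)
ok = steps_since_improvement(tracker)

# Evaluation without improvement: the None survives and the check fails.
tracker.steps = 200
buggy_update(tracker, best_iteration=None, improved=False)
try:
    steps_since_improvement(tracker)
    failed = False
except TypeError:
    failed = True
```

Under these assumptions the subtraction only breaks on an evaluation where the model did not improve, which matches the "iff an epoch ends with no model improvement" condition.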

This update

  • removes the LGBM early stopping callback from training when using boosting_type: dart
  • reorganizes progress_tracker updates to use LGBM internal tracking if available with a fallback to Trainer-level tracking
  • adds an integration test for DART
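
The first two changes can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `wants_early_stopping` and `resolve_best_step` are hypothetical helper names, and the sketch assumes LGBM's best iteration and the trainer's step counter use the same unit.

```python
from typing import Optional


def wants_early_stopping(boosting_type: str) -> bool:
    # Assumption: DART is the only boosting type that must skip the callback.
    return boosting_type != "dart"


def resolve_best_step(
    best_iteration: Optional[int],
    current_step: int,
    previous_best: int,
    improved: bool,
) -> int:
    # Prefer LGBM's internal tracking when it is available.
    if best_iteration is not None:
        return best_iteration
    # Fallback: trainer-level tracking, which never produces None.
    if improved:
        return current_step
    return previous_best


# DART run: no callback, so best_iteration stays None and the fallback is used.
use_callback = wants_early_stopping("dart")
best_after_no_improvement = resolve_best_step(None, 200, 100, improved=False)
best_after_improvement = resolve_best_step(None, 200, 100, improved=True)
```

With the fallback in place, the tracked best step is always an integer, so the subtraction in check_progress_on_validation stays well defined whether or not the early stopping callback ran.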

Contributor

@justinxzhao justinxzhao left a comment


Thank you for the detailed PR description.

@github-actions

Unit Test Results

6 files (±0), 6 suites (±0), duration 5h 29m 27s (+49m 18s)
3 897 tests (+1): 3 825 passed (+1), 72 skipped (±0), 0 failed (±0)
11 688 runs (+3): 11 472 passed (+3), 216 skipped (±0), 0 failed (±0)

Results for commit 0b0af0c. ± Comparison against base commit 28e3a8e.

@tgaddair tgaddair merged commit cb498c8 into master Jan 19, 2023
@tgaddair tgaddair deleted the dart-no-early-stopping branch January 19, 2023 23:08