Conversation
Thanks for the fix -- great work!!
Can you add some tests with `max_lr != 1`? Also, can you add additional tests to check that `LR < max_lr` at `output[warmup_updates]`?
Tests added!
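For reference, a toy sketch of the kind of checks being requested here (the `linear_warmup` helper below is hypothetical, not ParlAI's actual test code):

```python
# Hypothetical linear warmup, standing in for the real scheduler under test.
def linear_warmup(step, warmup_updates, max_lr):
    return max_lr * min(step, warmup_updates) / warmup_updates

max_lr, warmup_updates = 0.05, 10  # exercise max_lr != 1
output = [linear_warmup(s, warmup_updates, max_lr) for s in range(1, 21)]
# LR stays strictly below max_lr on every warmup step before the last one
assert all(lr < max_lr for lr in output[:warmup_updates - 1])
# and the last warmup step lands exactly on max_lr
assert output[warmup_updates - 1] == max_lr
```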
I noticed that the step counter used by this function (https://github.com/facebookresearch/ParlAI/blob/main/parlai/nn/lr_scheduler.py#L294) starts out at 1. This means the first warmup step is one step ahead of the specified value, and the first step of the regular scheduler is also one step ahead of the specified max_lr value. In this patch I have made it so the last step of the warmup scheduler hits max_lr. An alternative behavior could be that the steps start from 0, the last step of the warmup scheduler is "one step before" max_lr, and the regular scheduler starts out at max_lr. This distinction matters in the case where there are 0 warmup-updates, because then we never actually hit the specified max_lr; rather, we start one step after it.

Edit: it is possible that PyTorch's native behavior is to start the counter at 1, in which case I wouldn't say we should mess with this.

Edit 2: in the current patch, setting warmup-updates=1 has the behavior I'd intuitively expect from warmup-updates=0, warmup-updates=2 gives the behavior I'd expect from warmup-updates=1, etc.
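For concreteness, here is a minimal sketch (not ParlAI's actual code) of linear warmup with a 1-based step counter, matching the behavior described above:

```python
def warmup_lr(step: int, warmup_updates: int, max_lr: float) -> float:
    # step is 1-based, like the counter at lr_scheduler.py#L294
    if warmup_updates > 0 and step <= warmup_updates:
        return max_lr * step / warmup_updates  # step == warmup_updates -> exactly max_lr
    return max_lr  # the regular scheduler then takes over from max_lr

# With warmup_updates=0 the warmup branch is never taken, so the first
# observed LR is whatever the regular scheduler produces after one step.
print([warmup_lr(s, 4, 1.0) for s in range(1, 7)])
# -> [0.25, 0.5, 0.75, 1.0, 1.0, 1.0]
```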
Generally I prefer backwards compatibility even if it’s esoteric.
…onger defaulting to just 1
@stephenroller Do you think my adding the extra step in
great job @juderoque!
I defer to Emily. She's thought way more about this.
Bump
(go ahead and merge main into this to fix tests, please)
Should we reconsider this?
This PR has not had activity in 30 days. Closing due to staleness.
Patch description
Follow-up to #4384: un-relaxed the previously relaxed conditions in the tests and modified the scheduler logic to satisfy the stricter conditions.
- Changed `self._number_training_updates < self.warmup_updates` --> `self._number_training_updates <= self.warmup_updates` to hit the exact `max_lr` (see the sketch after this list).
- Modified `parlai/nn/lr_scheduler.py` and `parlai/scripts/train.py` to allow the missing step to be taken.
- Modified `test_lr_schedulers.py` to step from 1 -> `total_steps` rather than 0 -> `total_steps - 1` to match the behavior in `lr_scheduler.py`.
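A hedged sketch of where the changed check sits (the class and method structure below approximate `parlai/nn/lr_scheduler.py` rather than quote it):

```python
class SchedulerSketch:
    def __init__(self, warmup_updates):
        self.warmup_updates = warmup_updates
        self.warmup_scheduler = object() if warmup_updates > 0 else None
        self._number_training_updates = 0

    def _is_lr_warming_up(self):
        # `<=` (previously `<`) lets the final warmup step land exactly on
        # max_lr instead of handing off to the regular scheduler one step early.
        return (
            self.warmup_scheduler is not None
            and self._number_training_updates <= self.warmup_updates
        )

s = SchedulerSketch(warmup_updates=4)
s._number_training_updates = 4
assert s._is_lr_warming_up()  # the step that reaches max_lr still counts as warmup
```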
Testing steps
Running `test_lr_schedulers.py` should now show 100 steps instead of 99.
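Illustrative only (not the actual `test_lr_schedulers.py` code), with a toy warmup function standing in for the scheduler under test:

```python
total_steps, warmup_updates, max_lr = 100, 4, 1.0

def lr(s):  # toy stand-in for the scheduler under test
    return max_lr * min(s, warmup_updates) / warmup_updates

# Step from 1 -> total_steps rather than 0 -> total_steps - 1,
# matching the 1-based counter in lr_scheduler.py.
lrs = [lr(step) for step in range(1, total_steps + 1)]
assert len(lrs) == total_steps
assert lrs[-1] == max_lr  # the final step actually reaches max_lr
```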
Other information
Still not sure if this is the desired behavior; is there an implicit step (0) taken?