
Lrsched missing step #4392

Closed
wants to merge 8 commits into from

Conversation

@juderoque (Contributor) commented Mar 3, 2022

Patch description
Follow-up to #4384: un-relaxed the relaxed conditions in the tests and modified the scheduler logic to fit the un-relaxed conditions.

  • Changed self._number_training_updates < self.warmup_updates --> self._number_training_updates <= self.warmup_updates so that the warmup schedule hits the exact max_lr (see the sketch after this list).
    • In this case the lr doesn't anneal to 0, due to a missing step.
  • Modified the stopping conditions in parlai/nn/lr_scheduler.py and parlai/scripts/train.py to allow the missing step to be taken.
  • Modified test_lr_schedulers.py to step from 1 -> total_steps rather than 0 -> total_steps - 1, to match the behavior in lr_scheduler.py.
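
A minimal sketch of the warmup boundary change (hypothetical names, not the actual ParlAI scheduler class; it only assumes a linear warmup evaluated on a step counter that starts at 1):

```python
# Hypothetical linear-warmup helper, for illustration only.
def warmup_lr(step: int, warmup_updates: int, max_lr: float) -> float:
    """Ramp linearly toward max_lr over warmup_updates steps."""
    return max_lr * min(step, warmup_updates) / warmup_updates

warmup_updates, max_lr = 10, 1e-5

# Steps handled by the warmup schedule under the old (<) vs. new (<=) condition:
old_warmup_steps = [s for s in range(1, warmup_updates + 1) if s < warmup_updates]
new_warmup_steps = [s for s in range(1, warmup_updates + 1) if s <= warmup_updates]

assert warmup_lr(old_warmup_steps[-1], warmup_updates, max_lr) < max_lr   # peak LR skipped
assert warmup_lr(new_warmup_steps[-1], warmup_updates, max_lr) == max_lr  # exact max_lr is hit
```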

Testing steps

parlai tm -t convai2 -m transformer/generator --lr-scheduler linear --warmup-updates 10 -lstep 1 -vstep 10000000 --max-lr-steps 100 --skip-generation True --warmup-rate 0.01 -lr 0.00001 --dict-file /tmp/test123.dict

Running this should now show 100 steps instead of 99.
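
The exact stopping conditions live in parlai/scripts/train.py and parlai/nn/lr_scheduler.py; the toy loop below (hypothetical logic, not the real train loop) only illustrates how checking the counter before taking the update drops the final scheduler step:

```python
# Toy loop, for illustration only: the real conditions are in ParlAI's train loop.
def count_scheduler_steps(max_lr_steps: int, old_behavior: bool) -> int:
    steps_taken = 0
    step = 1  # the step counter starts at 1 (see the discussion below)
    while True:
        if old_behavior and step >= max_lr_steps:
            break             # old: stop before the final update is taken
        steps_taken += 1      # take one training update / scheduler step
        if step >= max_lr_steps:
            break             # new: stop only after the final update
        step += 1
    return steps_taken

print(count_scheduler_steps(100, old_behavior=True))   # 99
print(count_scheduler_steps(100, old_behavior=False))  # 100
```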

Other information
I'm still not sure if this is the desired behavior: is there an implicit step (0) taken?

@emilydinan (Contributor) left a comment

Thanks for the fix -- great work!!

Can you add some tests with max_lr != 1? Also, can you add additional tests to check that LR < max_lr at output[warmup_updates]?

@juderoque (Contributor, Author)

Tests added!

@juderoque (Contributor, Author) commented Mar 3, 2022

I noticed that the step counter used by this function (https://github.com/facebookresearch/ParlAI/blob/main/parlai/nn/lr_scheduler.py#L294) starts out at 1. This means the first step of the warmup schedule is one step ahead of the specified value, and the first step of the regular scheduler is also one step ahead of the specified max_lr value. In this patch I have made it so the last step of the warmup scheduler hits max_lr. An alternative behavior would be for the steps to start from 0, so that the last step of the warmup scheduler is "one step before" max_lr and the regular scheduler starts out at max_lr. This distinction matters in the case where there are 0 warmup-updates, because there we never actually hit the specified max_lr but rather start one step after it.
Thoughts @stephenroller @emilydinan ?

Edit: it is possible that PyTorch's native behavior is to start the counter at 1, in which case I wouldn't suggest we mess with this.

Edit 2: in the current patch, setting warmup-updates=1 has the behavior I'd intuitively expect from warmup-updates=0, warmup-updates=2 gives the behavior I'd expect from warmup-updates=1, etc.
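
A toy linear-decay lambda (not the actual ParlAI scheduler) showing the two conventions: with a 0-based counter the main schedule would start exactly at max_lr, while with the current 1-based counter its first value is already one step below it.

```python
# Hypothetical linear-decay schedule, for illustration only.
max_lr, total_steps = 1e-5, 100

def linear_decay(step: int) -> float:
    """Anneal linearly from max_lr at step 0 down to 0 at step total_steps."""
    return max_lr * max(total_steps - step, 0) / total_steps

assert linear_decay(0) == max_lr  # 0-based counter: starts exactly at max_lr
assert linear_decay(1) < max_lr   # 1-based counter: first value is one step past the peak
```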

@stephenroller (Contributor)

Generally I prefer backwards compatibility even if it’s esoteric.

@juderoque requested a review from emilydinan March 4, 2022 01:42
@juderoque (Contributor, Author) commented Mar 4, 2022

> Generally I prefer backwards compatibility even if it’s esoteric.

@stephenroller Do you think my adding the extra step in train_model and lr_schedulers ruins backwards compatibility?

@emilydinan (Contributor) left a comment

great job @juderoque!

@stephenroller (Contributor)

I defer to Emily. She's thought way more about this

@stephenroller (Contributor)

Bump

@stephenroller (Contributor)

(go ahead and merge main into this to fix tests, please)

@stephenroller (Contributor)

Should we reconsider this?

@github-actions bot commented May 3, 2022

This PR has not had activity in 30 days. Closing due to staleness.

github-actions bot added the stale label May 3, 2022
github-actions bot closed this May 10, 2022