
PCGrad with LR schedules, resume from checkpoint with LR schedules #145

Merged · 27 commits · Sep 22, 2022

Conversation

@erwulff (Collaborator) commented Sep 20, 2022

The legacy optimizer, which PCGrad is based on, does not increment the optimizer's `iterations` attribute when `optimizer.apply_gradients()` is called, so Keras LearningRateSchedules do not work properly. This PR fixes that, making it possible to use LR schedules with PCGrad and to resume from checkpoints while keeping the schedule in step.
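To illustrate the failure mode, here is a minimal framework-free sketch (hypothetical class names, no TensorFlow dependency): a Keras-style schedule computes the learning rate as a pure function of the optimizer's step counter, so if `apply_gradients()` never increments `iterations`, the schedule stays frozen at step 0.

```python
def exponential_decay(step, initial_lr=0.1, decay_rate=0.5, decay_steps=10):
    """Keras-style schedule: the LR is a pure function of the step counter."""
    return initial_lr * decay_rate ** (step / decay_steps)

class BrokenOptimizer:
    """Mimics the legacy-optimizer bug: apply_gradients never bumps iterations."""
    def __init__(self, schedule):
        self.schedule = schedule
        self.iterations = 0

    def apply_gradients(self):
        # The schedule always sees step 0, so the LR never decays.
        return self.schedule(self.iterations)

class FixedOptimizer(BrokenOptimizer):
    """The fix: increment the step counter on every apply_gradients call."""
    def apply_gradients(self):
        lr = self.schedule(self.iterations)
        self.iterations += 1  # keeps the schedule (and checkpoints) in sync
        return lr

broken = BrokenOptimizer(exponential_decay)
fixed = FixedOptimizer(exponential_decay)
broken_lrs = [broken.apply_gradients() for _ in range(20)]
fixed_lrs = [fixed.apply_gradients() for _ in range(20)]
# broken_lrs stays constant at the initial LR; fixed_lrs decays as intended
```

The actual PR patches PCGrad's `apply_gradients()` analogously, so the real Keras schedule objects see an advancing step count.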

erwulff and others added 27 commits July 9, 2021 11:11
Merge new commits from jpata:master
Merge from jpata/particleflow master
Merge jpata/master into master
Merge latest developments
merge jpata/particleflow master
@erwulff erwulff marked this pull request as ready for review September 21, 2022 07:30
@erwulff (Collaborator, Author) commented Sep 21, 2022

  • Tested that LR schedules continue where they left off when resuming from saved weights.
  • Currently running a training to check that performance is not affected by the modification to PCGrad.
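The first bullet follows directly from the fix: because the LR is a pure function of the step counter, restoring `iterations` from a checkpoint resumes the schedule exactly where it stopped. A hypothetical sketch (same toy schedule as above, not the project's actual checkpoint code):

```python
def exponential_decay(step, initial_lr=0.1, decay_rate=0.5, decay_steps=10):
    """Toy Keras-style schedule: the LR depends only on the step counter."""
    return initial_lr * decay_rate ** (step / decay_steps)

# "Train" for 15 steps, then checkpoint the step counter.
iterations = 0
for _ in range(15):
    lr = exponential_decay(iterations)
    iterations += 1
checkpoint = {"iterations": iterations}

# A fresh run that restores the counter continues the same decay curve,
# whereas a run that resets it to 0 would jump back to the initial LR.
restored = checkpoint["iterations"]
resumed_lr = exponential_decay(restored)
```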

@jpata (Owner) commented Sep 21, 2022

Nice, this looks like a good fix. Once you confirm it works, we can merge it!

@erwulff (Collaborator, Author) commented Sep 22, 2022

I confirm it works. Training with PCGrad after this modification gives the expected results. Ready for merge.

@jpata jpata merged commit fc0e482 into jpata:main Sep 22, 2022
jpata pushed a commit that referenced this pull request Sep 15, 2023

* feat: PCGrad now works with keras LearningRateSchedule

* feat: OneCycle LR and mom scheduler supports resuming from checkpoint

* chore: add jobid to pipeline train sbatch script

Former-commit-id: fc0e482
2 participants