Disable saving checkpoints if not trained #4372

Merged
merged 8 commits into master from bugfix/mc_rep_checkpoint
Nov 3, 2020

rohitgr7
Contributor

@rohitgr7 rohitgr7 commented Oct 26, 2020

What does this PR do?

Fixes #4176
Follow-up to #4291. The trainer still saves a checkpoint even when no training has taken place, which it should not do. One test currently fails because it expects new checkpoints to be created.
cc @tchaton
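
For illustration, the intended behavior can be sketched with a toy trainer. This is a minimal sketch under stated assumptions, not the actual PyTorch Lightning code: `ToyTrainer`, `restore`, `fit`, and `run_demo` are hypothetical names, and the "checkpoint" is just a text file. It assumes the scenario where a run is resumed from a checkpoint that has already reached `max_epochs`, so no epoch runs and no new checkpoint should be written.

```python
import os
import tempfile


class ToyTrainer:
    """Hypothetical stand-in for a trainer; not the PyTorch Lightning API."""

    def __init__(self, ckpt_dir, max_epochs):
        self.ckpt_dir = ckpt_dir
        self.max_epochs = max_epochs
        self.current_epoch = 0

    def restore(self, epoch):
        # Pretend a checkpoint was loaded that had already reached `epoch`.
        self.current_epoch = epoch

    def fit(self):
        trained = False
        while self.current_epoch < self.max_epochs:
            self.current_epoch += 1  # stands in for one epoch of training
            trained = True
        # Guard: write a checkpoint only if at least one epoch actually ran.
        if trained:
            name = f"epoch={self.current_epoch}.ckpt"
            with open(os.path.join(self.ckpt_dir, name), "w") as f:
                f.write(str(self.current_epoch))
        return trained


def run_demo():
    """Train once, then resume with nothing left to train."""
    with tempfile.TemporaryDirectory() as ckpt_dir:
        first = ToyTrainer(ckpt_dir, max_epochs=2)
        first.fit()
        after_first = sorted(os.listdir(ckpt_dir))

        resumed = ToyTrainer(ckpt_dir, max_epochs=2)
        resumed.restore(epoch=2)  # already at max_epochs
        resumed_trained = resumed.fit()
        after_resume = sorted(os.listdir(ckpt_dir))
    return after_first, after_resume, resumed_trained
```

Without the `if trained:` guard, the resumed run in this sketch would write a checkpoint again even though it did no work, which is the kind of behavior the description above refers to.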

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@mergify mergify bot requested a review from a team October 26, 2020 14:03
@rohitgr7 rohitgr7 changed the title from "Disable saving checkpoints if not trained" to "Disable saving checkpoints if not trained [ci skip]" Oct 26, 2020
@SeanNaren SeanNaren added the bug (Something isn't working) and checkpointing (Related to checkpointing) labels Oct 26, 2020
@rohitgr7 rohitgr7 added this to the 1.0.x milestone Oct 26, 2020
@edenlightning
Contributor

@tchaton plz fix tests :)

@codecov

codecov bot commented Oct 30, 2020

Codecov Report

Merging #4372 into master will decrease coverage by 0%.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #4372   +/-   ##
======================================
- Coverage      92%     92%   -0%     
======================================
  Files         116     116           
  Lines        8700    8699    -1     
======================================
- Hits         8047    8043    -4     
- Misses        653     656    +3     

@rohitgr7 rohitgr7 force-pushed the bugfix/mc_rep_checkpoint branch from 8e3628f to 5a41048 Compare October 30, 2020 21:46
@rohitgr7 rohitgr7 marked this pull request as draft October 31, 2020 12:00
@rohitgr7 rohitgr7 marked this pull request as ready for review October 31, 2020 13:02
@rohitgr7
Contributor Author

@tchaton need your review.

@rohitgr7 rohitgr7 changed the title from "Disable saving checkpoints if not trained [ci skip]" to "Disable saving checkpoints if not trained" Oct 31, 2020
Contributor

@tchaton tchaton left a comment

Great catch there! Thanks for your contributions.

@rohitgr7 rohitgr7 merged commit ad2556b into master Nov 3, 2020
@rohitgr7 rohitgr7 deleted the bugfix/mc_rep_checkpoint branch November 3, 2020 06:08
Borda pushed a commit that referenced this pull request Nov 3, 2020
* Disable saving checkpoints if not trained

* chlog

* update test

* fix

Co-authored-by: chaton <thomas@grid.ai>
(cherry picked from commit ad2556b)
Borda pushed a commit that referenced this pull request Nov 4, 2020
* Disable saving checkpoints if not trained

* chlog

* update test

* fix

Co-authored-by: chaton <thomas@grid.ai>
(cherry picked from commit ad2556b)
rohitgr7 added a commit that referenced this pull request Nov 21, 2020
* Disable saving checkpoints if not trained

* chlog

* update test

* fix

Co-authored-by: chaton <thomas@grid.ai>
Labels
bug (Something isn't working), checkpointing (Related to checkpointing)
Development

Successfully merging this pull request may close these issues.

Issue with epoch count with repeated save/restore
7 participants