This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Atomic saves #2760

Merged: 4 commits, Jun 22, 2020

Conversation

stephenroller
Contributor

Patch description
When working with very large model files, we have to worry about the cluster scheduler preempting our jobs in the middle of a save, which can leave a truncated or corrupt checkpoint on disk. This change makes saving model files atomic: the data is first written to a temporary file, and that file is then renamed to the target path.
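For readers unfamiliar with the pattern, here is a minimal sketch of the write-then-rename approach described above. It is not the code from this PR: the function name `atomic_save`, the `.tmp` suffix, and the use of raw bytes instead of a serialized model are assumptions made for illustration. It relies on `os.replace`, which swaps the file in atomically as long as the temporary file and the target are on the same filesystem.

```python
import os
import tempfile


def atomic_save(data: bytes, path: str) -> None:
    """Hypothetical helper: write `data` to `path` without exposing a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            # Push the bytes to disk before the rename makes them visible.
            os.fsync(f.fileno())
        # Atomically replace the target; an existing file at `path` is overwritten.
        os.replace(tmp_path, path)
    except BaseException:
        # If anything failed before the rename, do not leave the temp file behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the job is killed before `os.replace` runs, the previous file at `path` is left untouched, and at worst a stray `.tmp` file remains in the directory.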

Needs to have some spurious changes stripped from it (I was trying to make saving atomic).

Testing steps
Manual testing and CI.

parlai/utils/torch.py (outdated review thread, resolved)
@stephenroller stephenroller marked this pull request as ready for review June 20, 2020 23:00
Contributor

@dianaglzrico dianaglzrico left a comment

neat 🤩

@stephenroller stephenroller merged commit b7fb034 into master Jun 22, 2020
@stephenroller stephenroller deleted the fastsave branch June 22, 2020 17:16

4 participants