torch checkpoint creation should use storage class methods #126

Merged: 1 commit merged into argonne-lcf:main from the bugfix/torch_checkpoint branch on Dec 8, 2023

@krehm (Contributor) commented Dec 8, 2023

Checkpoint folders and files should be created relative to the storage_root of the storage class used for the run. The code in torch_framework.py that creates the checkpoint_folder and the checkpoint files is hard-coded to use POSIX calls, which may not be appropriate for future storage classes, and the checkpoint_folder it creates is not relative to the storage_root. This PR changes the checkpoint() method to use storage class methods instead.
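For readers without the diff at hand, the idea can be sketched roughly as below. This is a minimal illustration, not the code merged in this PR: the `Storage` interface, its `get_uri`, `create_node`, and `put_data` methods, the `LocalStorage` backend, and the checkpoint file naming are all hypothetical names chosen for the example. The point is only that the checkpoint routine never calls `os.makedirs` or `open` directly, so a non-POSIX backend could be swapped in, and that every path is resolved against `storage_root`.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class Storage(ABC):
    """Hypothetical storage interface (an assumption, not the project's class)."""

    def __init__(self, storage_root: str) -> None:
        self.storage_root = storage_root

    @abstractmethod
    def get_uri(self, name: str) -> str:
        """Resolve a name relative to storage_root."""

    @abstractmethod
    def create_node(self, name: str, exist_ok: bool = True) -> None:
        """Create a folder-like node under storage_root."""

    @abstractmethod
    def put_data(self, name: str, data: bytes) -> None:
        """Write bytes to a file-like node under storage_root."""


class LocalStorage(Storage):
    """POSIX-backed example; the POSIX calls live here and only here."""

    def get_uri(self, name: str) -> str:
        return os.path.join(self.storage_root, name)

    def create_node(self, name: str, exist_ok: bool = True) -> None:
        os.makedirs(self.get_uri(name), exist_ok=exist_ok)

    def put_data(self, name: str, data: bytes) -> None:
        with open(self.get_uri(name), "wb") as f:
            f.write(data)


def checkpoint(storage: Storage, checkpoint_folder: str,
               epoch: int, step: int, payload: bytes) -> str:
    """Write one checkpoint using only storage class methods.

    The file name pattern is illustrative, not the project's convention.
    """
    storage.create_node(checkpoint_folder, exist_ok=True)
    name = f"{checkpoint_folder}/model-{epoch}-{step}.bin"
    storage.put_data(name, payload)
    return storage.get_uri(name)


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    print(checkpoint(LocalStorage(root), "checkpoints", 1, 10, b"weights"))
```

With this shape, supporting another backend would mean adding one more `Storage` subclass while `checkpoint()` stays unchanged.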

@zhenghh04 zhenghh04 merged commit 2e324cf into argonne-lcf:main Dec 8, 2023
@krehm krehm deleted the bugfix/torch_checkpoint branch December 8, 2023 21:05