slurm_scheduler, dir_workspace: add isolated workspaces for Slurm #416
Codecov Report
@@            Coverage Diff            @@
##             main     #416    +/-   ##
========================================
+ Coverage   94.20%   94.28%   +0.08%
========================================
  Files          66       67      +1
  Lines        3690     3761     +71
========================================
+ Hits         3476     3546     +70
- Misses        214      215      +1
f91cfbf to 9b60e81
@d4l3k has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@d4l3k has updated the pull request. You must reimport the pull request before landing.
@d4l3k has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary: This adds a new `DirWorkspace`, which copies the current workspace to the job directory for code isolation, and integrates it with the `slurm` scheduler via the `job_dir` runopt. The job dir must not exist and will be created. The CWD will be located in that job_dir.
* `.torchxignore` is used for excluding files from the workspace
* `.torchxslurmjobdirs` is used to track where job directories, and thus logs, are located
Pull Request resolved: #416
Test Plan: Slurm integ tests + unit tests
Reviewed By: kiukchung
Differential Revision: D34801126
Pulled By: d4l3k
fbshipit-source-id: 7423897d4a372f524230d08bc681493c112ce383
This pull request was exported from Phabricator. Differential Revision: D34801126
This adds a new `DirWorkspace` which will copy the current workspace to the directory for code isolation purposes and integrates it with the `slurm` scheduler via the `job_dir` runopt. The job dir must not exist and will be created. The CWD will be located in that job_dir.
* `.torchxignore` is used for excluding files from the workspace
* `.torchxslurmjobdirs` is used to track where job directories and thus logs are located
Test plan:
Slurm integ tests + unit tests
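For readers who want a feel for the behavior described above, here is a minimal standalone sketch using only the Python standard library. It is not the code from this PR and does not use the TorchX API. The function names (`read_ignore_patterns`, `copy_workspace`, `record_job_dir`), the glob-style matching of `.torchxignore` entries, and the `job_id = job_dir` line format for `.torchxslurmjobdirs` are all assumptions made for illustration.

```python
import fnmatch
import os
import shutil
from typing import List


def read_ignore_patterns(workspace: str, ignore_file: str = ".torchxignore") -> List[str]:
    """Read ignore patterns, skipping blank lines and '#' comments.

    Assumption: one glob pattern per line. The real file format may differ.
    """
    path = os.path.join(workspace, ignore_file)
    if not os.path.exists(path):
        return []
    with open(path) as f:
        lines = [line.strip() for line in f]
    return [line for line in lines if line and not line.startswith("#")]


def copy_workspace(workspace: str, job_dir: str) -> None:
    """Copy `workspace` into a fresh `job_dir`, skipping ignored paths."""
    patterns = read_ignore_patterns(workspace)

    # The job dir must not exist: makedirs raises FileExistsError if it does.
    os.makedirs(job_dir)

    for root, dirs, files in os.walk(workspace):
        rel_root = os.path.relpath(root, workspace)

        def ignored(name: str) -> bool:
            rel = os.path.normpath(os.path.join(rel_root, name))
            return any(fnmatch.fnmatch(rel, p) for p in patterns)

        # Prune ignored directories in place so os.walk does not descend into them.
        dirs[:] = [d for d in dirs if not ignored(d)]

        for name in files:
            if ignored(name):
                continue
            rel = os.path.normpath(os.path.join(rel_root, name))
            dst = os.path.join(job_dir, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(os.path.join(root, name), dst)


def record_job_dir(job_id: str, job_dir: str, tracking_file: str = ".torchxslurmjobdirs") -> None:
    """Append a job id to job dir mapping so logs can be located later.

    Assumption: one `job_id = job_dir` entry per line.
    """
    with open(tracking_file, "a") as f:
        f.write(f"{job_id} = {job_dir}\n")
```

Refusing to reuse an existing directory keeps each job's copy of the code isolated from later edits and from other jobs. In this sketch `job_dir` should live outside the workspace being copied, otherwise the walk would pick up the copy itself.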