v0.7.10
Interactive vs Batch Jobs
When the interactive flag is true, we launch via the srun command. Otherwise, we pipe the bash script into a file, and launch using sbatch.
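The choice between the two launch paths can be sketched roughly as below. This is a minimal illustration, not the actual jaynes implementation: `build_launch_command` and the default `script_path` are hypothetical names, and only the srun/sbatch split itself comes from the text above.

```python
import shlex


def build_launch_command(script: str, interactive: bool,
                         script_path: str = "jaynes_launch.sh") -> str:
    """Return a shell command that launches `script` on Slurm.

    Hypothetical helper for illustration only; not part of the jaynes API.
    """
    if interactive:
        # Interactive: run the script directly under srun so stdout streams back.
        return f"srun bash -c {shlex.quote(script)}"
    # Batch: write the script to a file first, then submit that file with sbatch.
    write_file = f"printf '%s\\n' {shlex.quote(script)} > {shlex.quote(script_path)}"
    return f"{write_file} && sbatch {shlex.quote(script_path)}"


if __name__ == "__main__":
    print(build_launch_command("echo hello", interactive=True))
    print(build_launch_command("echo hello", interactive=False))
```

The helper only builds the command string, so it can be inspected on a machine without Slurm installed.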
Launching Multiple Jobs (Interactive)
Just call jaynes.run with multiple copies of the function:
#! ./multi_launch.py
import jaynes
from launch_entry import train_fn

if __name__ == "__main__":
    jaynes.config(verbose=False)

    for i in range(3):
        jaynes.run(train_fn, seed=i * 100)

    jaynes.listen(200)
The output should be 3 streams of stdout pipe-back, interleaved because the jobs run in parallel:
/Users/ge/opt/anaconda3/envs/plan2vec/bin/python /Users/ge/mit/jaynes-starter-kit/04_slurm_configuration/launch_entry.py
Jaynes pipe-back is now listening...
Running on login-node
Running inside worker
Running on login-node
Running on login-node
Running on login-node
Running inside worker
[seed: 200] See real-time pipe-back from the server:
[seed: 200] step: 0
[seed: 200] step: 1
Running inside worker
[seed: 200] step: 2
Running inside worker
[seed: 200] Finished!
[seed: 300] See real-time pipe-back from the server:
[seed: 300] step: 0
[seed: 300] step: 1
[seed: 300] step: 2
[seed: 100] See real-time pipe-back from the server:
[seed: 100] step: 0
[seed: 100] step: 1
[seed: 100] step: 2
[seed: 300] Finished!
[seed: 100] Finished!
[seed: 0] See real-time pipe-back from the server:
[seed: 0] step: 0
[seed: 0] step: 1
[seed: 0] step: 2
[seed: 0] Finished!
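Because the streams are interleaved, it can help to regroup a saved copy of the output by seed. The sketch below assumes the `[seed: N]` prefix shown in the output above; `split_by_seed` is a hypothetical helper, not something jaynes provides.

```python
import re
from collections import defaultdict

# Matches lines such as "[seed: 200] step: 0".
SEED_LINE = re.compile(r"^\[seed: (\d+)\] (.*)$")


def split_by_seed(lines):
    """Group '[seed: N] message' lines by N, keeping each stream's order."""
    streams = defaultdict(list)
    for line in lines:
        match = SEED_LINE.match(line.strip())
        if match:  # skip lines without a seed prefix, e.g. "Running inside worker"
            streams[int(match.group(1))].append(match.group(2))
    return dict(streams)


if __name__ == "__main__":
    combined = [
        "Running inside worker",
        "[seed: 200] step: 0",
        "[seed: 100] step: 0",
        "[seed: 200] step: 1",
        "[seed: 100] Finished!",
    ]
    for seed, messages in sorted(split_by_seed(combined).items()):
        print(seed, messages)
```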
Launching Sequential Jobs with SBatch
To submit a sequence of jobs with sbatch,
- turn off the interactive mode by setting it to false.
- specify n_seq_jobs to be > 1 (default: null).
- make sure you set a unique job name for each sequence, because otherwise all of your sbatch calls share one name and will be sequentially ordered.
For example, .jaynes.yml may look like:
- !runners.Slurm &slurm
  envs: >-
    LC_CTYPE=en_US.UTF-8 LANG=en_US.UTF-8 LANGUAGE=en_US
  startup: >-
    source /etc/profile.d/modules.sh
    source $HOME/.bashrc
  interactive: false
  n_seq_jobs: 3
Then, just call jaynes.run(train_fn) once per job name:
#! ./seq_jobs_launch.py
import jaynes
from launch_entry import train_fn

if __name__ == "__main__":
    for index in range(10):
        jaynes.config(verbose=False, launch=dict(job_name=f"unique-job-{index}"))
        jaynes.run(train_fn)
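For the first job name, this runs sbatch --job-name unique-job-0 -d singleton ... n_seq_jobs=3 times, and likewise for each of the other job names. With -d singleton, Slurm starts a job only after earlier jobs with the same name and user have finished, so the three submissions run one after another.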
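The repeated submission can be pictured with the sketch below. It is an illustration under stated assumptions, not the jaynes implementation: `sequential_sbatch_commands` and the `launch.sh` script path are hypothetical, while `--job-name` and `-d singleton` are the sbatch options quoted above.

```python
import shlex


def sequential_sbatch_commands(job_name: str, n_seq_jobs: int, script_path: str):
    """Build the sbatch commands for one chain of sequential jobs.

    Hypothetical helper for illustration only; not part of the jaynes API.
    """
    command = (
        f"sbatch --job-name {shlex.quote(job_name)} "
        f"-d singleton {shlex.quote(script_path)}"
    )
    # The same command is submitted n_seq_jobs times. Every submission shares
    # one job name, so the singleton dependency lets only one run at a time.
    return [command] * n_seq_jobs


if __name__ == "__main__":
    for line in sequential_sbatch_commands("unique-job-0", 3, "launch.sh"):
        print(line)
```

Chains submitted under different job names do not wait on each other, which is why each call in the launch script gets its own name.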