Running on Myriad cluster
Asif Tamuri edited this page Dec 19, 2024
The commands below should be executed on the login node of the cluster
- Download latest version for Linux AMD64
wget https://github.com/git-lfs/git-lfs/releases/download/v3.6.0/git-lfs-linux-amd64-v3.6.0.tar.gz
- Extract
tar xvf git-lfs-linux-amd64-v3.6.0.tar.gz
- Go into directory
cd git-lfs-3.6.0
- Make the directory for local binary installation
mkdir -p ~/.local/bin/
- This directory needs to be on your PATH (it should be already, but just in case)
export PATH="$HOME/.local/bin:$PATH"
- Run the installation script
./install.sh --local
- Check it worked
git-lfs --version
(Adapted from git-lfs-install gist)
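Running the export line from the PATH step more than once (for example from a shell start-up file) adds the same entry repeatedly. A minimal sketch of an idempotent alternative follows; `path_prepend` is a hypothetical helper name, not part of git-lfs or the cluster setup.

```shell
# Prepend a directory to PATH only if it is not already listed.
# path_prepend is a hypothetical helper name used for illustration.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present: leave PATH unchanged
        *) PATH="$1:$PATH" ;;      # otherwise put it first
    esac
    export PATH
}

path_prepend "$HOME/.local/bin"
```

Calling it a second time with the same directory leaves PATH unchanged.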
mkdir ~/thanzi
mkdir -p ~/Scratch/thanzi/TLOmodel-outputs
cd ~/thanzi
git clone https://github.com/UCL/TLOmodel.git
cd TLOmodel
- Check resource files have been downloaded properly (i.e. git-lfs worked)
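The step above does not give a command. One possible check, offered as a sketch: when Git LFS content has not been fetched, each tracked file is left as a small text pointer whose first line begins with `version https://git-lfs.github.com/spec/v1`. The helpers `is_lfs_pointer` and `list_lfs_pointers` are hypothetical names, and the `resources` directory name is an assumption about the repository layout.

```shell
# Succeed if FILE is still a Git LFS pointer, i.e. its real content was not fetched.
# Compares the first 42 bytes with the pointer signature (NUL bytes are dropped
# so that binary files do not trigger shell warnings).
is_lfs_pointer() {
    [ "$(head -c 42 "$1" 2>/dev/null | tr -d '\0')" = "version https://git-lfs.github.com/spec/v1" ]
}

# Print every file under DIR that is still a pointer; no output means all were fetched.
list_lfs_pointers() {
    find "$1" -type f | while IFS= read -r f; do
        if is_lfs_pointer "$f"; then printf '%s\n' "$f"; fi
    done
}

# Run from inside the TLOmodel checkout ("resources" is an assumed directory name).
if [ -d resources ]; then list_lfs_pointers resources; fi
```

If any paths are printed, running `git lfs pull` inside the checkout should fetch the real files.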
- Load the Python module for the cluster
module load python3/3.11
- Outside of the TLOmodel source code directory, create the virtual environment
cd ~/thanzi
python -m venv venv-tlo
- Activate the virtual environment
source ~/thanzi/venv-tlo/bin/activate
- Install the TLOmodel requirements
cd ~/thanzi/TLOmodel
pip install -r requirements/dev.txt
pip install -e .
- Check it worked (it will be slow the first time)
tlo
Create the following file, submit-scenario.sh, in ~/thanzi. You have to customise parts of it (the lines commented with ***).
#!/bin/bash -l
############ JOB CONFIG
# *** Request for the most reasonable minimum you can, up to 72 hours. This specifies 24 hours
#$ -l h_rt=24:0:0
# Request 16GB of memory
#$ -l mem=16G
# *** Personal job name identifier
#$ -N testing_scenario
# *** Put in your username below
#$ -wd /home/<your UCL id>/Scratch/thanzi/TLOmodel-outputs
# *** Setup the job array: 1-(no. of draws * no. of runs) e.g. if 3 draws, 3 runs: 1-9
#$ -t 1-9
############ END OF JOB CONFIG
# *** Specify number of draws & runs (their product must equal the job array size set with -t above)
numberOfDraws=3
numberOfRuns=3
# make the output directory
taskNumber=$SGE_TASK_ID
thisRun=$(awk -v n=$taskNumber "BEGIN { for (i=0; i<$numberOfDraws; i++) for (j=0; j<$numberOfRuns; j++) if (++count==n) print i, j }")
thisRunPath=$(echo $thisRun | tr ' ' '/')
outputDir="$HOME/Scratch/thanzi/TLOmodel-outputs/${JOB_NAME}_${JOB_ID}/${thisRunPath}"
mkdir -p $outputDir
# Load and activate python environment
module load python3/3.11
source ~/thanzi/venv-tlo/bin/activate
cd ~/thanzi/TLOmodel
# *** Run the specified scenario
# $thisRun is left unquoted on purpose so that it expands to two arguments: the draw and the run
tlo scenario-run --draw $thisRun --output-dir $outputDir src/scripts/dev/scenarios/playing_22.py
tlo parse-log $outputDir
gzip $outputDir/*.log
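The awk one-liner in the script turns the 1-based array task number into a zero-based draw and run by counting through a nested loop. The same mapping can be sketched with integer arithmetic, which also makes clear why the job array must span 1 to (draws × runs). `task_to_draw_run` and `array_size` are hypothetical helper names; unlike the awk version, which prints nothing for an out-of-range task, this sketch reports an error.

```shell
# Map a 1-based task number to zero-based "draw run" indices.
# Arguments: task number, number of draws, number of runs.
task_to_draw_run() {
    task=$1 draws=$2 runs=$3
    if [ "$task" -lt 1 ] || [ "$task" -gt $(( draws * runs )) ]; then
        echo "task $task is outside 1-$(( draws * runs ))" >&2
        return 1
    fi
    # Runs vary fastest, matching the inner loop of the awk command.
    echo "$(( (task - 1) / runs )) $(( (task - 1) % runs ))"
}

# Number of array tasks needed to cover every draw/run pair.
array_size() { echo "$(( $1 * $2 ))"; }

array_size 3 3            # prints 9, the upper bound for "#$ -t 1-9"
task_to_draw_run 4 3 3    # prints "1 0": second draw, first run
```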
- Submit the job from ~/thanzi
cd ~/thanzi
qsub submit-scenario.sh
The qsub command will print the job id, e.g.:
$ qsub submit-scenario.sh
Your job-array 62125.1-9:1 ("testing_scenario") has been submitted
You can check the status of your jobs:
$ qstat
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------------
62125 0.00000 testing_scenario ucbtaut qw 12/19/2024 17:07:40 1 1-9:1
Eventually, the tasks will start running:
$ qstat
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------------
62125 3.07768 testing_scenario ucbtaut r 12/19/2024 17:14:06 Bran@node-b00a-007 1 1
62125 3.07768 testing_scenario ucbtaut r 12/19/2024 17:14:06 Bran@node-b00a-007 1 2
62125 3.07768 testing_scenario ucbtaut r 12/19/2024 17:14:08 Bran@node-b00a-007 1 3
62125 3.07768 testing_scenario ucbtaut r 12/19/2024 17:14:08 Bran@node-b00a-007 1 4
62125 3.07768 testing_scenario ucbtaut r 12/19/2024 17:14:08 Bran@node-b00a-013.myriad.ucl. 1 5
...snip...
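With many tasks, it can be easier to summarise the qstat listing than to read it. The sketch below counts lines in a given state, assuming the state is the fifth whitespace-separated column and that the first two lines are the header, as in the listings above. `count_state` is a hypothetical helper name.

```shell
# Count qstat lines in a given state ("r" running, "qw" queued), reading qstat output on stdin.
# Assumes two header lines and the state in column 5, as in the listings above.
# Note: queued array tasks share one line (e.g. 1-9:1), so "qw" counts lines, not tasks.
count_state() {
    awk -v s="$1" 'NR > 2 && $5 == s { n++ } END { print n + 0 }'
}

# Only meaningful where qstat exists, i.e. on the cluster.
if command -v qstat >/dev/null 2>&1; then
    qstat | count_state r
fi
```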
The results will be placed in ~/Scratch/thanzi/TLOmodel-outputs/testing_scenario_62125, where the number at the end is the job id.
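Because the submission script writes each task's output to a draw/run subdirectory of the job directory, a quick scan can show which draw/run pairs have nothing in them yet. This is a sketch only: `list_missing_runs` is a hypothetical helper, and a non-empty directory does not prove that the run finished successfully.

```shell
# Print "draw/run" for every expected output directory that is missing or empty.
# Arguments: job directory, number of draws, number of runs.
list_missing_runs() {
    jobdir=$1 draws=$2 runs=$3
    i=0
    while [ "$i" -lt "$draws" ]; do
        j=0
        while [ "$j" -lt "$runs" ]; do
            d="$jobdir/$i/$j"
            # A run counts as present if its directory holds at least one entry.
            if [ ! -d "$d" ] || [ -z "$(ls -A "$d")" ]; then
                echo "$i/$j"
            fi
            j=$(( j + 1 ))
        done
        i=$(( i + 1 ))
    done
}

# Example (on the cluster):
# list_missing_runs ~/Scratch/thanzi/TLOmodel-outputs/testing_scenario_62125 3 3
```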
- Go to the outputs directory to zip up everything for download. Set the job ID first so that the commands below work.
cd ~/Scratch/thanzi/TLOmodel-outputs
- Set the job id as a variable
JOBID=62125
- Move the stdout and stderr files into the directory
mv testing_scenario.?${JOBID}* testing_scenario_${JOBID}
- Zip everything up
zip -r download.zip testing_scenario_${JOBID}
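The mv step above relies on how the scheduler names its log files: for an array job, each task's stdout and stderr go to `<job name>.o<job id>.<task id>` and `<job name>.e<job id>.<task id>` in the working directory set with `-wd`. The sketch below imitates those files in a temporary directory to show what the glob matches; the file names are created with touch purely for illustration.

```shell
# Imitate the scheduler's log files in a scratch directory to show what the glob matches.
demo=$(mktemp -d)
cd "$demo"
JOBID=62125
mkdir "testing_scenario_${JOBID}"
touch "testing_scenario.o${JOBID}.1" "testing_scenario.e${JOBID}.1"
touch "testing_scenario.o99999.1"     # log from a different job: should stay where it is

# "?" matches the single letter "o" (stdout) or "e" (stderr).
mv testing_scenario.?${JOBID}* "testing_scenario_${JOBID}"
ls "testing_scenario_${JOBID}"
```

The job directory itself is not matched, because its name has an underscore rather than a dot after the job name.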
You can move data from the cluster to your local machine as described in the Myriad help.