Adding instructions for scheduling with SLURM and sample SLURM script #102

Merged 6 commits on Jan 25, 2023
Changes to `README.md` (43 additions, 1 deletion):

Or with docker:

```bash
docker run --rm -ti -p 3000:3000 -p 8080:8080 opendronemap/clusterodm [parameters]
```

Or with apptainer:
Or with apptainer, after changing into the ClusterODM directory:

```bash
apptainer run docker://opendronemap/clusterodm [parameters]
```

A docker-compose file is available to automatically set up both ClusterODM and NodeODM:

```bash
docker-compose up
```

## HPC setup with SLURM

If you are on an HPC, you can write a SLURM script to schedule and launch NodeODM on the available nodes so that ClusterODM can be wired to them. Using SLURM reduces the time and the number of manual steps needed to set up nodes for ClusterODM each time, and provides an easier way for users to run ODM on the HPC.

To set up HPC with SLURM, first make sure SLURM is installed on your cluster.

The SLURM script will differ from cluster to cluster, depending on which nodes your cluster has. The main idea, however, is to run NodeODM once on each node. By default, each NodeODM instance runs on port 3000; Apptainer takes the first available port starting from 3000, so if a node's port 3000 is open, NodeODM will run on that port. After that, we run ClusterODM on the head node and connect the running NodeODM instances to it. With that, we have a functional ClusterODM running on the HPC.
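Before writing the script, it can help to verify that a single NodeODM instance starts and answers on port 3000. A minimal sketch, assuming the `opendronemap/nodeodm` image (the first run pulls the image, which can take a while):

```bash
# Start one NodeODM instance in the background
apptainer run docker://opendronemap/nodeodm &

# Give it a moment to start, then confirm it responds on port 3000
sleep 10
curl http://localhost:3000/info
```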

Here is an example of a SLURM script that assigns nodes 48, 50, and 51 to run NodeODM; the same script is included in this repository as `sample.slurm`. You can freely change it to fit your system:

![image](https://user-images.githubusercontent.com/70782465/214411148-cdf43e44-9756-4115-9195-d1f36b3a31b9.png)

You can check for available nodes using sinfo:

```
sinfo
```
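`sinfo` also accepts a node list if you only want to check the nodes named in the script (the node names below follow the sample script and will differ on your cluster):

```
sinfo --nodes=node[48,50,51]
```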

Run the following command to schedule using the SLURM script:

```
sbatch sample.slurm
```
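By default, SLURM writes the job's output to a `slurm-<jobid>.out` file in the directory you submitted from; tailing it is a quick way to confirm that the NodeODM instances started (the glob below assumes no other job output files are present):

```
tail -f slurm-*.out
```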

You can also check for currently running jobs using squeue:

```
squeue -u $USER
```

Unfortunately, SLURM does not handle assigning jobs to the head node. Hence, if we want to run ClusterODM on the head node, we have to run it there manually. After that, you can connect to ClusterODM's CLI and wire the running NodeODM instances to it.
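For example, you could start ClusterODM on the head node with the same Apptainer invocation shown earlier (a sketch; running it in the background with `&` is a convenience, not a requirement):

```bash
cd ClusterODM
apptainer run docker://opendronemap/clusterodm &
```

Once ClusterODM is running, connect to its CLI on port 8080 and register the nodes. Here is an example following the sample SLURM script: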

```
telnet localhost 8080
> NODE ADD node48 3000
> NODE ADD node50 3000
> NODE ADD node51 3000
> NODE LIST
```

If ClusterODM is not wired correctly, always check which ports the NodeODM instances are actually running on.
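For instance, assuming you can `ssh` into a compute node and that the `ss` utility is available (both are assumptions about your cluster), you can confirm that something is listening on port 3000:

```
ssh node48 ss -tln | grep 3000
```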

## Windows Bundle

ClusterODM can run as a self-contained executable on Windows without the need for additional dependencies. You can download the latest `clusterodm-windows-x64.zip` bundle from the [releases](https://github.com/OpenDroneMap/ClusterODM/releases) page. Extract the contents in a folder and run:
New file: `sample.slurm` (19 additions):

```bash
#!/usr/bin/bash
#source .bashrc

#SBATCH --partition=8core
#SBATCH --nodelist=node[48,50,51]
#SBATCH --time=20:00:00

# Move into the NodeODM directory (assumes NodeODM lives under $HOME/ODM)
cd "$HOME/ODM/NodeODM/"

# Launch one NodeODM instance per allocated node (nodes 48, 50, and 51),
# each as a background step so they run in parallel
srun --nodes=1 apptainer run --writable node/ &
srun --nodes=1 apptainer run --writable node/ &
srun --nodes=1 apptainer run --writable node/ &

# Keep the job alive until all NodeODM instances exit
wait
```