
Moved Slurm docs to existing 40_deploying_ADAM.md and addressed comments
Paschall authored and heuermh committed Jul 5, 2017
1 parent 12aaa7b commit c157ab8
Showing 2 changed files with 78 additions and 67 deletions.
79 changes: 78 additions & 1 deletion docs/source/40_deploying_ADAM.md
@@ -512,4 +512,81 @@ def _count_child(job, masterHostname):
# and now, count it
assert rdd.count() == 10000
```

## Running ADAM on Slurm

For those groups with access to an HPC cluster with [Slurm](https://en.wikipedia.org/wiki/Slurm_Workload_Manager)
managing a number of compute nodes with local and/or network-attached storage, it is possible to spin up a
temporary Spark cluster for use by ADAM.

While the full I/O bandwidth benefits of Spark processing are likely best realized through a set of
co-located compute/storage nodes, depending on your network setup you may find Spark deployed on HPC
to be a workable solution for testing, or even production at scale, especially for applications
that perform multiple in-memory transformations and thus benefit from Spark's in-memory processing model.

Follow the primary [instructions](https://github.com/bigdatagenomics/adam/blob/master/docs/source/02_installation.md)
for installing ADAM into `$ADAM_HOME`.
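
A minimal sketch of what that setup might look like when building from source is shown below; the Maven flags and the choice to point `$ADAM_HOME` at the checkout are assumptions, so defer to the installation instructions for your environment.
```
# Sketch only: assumes a source build with Maven, per the installation docs
git clone https://github.com/bigdatagenomics/adam.git
cd adam
mvn -DskipTests package       # skip tests for a faster build
export ADAM_HOME=$(pwd)       # bin/adam-shell and bin/adam-submit live under this directory
```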

### Start Spark cluster

A Spark cluster can be started as a multi-node job in Slurm by creating a job file `run.cmd` such as the one below:
```
#!/bin/bash
#SBATCH --partition=multinode
#SBATCH --job-name=spark-multi-node
#SBATCH --exclusive
# Number of separate nodes reserved for the Spark cluster
#SBATCH --nodes=2
#SBATCH --cpus-per-task=12
# Number of execution slots
#SBATCH --ntasks=2
#SBATCH --time=05:00:00
#SBATCH --mem=248g
# If your sys admin has installed Spark as a module
module load spark
# If Spark is not installed as a module, you will need to specify the absolute path to $SPARK_HOME/bin/spark-start
start-spark
echo $MASTER
sleep infinity
```
Submit the job file to Slurm:
```
sbatch run.cmd
```

This will start a Spark cluster containing 2 nodes that persists for 5 hours, unless you kill it sooner.
The Slurm output file created in the current directory (by default `slurm-<jobid>.out`) will contain a line
produced by `echo $MASTER` above, which will indicate the address of the Spark master to which your
application or ADAM shell should connect, such as `spark://somehostname:7077`.
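
If the output file is lengthy, one way to pull out the master address is a quick grep; the file name below is a placeholder for your actual Slurm output file.
```
# Print the spark:// URL of the Spark master from the Slurm output file
grep -o 'spark://[^ ]*' slurm-<jobid>.out
```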

### Start ADAM Shell
Your sys admin will probably prefer that you launch your ADAM shell or start an application from a
cluster node rather than the head node you log in to, so you may want to do so with:
```
sinteractive
```
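
`sinteractive` is a site-specific convenience wrapper and may not exist on your cluster; an equivalent interactive allocation can usually be requested directly with `srun`, as in the sketch below (the partition name and time limit are assumptions).
```
# Request an interactive shell on a compute node in the same partition
srun --partition=multinode --time=01:00:00 --pty bash
```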

Start an adam-shell like so:
```
$ADAM_HOME/bin/adam-shell --master spark://hostnamefromslurmdotout:7077
```
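
Because adam-shell forwards its options to the underlying spark-shell, you can also size the session with standard Spark flags; the values below are illustrative, not recommendations.
```
# Illustrative resource settings; adjust to your allocation
$ADAM_HOME/bin/adam-shell \
  --master spark://hostnamefromslurmdotout:7077 \
  --executor-memory 200g \
  --total-executor-cores 24
```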

### Or run a batch job with adam-submit
```
$ADAM_HOME/bin/adam-submit --master spark://hostnamefromslurmdotout:7077
```
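
A fuller invocation might look like the sketch below; the subcommand name and file paths are illustrative, and depending on your ADAM version the Spark arguments may need to be separated from the ADAM arguments with `--` (check `adam-submit --help`).
```
# Illustrative subcommand and paths; run adam-submit --help to see the commands your version provides
$ADAM_HOME/bin/adam-submit --master spark://hostnamefromslurmdotout:7077 \
  -- transformAlignments sample.sam sample.alignments.adam
```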

You should be able to connect to the Spark Web UI at `http://hostnamefromslurmdotout:4040`; however,
you may need to ask your local sys admin to open the required ports.
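
If those ports cannot be opened, an SSH tunnel through a login node is a common workaround; the hostnames and username below are placeholders.
```
# Forward local port 4040 to the Spark Web UI on the compute node,
# then browse to http://localhost:4040
ssh -L 4040:hostnamefromslurmdotout:4040 username@cluster-login-node
```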

### Feedback
We'd love to hear about your experience running ADAM on HPC/Slurm or other deployment architectures;
please let us know of any problems you run into via the mailing list or Gitter.
66 changes: 0 additions & 66 deletions docs/source/80_slurm.md

This file was deleted.
