ADAM on Slurm/LSF #1229
Let me ask around...
I'm not sure what the infrastructure actually is at the link above. The examples show submitting jobs to a Spark cluster using Slurm, not that Spark is actually running on the Slurm cluster. There's another link on that page describing the "Spark framework and the submission guidelines using YARN", but it doesn't say whether Spark via YARN is installed on the Slurm cluster or separately.
Good point @heuermh. https://github.com/LLNL/magpie In general, my intuition is that when running Spark on HPC in this way, all you would really lose is data locality; otherwise an application like ADAM would run the same as it does on an HDFS cluster.
I have some experience running Spark on Slurm from the University of Missouri. They have a large cluster managed by Slurm that runs Spark. In that case, we dynamically created Spark clusters using Slurm, so the entire environment was torn down at the end of the allocation. HDFS works the same way. For ADAM on Slurm, I don't think there would be too many steps, aside from perhaps changing the SPARK_HOME (which we set dynamically). Since we are starting a collaboration, there may be an opportunity to use their cluster as a test case for this.
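The "dynamically created Spark clusters" approach described above can be sketched as a single Slurm batch script: stand up a throwaway Spark standalone cluster inside the allocation, run ADAM against it, and let everything die with the job. All paths, node counts, and the input/output filenames below are assumptions to adjust for your site; `start-worker.sh` is the Spark 3.x name (older releases call it `start-slave.sh`).

```shell
#!/bin/bash
# Hypothetical Slurm job script: ephemeral Spark standalone cluster + ADAM.
#SBATCH --job-name=adam-spark
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1
#SBATCH --time=02:00:00

# Set per allocation, as noted in the discussion above (assumed paths).
export SPARK_HOME=/path/to/spark
export ADAM_HOME=/path/to/adam

# The batch script runs on the first node of the allocation;
# start the Spark master there.
MASTER_HOST=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)
"$SPARK_HOME/sbin/start-master.sh"

# Start one worker per allocated node, pointing at the master.
srun --ntasks="$SLURM_JOB_NUM_NODES" --ntasks-per-node=1 \
  "$SPARK_HOME/sbin/start-worker.sh" "spark://${MASTER_HOST}:7077"

# Run an ADAM job against the temporary cluster; adam-submit passes
# everything before "--" through to spark-submit.
"$ADAM_HOME/bin/adam-submit" \
  --master "spark://${MASTER_HOST}:7077" \
  -- transformAlignments in.sam out.adam

# When the allocation ends, Slurm kills the master and workers;
# nothing persists, matching the teardown behavior described above.
```

Since there is no HDFS here, input and output paths would typically point at the shared parallel filesystem, which is where the data-locality loss mentioned above comes from.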
+1! |
Fixed by #1571 |
I understand some people run Spark on their local Slurm (or LSF?) cluster like:
https://www.princeton.edu/researchcomputing/faq/spark-via-slurm/
It would be useful to provide instructions for this in our user guide, as Slurm/LSF is the cluster infrastructure that most bioinformatics users have access to.
Is there a slurm/LSF cluster at Berkeley I could try this on?
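For trying this out without any cluster setup, a minimal sketch: run ADAM in Spark local mode inside a single-node Slurm allocation. The resource numbers, paths, and filenames are assumptions, not a tested recipe.

```shell
# Hypothetical single-node smoke test: Spark local mode under Slurm,
# so no standalone master/workers are needed. Adjust paths and sizes.
export ADAM_HOME=/path/to/adam   # assumed install location

salloc --nodes=1 --cpus-per-task=16 --time=00:30:00 \
  srun "$ADAM_HOME/bin/adam-submit" \
    --master "local[16]" \
    -- transformAlignments in.sam out.adam
```

An LSF equivalent would swap `salloc`/`srun` for `bsub`, but the `adam-submit` invocation itself would be unchanged.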