Workflow not being executed in parallel on batch computing nodes (SLURM cluster grouped execution) #2339
Comments
Update: removing the … but why does this produce a serial execution?
Seems like I'm experiencing the same issue reported here: #2060
Sorry for looking into this issue so late. Since Snakemake v8, the SLURM executor code lives in its own repo. Does the issue persist for you after updating?
I need to check again. Is snakemake/snakemake-executor-plugin-slurm#29 resolved?
I had a similar case: v7.32.3 showed the same problem, while v8.23.2 works as expected.
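(For later readers: assuming a pip-based install, upgrading with `pip install --upgrade snakemake snakemake-executor-plugin-slurm` should pull in both the v8 core and the separate SLURM executor plugin; use the equivalent conda command if that's how Snakemake was installed.)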
I'm trying to write a profile to run this workflow on NERSC's supercomputer. Batch compute nodes have 128 CPU cores (×2 hyperthreads) and 512 GB of memory; submission is managed through SLURM, and the maximum wall time is 12 h.
My workflow consists mostly of a large number of ~1 h, single-threaded jobs. I would like to instruct Snakemake to pack them efficiently and submit a much smaller number of jobs to SLURM; in principle, one node could run 128 of these jobs concurrently. Jobs running on a node should use all available resources and run in parallel.
This is what I've written so far:
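The profile itself was not preserved in this copy of the thread. Purely as an illustration, here is a minimal sketch of what a grouped-execution profile for this setup could look like on Snakemake >= 8 with the SLURM executor plugin; the rule name (`long_rule`), the group name (`packed`), and all resource values are placeholder assumptions, not the reporter's actual settings:

```yaml
# config.yaml -- hypothetical sketch, not the original profile from this report
executor: slurm          # requires snakemake-executor-plugin-slurm to be installed
jobs: 50                 # cap on concurrently submitted SLURM jobs
default-resources:
  runtime: 90            # minutes per job; each job runs ~1 h, node wall time is 12 h
  mem_mb: 3900           # ~512 GB / 128 cores, per single-threaded job
groups:
  - long_rule=packed     # put the single-threaded rule's jobs into one group
group-components:
  - packed=128           # pack 128 group members into one SLURM job (a full node)
```

With something like this, each submitted SLURM job should receive 128 one-core tasks; the symptom described below is exactly those grouped tasks failing to run concurrently inside the allocation.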
And this is the relevant part of Snakemake's output (the excerpt was not preserved in this copy of the thread):
And this is the content of that log file (likewise not preserved):
As you can see, the jobs are executed serially on the node even though they are independent of one another.
What's wrong with my profile?