A simple alternative profile with a single config file #73

Closed
jdblischak opened this issue May 4, 2021 · 6 comments

Comments

jdblischak commented May 4, 2021

The profile in this repo is very comprehensive and handles many different use cases. However, I was having difficulty customizing it, especially when I was trying to replicate the behavior of the deprecated --cluster-config option (see the discussion in #25).

I've put together a simple profile that only requires you to download and edit a single config.yaml file:

https://github.com/jdblischak/smk-simple-slurm

If you're having difficulty specifying options to pass to sbatch, e.g. including the rule name in the log filename, specifying different time limits per rule, etc., please give it a try. My simple template can solve previous issues in this repo without resorting to the deprecated --cluster-config option, e.g. #7, #24, #40, #42, #46

hans-vg commented May 11, 2021

This is excellent!! I was having a hard time getting output/error logs working with the CookieCutter slurm configuration. It would default to slurm-######.out, even though the log path was configured in a separate cluster_config.yml. Your config worked out of the box with minimal configuration.

One question: is there any way to change how {wildcards} is rendered in the log filenames?
ls -1 logs/trimmomatic_pe/
trimmomatic_pe-sample=FA,unit=rep3-2687041.out
trimmomatic_pe-sample=FB,unit=rep2-2687040.out

I would like to not use "=" or "," in the filenames, so it doesn't require escaping to view. For example: less logs/trimmomatic_pe/trimmomatic_pe-sample=FA,unit=rep3-2687041.out

Is there any way to modify this behavior?

@jdblischak
Author

This is excellent!!...Your config worked out of the box with minimal configuration.

@hans-vg Wonderful! The goal was minimal configuration, so I'm glad you were able to configure it quickly.

I would like to not use "=" or "," in the filenames, so it doesn't require escaping to view.

The = and , come directly from the {wildcards} value that Snakemake substitutes. The only way to modify this AFAIK would be to write your own Python function to reformat it. You could probably update the function format_wildcards in the file slurm_utils.py in this repository to do this. That function gets called by slurm-submit.py.

But of course I would recommend you stick with the simple route. Yes those = and , can be annoying when you're trying to look at the log files, but the alternative is having to maintain multiple Python scripts to submit your jobs to Slurm.
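
For anyone who does want to go the custom-function route, here is a minimal sketch of what such a reformatting helper might look like. The name sanitize_wildcards is hypothetical and is not taken from slurm_utils.py; it only assumes that the wildcards arrive as a string of "name=value" pairs joined by commas, as in the filenames shown above.

```python
import re


def sanitize_wildcards(wildcards: str) -> str:
    """Hypothetical helper: make a wildcards string friendlier for filenames.

    Assumes input of the form "sample=FA,unit=rep3", as seen in the log
    filenames earlier in this thread.
    """
    # Turn each "name=value" pair into "name-value" and join pairs with "_".
    pairs = [pair.replace("=", "-", 1) for pair in wildcards.split(",") if pair]
    cleaned = "_".join(pairs)
    # Replace anything outside a conservative filename alphabet with "_".
    return re.sub(r"[^A-Za-z0-9._-]", "_", cleaned)


print(sanitize_wildcards("sample=FA,unit=rep3"))  # sample-FA_unit-rep3
```

Wiring something like this into a submission script is the part that adds maintenance burden, which is why the simple profile leaves the {wildcards} value untouched.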

@percyfal
Collaborator

Hi @jdblischak, thanks for the post. Your repo looks really neat! Is there any functionality that could be merged with this profile? In any case, I can add a link in the README to your repo if you want, so that users don't have to browse the issue list to find it :)

@jdblischak
Author

thanks for the post. Your repo looks really neat!

@percyfal Thanks! My initial inspiration came from your repo, so thank you for your efforts to document and maintain the official snakemake slurm profile.

Is there any functionality that could be merged with this profile?

The biggest difference in approach between the two profiles is how to specify default and per-rule resources. Your profile uses --cluster-config, and mine uses a combination of default-resources and per-rule resources. The latter more closely couples the Snakemake file itself to the original scheduler that was used, but the work for a potential user who wants to execute the pipeline with a different scheduler would be similar in both cases. Either they'd need to perform a search-and-replace of the JSON file passed to --cluster-config or a search-and-replace of the Snakemake file itself.
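
To make the default-plus-per-rule idea concrete, here is a rough Python sketch of the merging behavior. It is only an illustration, not code from either profile: the resource names, default values, template string, and the function build_sbatch_command are all assumptions made up for this example. The only point is that per-rule values override the defaults before the sbatch command line is filled in.

```python
# Assumed defaults, playing the role of default-resources in a profile.
DEFAULT_RESOURCES = {"partition": "general", "mem_mb": 1000, "time": "01:00:00"}

# Assumed command template; the placeholders stand in for the values that
# would be substituted into the cluster submission command.
SBATCH_TEMPLATE = (
    "sbatch --partition={partition} --mem={mem_mb} --time={time} "
    "--job-name=smk-{rule} --output=logs/{rule}/{rule}-%j.out"
)


def build_sbatch_command(rule, rule_resources=None):
    """Merge per-rule resources over the defaults and fill in the template."""
    # Later keys win, so any per-rule value replaces the matching default.
    resources = {**DEFAULT_RESOURCES, **(rule_resources or {})}
    return SBATCH_TEMPLATE.format(rule=rule, **resources)


# A rule that only overrides the time limit keeps the other defaults.
print(build_sbatch_command("trimmomatic_pe", {"time": "04:00:00"}))
```

Switching to a different scheduler would then mean editing the template and the resource names, which is the search-and-replace cost mentioned above.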

In any case, I can add a link in the README to your repo if you want, so that users don't have to browse the issue list to find it :)

That would be super appreciated. Thanks!

@percyfal
Collaborator

@jdblischak I have added a link to your repo in #84. Also, I point out that the use of cluster-config is discouraged and that resources should be configured with snakemake CLI arguments in the profile configuration; in the end I don't think the difference between the two profiles is that great. I will still keep support for cluster-config for now, but probably deprecate it soon enough.

One lingering issue that I have to deal with is increasing job submission speed as scalability is currently hampered.

@jdblischak
Author

@percyfal Thanks so much!

One lingering issue that I have to deal with is increasing job submission speed as scalability is currently hampered.

I'm happy to help test this out. If you make some changes that affect the job submission speed, please ping me and I can run my job submission benchmark on the updated profile.
