Removing use of conda env for tasks within snakemake pipeline #69

Closed
ManavalanG opened this issue Apr 10, 2023 · 6 comments
Labels
enhancement New feature or request

Comments

@ManavalanG
Member

Currently, the QuaC pipeline uses a singularity+conda setup for most rules and singularity alone for a minority of rules. We have questioned in the past whether we should move away from the singularity+conda setup, where possible, to improve reproducibility. While I follow the container-only approach for newer pipelines, older pipelines (including QuaC) were never refactored to use it.

As part of a QuaC-related error (#68), a reviewer suggested it might be beneficial to move to a container-only setup instead of container+conda. Although it has not yet been confirmed that the error was truly caused by conda usage, we decided to refactor QuaC to use singularity only.

PS - While the singularity+conda setup has mostly worked for us, it has occasionally caused trouble when conda solved dependencies. #49 was meant to reduce or avoid such issues but, if my memory serves right, problems still occurred occasionally. Hence the decision to try a singularity-only setup: now that QuaC is publicly available, this might help external users avoid such bugs.

@ManavalanG added the bug and enhancement labels on Apr 10, 2023
@ManavalanG
Member Author

See move_to_biocontainers for this refactoring.

@ManavalanG
Member Author

We use snakemake-wrappers for five rules in QuaC, and it appears snakemake-wrappers are not set up to work purely with containers; instead they appear to require either a conda or a container+conda setup.

Solution: I moved away from snakemake-wrappers and explicitly defined shell commands for those five rules.

@ManavalanG
Member Author

Rules create_multiqc_config and quac_watch have multi-tool dependencies defined in their conda envs, unlike the other rules, which each have a single-tool dependency.

To work around this, I opened a PR (BioContainers/multi-package-containers#2583) to create a mulled container image; once it is built, I can get the address where biocontainers hosts the container. We will have to wait for the PR to be merged before trying out the container-based solution for these two rules.

@ManavalanG
Member Author

ManavalanG commented Apr 10, 2023

BioContainers/multi-package-containers#2583 was merged and the docker image was created successfully. It took ~2 hrs from the time the PR was created. I identified the path to the image via the Actions tab in their GitHub repo --> `Merge pull request #2583 from ManavalanG/master` --> build-and-test --> upload images.

@ManavalanG
Member Author

To check whether the commands remained the same after the wrapper directive was removed, I compared the log files from the system test test_project_2samples_wgs-no_priorQC before and after the change. The commands were identical, so the refactoring succeeded.
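
For anyone wanting to repeat this kind of check, here is a minimal sketch of one way to compare two sets of shell commands. It is not the procedure used above; the function names `normalize` and `diff_commands` are made up for illustration, and it assumes the commands have already been pulled out of the two log files into plain lists of strings (how to extract them depends on the log format and is not covered here).

```python
import difflib


def normalize(command: str) -> str:
    """Collapse runs of whitespace so line wrapping does not count as a change."""
    return " ".join(command.split())


def diff_commands(before, after):
    """Return unified-diff lines between two collections of shell commands.

    Commands are normalized, de-duplicated and sorted first, so job
    execution order does not affect the result. An empty list means
    the two sets of commands are identical.
    """
    before_norm = sorted({normalize(c) for c in before if c.strip()})
    after_norm = sorted({normalize(c) for c in after if c.strip()})
    return list(
        difflib.unified_diff(
            before_norm, after_norm, fromfile="before", tofile="after", lineterm=""
        )
    )
```

Sorting is a deliberate choice here: Snakemake may schedule jobs in a different order between runs, so only the set of commands is compared, not their sequence.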

@ManavalanG
Member Author

Because the rule aggregate_sample_rename_configs used the run directive, it could not be run in a singularity container of its own. So I refactored it to use the shell directive.
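
To illustrate the general pattern (this is not QuaC's actual code), the logic that lived in a run block can be moved into a standalone script that a shell directive then calls inside the container. The sketch below is hypothetical throughout: the function names, the `--inputs`/`--output` options, and the assumption that the per-sample files are tab-separated with one header line are all made up for the example.

```python
import argparse
from pathlib import Path


def aggregate(input_paths, output_path):
    """Concatenate tab-separated files, keeping only the first header line.

    Assumes (hypothetically) that every input file starts with the same header.
    """
    header = None
    rows = []
    for path in input_paths:
        lines = Path(path).read_text().splitlines()
        if not lines:
            continue  # skip empty files
        if header is None:
            header = lines[0]
        rows.extend(lines[1:])
    with open(output_path, "w") as handle:
        if header is not None:
            handle.write(header + "\n")
        for row in rows:
            handle.write(row + "\n")


def main(argv=None):
    """Command-line entry point, so a shell directive can invoke the script."""
    parser = argparse.ArgumentParser(description="Aggregate per-sample config files.")
    parser.add_argument("--inputs", nargs="+", required=True)
    parser.add_argument("--output", required=True)
    args = parser.parse_args(argv)
    aggregate(args.inputs, args.output)


# When saved as a standalone script, call main() under the usual __main__ guard.
```

With a script like this, the rule's shell command would only need a Python interpreter inside the container, which is what lets the rule run in a singularity image of its own.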
