Multiqc rules in single-sample mode rerun every time new samples are added to a project #42

Open
ManavalanG opened this issue Sep 9, 2021 · 5 comments

@ManavalanG
Member

When new samples are added to a project, quac creates a temporary multiqc config file, which the multiqc rules require. Because this config file's timestamp is newer than the existing outputs, the rules multiqc_by_sample_final_pass and multiqc_by_sample_initial_pass are triggered even for older samples that quac has already finished processing in single-sample mode and that need no further processing. Ideally, these unnecessary reruns should be avoided.
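
For anyone following along, here is a minimal sketch of the kind of timestamp comparison that causes this. It is a simplified illustration only, not Snakemake's actual implementation, and the function name `needs_rerun` and the file names in the usage note are made up for the example.

```shell
#!/usr/bin/env bash
# Simplified illustration of an mtime-based rerun check.
# This is NOT Snakemake's code; needs_rerun is a hypothetical helper.
#
# Usage: needs_rerun OUTPUT INPUT [INPUT...]
# Returns 0 (rerun needed) if OUTPUT is missing or any INPUT is newer than it,
# and 1 (up to date) otherwise.
needs_rerun() {
    local output=$1
    shift
    # A missing output always has to be built.
    [ -e "$output" ] || return 0
    local input
    for input in "$@"; do
        # -nt is true when $input was modified more recently than $output.
        if [ "$input" -nt "$output" ]; then
            return 0
        fi
    done
    return 1
}
```

Under this model, calling `needs_rerun report.html sample.bam multiqc_config.yaml` on an already finished sample reports that a rerun is needed as soon as the config file is regenerated, even though the sample's own data has not changed.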

@ManavalanG
Member Author

Potential solutions:

  • Ignore the config file's timestamp in those rules. The problem: if and when we change those config files as part of the quac repo, the affected samples would need to be rerun, but they would be skipped because the config file's timestamp is no longer considered.
  • Let those rules run every time. Cons are:
    • Unnecessary use of resources
    • Output files would be unavailable while the rules rerun
    • File permission issues would arise if a different user runs quac at a later point. If userB runs quac after userA, userB wouldn't have permission to delete files created by userA.

@ManavalanG
Member Author

Currently, as a workaround, we delete the conflicting files and then run quac for the existing projects. Command used:

cd /data/project/worthey_lab/projects

# Replace PROJECT_NAME with the project's directory name
PROJECT=PROJECT_NAME
rm -rf "${PROJECT}"/analysis/LW*/qc/quac_watch/ \
    "${PROJECT}"/analysis/LW*/qc/multiqc_initial_pass/LW*_multiqc* \
    "${PROJECT}"/analysis/LW*/qc/multiqc_final_pass/
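
Since `rm -rf` with globs is unforgiving, a dry-run variant of the command above might be useful. The sketch below uses the same three glob patterns; the function name `clean_quac_outputs` and the `--delete` switch are hypothetical and not part of quac.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of quac).
# Usage: clean_quac_outputs PROJECT_DIR [--delete]
# Without --delete it only lists what would be removed.
clean_quac_outputs() {
    local project_dir=$1
    local mode=${2:-}
    local target
    # Same three glob patterns as the workaround command.
    for target in \
        "$project_dir"/analysis/LW*/qc/quac_watch \
        "$project_dir"/analysis/LW*/qc/multiqc_initial_pass/LW*_multiqc* \
        "$project_dir"/analysis/LW*/qc/multiqc_final_pass; do
        # An unmatched glob stays as a literal string; skip it.
        [ -e "$target" ] || continue
        if [ "$mode" = "--delete" ]; then
            rm -rf -- "$target"
        else
            printf 'would remove: %s\n' "$target"
        fi
    done
}
```

Running `clean_quac_outputs PROJECT_NAME` first and checking the listed paths, then repeating with `--delete`, gives a chance to catch a wrong project name before anything is removed.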

@ManavalanG
Member Author

After discussing with @wilkb777 and @Deeptha, we decided that the current behavior of rerunning multiqc for existing samples is acceptable compared with the alternatives. If and when quac_watch thresholds change (between quac versions or through a user's custom threshold configs), multiqc and quac_watch will need to be rerun for all samples; otherwise, quac_watch results would be based on inconsistent thresholds.

To work around the current problems, the following will be implemented:

  • Remove use of the protected() flag in quac.
  • Use snakemake's onsuccess and onerror handlers to modify group permissions of quac_watch and multiqc files (at both sample and project level) so that the group has write access.
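
A rough sketch of the permission change itself, which the onsuccess and onerror handlers could presumably call through a shell command. The function name `grant_group_write` is hypothetical, and the sample-level paths are assumed from the workaround command earlier in this thread; project-level paths are left out because they are not given here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of quac).
# Usage: grant_group_write PROJECT_DIR
# Adds group write permission to the sample-level quac_watch and multiqc
# output directories and everything inside them.
grant_group_write() {
    local project_dir=$1
    local target
    # Paths assumed from the workaround command in this thread.
    for target in \
        "$project_dir"/analysis/LW*/qc/quac_watch \
        "$project_dir"/analysis/LW*/qc/multiqc_initial_pass \
        "$project_dir"/analysis/LW*/qc/multiqc_final_pass; do
        # An unmatched glob stays as a literal string; skip it.
        [ -e "$target" ] || continue
        # g+w adds group write to files and directories, recursively.
        chmod -R g+w -- "$target"
    done
}
```

Note that `chmod` only succeeds on files owned by the user running it (or root), so this helps going forward, once each user's runs leave group-writable outputs behind; files already created by another user without group write would still need that user to fix them once.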

@ManavalanG
Member Author

changed due date to October 14, 2021

@ManavalanG
Member Author

changed due date to October 28, 2021

@ManavalanG ManavalanG self-assigned this Jan 26, 2023