Run snakemakefile-no results #116
(base) lizhihua@lizhihua-T640:/media/lizhihua/software/metaGEM$ snakemake -s Snakefile --use-conda --rerun-incomplete --cores 28 -p all 1 1 1
Select jobs to execute...
[Wed Nov 30 18:27:36 2022]
Reason: Rules with a run or shell declaration but no output are always executed.
/home/lizhihua/miniconda3/lib/python3.8/site-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.25.11) or chardet (5.0.0) doesn't match a supported version!
Francisco Zorrilla (Wednesday, November 30, 2022 7:02:14 PM):

Hi Li,

Please make sure you ask a question in your issue, otherwise I am not sure how I can help. Similarly, please include the commands that you are running and what you are trying to accomplish.

It looks like you have successfully quality filtered your reads, so now you will want to assemble them:

bash metaGEM.sh --task megahit

Running this command will re-configure your Snakefile so that the rule all generates your desired assembly output. Have a look at the tutorial <https://github.com/franciscozorrilla/unseenbio_metaGEM> and other resources on the repo for additional information.

Best wishes,
Francisco

Dear Dr. Francisco,

Many thanks for your quick help! Do you mean that I need to run the procedure step by step, one task at a time? I thought the Snakefile could complete all the procedures in one run. Many thanks!

Best regards,
Li Zhihua
Francisco Zorrilla (Wednesday, November 30, 2022 7:40:07 PM):

Hi Li, Good question 💎

You are correct that Snakemake can resolve the dependencies between multiple steps in one go; however, it gets a bit tricky with the wildcards going from sample IDs to genome/bin IDs. In practice, I generally run only one or two rules at a time and then check the outputs, since at many points in the pipeline it does not make sense to continue if some of your jobs/samples failed. Especially with larger datasets, you can end up wasting a lot of computational resources by submitting jobs that run on corrupted/incomplete data.

So for example, if I had 10 raw samples, I may start with the following command:

bash metaGEM.sh --task megahit

This would submit quality filtering jobs first, and then the assembly jobs, since assembly requires the qfiltered reads. In theory, it should be possible to submit jobs like this at any point of the workflow; however, the Snakefile rule inputs and outputs need to be carefully matched. For example, try running the following:

bash metaGEM.sh --task binRefine

In the future please open a new issue if it is unrelated to the current one.

Best,
Francisco
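The habit described above (run one or two rules, then check the outputs before continuing) can be partly automated. The following is only a minimal sketch, not part of metaGEM: the function name, the sample IDs, and the `assemblies/{sample}/contigs.fasta.gz` pattern are assumptions chosen for illustration, so substitute the paths that your own Snakefile rules actually write.

```python
from pathlib import Path


def samples_missing_output(root, sample_ids, relative_pattern):
    """Return the sample IDs whose expected output file is absent or empty."""
    missing = []
    for sample in sample_ids:
        # Fill the sample ID into the path pattern, e.g. "{sample}" -> "sample1".
        target = Path(root) / relative_pattern.format(sample=sample)
        if not target.is_file() or target.stat().st_size == 0:
            missing.append(sample)
    return missing


if __name__ == "__main__":
    # Hypothetical sample IDs and output pattern; replace with your own.
    samples = ["sample1", "sample2", "sample3"]
    pattern = "assemblies/{sample}/contigs.fasta.gz"
    failed = samples_missing_output(".", samples, pattern)
    if failed:
        print("Do not continue yet; missing or empty output for:", ", ".join(failed))
    else:
        print("All expected outputs are present.")
```

A check like this only confirms that files exist and are non-empty; it cannot tell whether an assembly is truncated or otherwise corrupted, so inspecting the job logs is still worthwhile.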
Dear Dr. Francisco,

Thank you very much for your detailed explanation! I understand your reasoning. In fact, I have more than 100 samples, which I have already assembled with SPAdes. So if I want to use the SPAdes results instead of megahit, is that OK?

Best regards,
Li Zhihua
Yes, if you already have assembled contigs then simply place them in the corresponding folders where …

Dear Dr. Francisco,

Thank you very much for your kind help! If I want to complete several tasks in one go, as you say, should I run bash metaGEM.sh --task megahit binRefine ..... or should I change the number of tasks in cluster_config.json, as in the attached image?

Best regards,
Li Zhihua
This is more of a general Snakemake question than a metaGEM-specific issue; please have a look at the Snakemake documentation to understand how rule dependencies are resolved. If you ask the metaGEM Snakefile to produce refined bins (e.g. …

Lines 895 to 905 in 6285b93

As you can see, binRefine requires concoct, maxbin, and metabat output, so Snakemake will check whether those files are present; if not, it will submit the binning jobs to produce those results, and then submit the binRefine jobs. However, Snakemake also has to check that the inputs for those binning tasks are present. For example, have a look at the concoct rule:

Lines 627 to 630 in 6285b93

So if your concoct inputs (i.e. assembly and coverage table) are missing, then Snakemake will submit jobs to generate those as well, and so on, to resolve file dependencies between rules/tasks. If you run a command like this: …

Then the …
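The backward resolution described in this comment can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how Snakemake is implemented: real Snakemake matches file paths and wildcards per sample, whereas this sketch works on task names only. The dependency map records just what the thread states (binRefine needs concoct, maxbin, and metabat; concoct needs the assembly and a coverage table; assembly needs the quality filtered reads); the label `coverage` and the omission of inputs for maxbin and metabat are assumptions made to keep the example short.

```python
def jobs_to_run(target, requires, existing):
    """Return the tasks needed to produce `target`, dependencies first."""
    order = []
    seen = set()

    def visit(task):
        # Skip tasks whose output already exists or that are already scheduled.
        if task in existing or task in seen:
            return
        seen.add(task)
        # Schedule everything this task depends on before the task itself.
        for dependency in requires.get(task, []):
            visit(dependency)
        order.append(task)

    visit(target)
    return order


# Simplified, partly assumed dependency map (task names, not real file paths).
REQUIRES = {
    "binRefine": ["concoct", "maxbin", "metabat"],
    "concoct": ["megahit", "coverage"],
    "megahit": ["qfilter"],
}

if __name__ == "__main__":
    # Only quality filtered reads exist, so everything downstream is scheduled.
    print(jobs_to_run("binRefine", REQUIRES, existing={"qfilter"}))
    # All three binners already produced output, so only binRefine is scheduled.
    print(jobs_to_run("binRefine", REQUIRES, existing={"concoct", "maxbin", "metabat"}))
```

The second call shows why placing existing results in the expected folders lets the workflow skip upstream steps: a task whose output is already present is never visited, and neither are its dependencies.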
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 28
Rules claiming more threads will be scaled down.
Job stats:
job      count    min threads    max threads
-----  -------  -------------  -------------
all          1              1              1
total        1              1              1
Select jobs to execute...
[Wed Nov 30 18:04:13 2022]
Job 0:
WARNING: Be very careful when adding/removing any lines above this message.
The metaGEM.sh parser is presently hardcoded to edit line 22 of this Snakefile to expand target rules accordingly,
therefore adding/removing any lines before this message will likely result in parser malfunction.
Reason: Rules with a run or shell declaration but no output are always executed.
[Wed Nov 30 18:04:14 2022]
Finished job 0.
1 of 1 steps (100%) done
Complete log: .snakemake/log/2022-11-30T180413.551219.snakemake.log
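The warning printed in this log says the metaGEM.sh parser edits a fixed line number of the Snakefile. Why that is fragile can be shown with a toy example. The sketch below is not metaGEM.sh's actual code, and the three-line "file" and the `TARGETS` name are invented for illustration: it overwrites a line by position and shows that adding a single line above the target makes the edit land on the wrong line.

```python
def replace_line(text, line_number, new_line):
    """Overwrite the 1-indexed line `line_number` of `text` with `new_line`."""
    lines = text.splitlines()
    lines[line_number - 1] = new_line
    return "\n".join(lines) + "\n"


# Toy stand-in for a workflow file; the target to edit sits on line 2.
ORIGINAL = "# header\nTARGETS = []\nrule all:\n"
# The same file after someone adds one line at the top.
SHIFTED = "# extra comment\n" + ORIGINAL

# Editing line 2 of the untouched file updates the intended line.
good = replace_line(ORIGINAL, 2, "TARGETS = ['megahit']")
# Editing line 2 of the shifted file clobbers the header and leaves
# the old, empty TARGETS line in place.
bad = replace_line(SHIFTED, 2, "TARGETS = ['megahit']")

if __name__ == "__main__":
    print(good)
    print(bad)
```

In the shifted case the old target line survives untouched, which is the kind of malfunction the warning asks you to avoid by not adding or removing lines above the message.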