Update krakenuniq #550
The KrakenUniq module has a breaking change requiring a further input.
LGTM (and tested) once the official module is updated
@jfy133 did you by any chance test this also without run merging? I'm still a bit in doubt about the
One thing I didn't check is whether Taxpasta is still OK with this (no missing data). I'm assuming it's OK since it didn't fail, but can you confirm from your tests @Midnighter?
Oh haha, just saw your question! Will see if I can run again (DB wifi though)
How do you think taxpasta would be affected? I don't see a potential risk right now.
Same issue as you were referring to really: file names. I'm stuck on porechop_ABI for some reason, but suspect it's still waiting for something to download :grimace:
The taxpasta module uses a proper prefix definition, so there should be no issues there:
def prefix = task.ext.prefix ?: "${meta.id}"
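For readers less familiar with Groovy, the fallback in that prefix definition can be sketched in Python. This is only an illustration of the Elvis operator's behaviour, not the module's code; the function name `resolve_prefix` and the example values are hypothetical.

```python
def resolve_prefix(ext_prefix, meta):
    """Mimic Groovy's `task.ext.prefix ?: "${meta.id}"`.

    The Elvis operator falls back to the right-hand side whenever the
    left-hand side is falsy under Groovy truth (for example null or an
    empty string); Python's `or` behaves the same way for these cases.
    """
    return ext_prefix or str(meta["id"])


# No prefix configured: fall back to the sample identifier.
default_prefix = resolve_prefix(None, {"id": "2611"})

# A configured prefix (hypothetical value) takes precedence over meta.id.
custom_prefix = resolve_prefix("2611_custom", {"id": "2611"})
```

The point for this PR is that the fallback alone is only as unique as meta.id, so anything that needs finer-grained names has to supply the prefix explicitly.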
This currently doesn't work: when there are SE and PE data in one run and run merging is not performed, the output files collide. I think it needs a modules.conf update that a KrakenUniq SE run has
Unfortunately, can't use module config due to the batching :( but I'll update the pipeline channel logic.
Maybe embed it in the module? I think it is reasonable in this case.
As discussed in the Slack huddle: we don't need to embed it in the module, just update the list of prefixes coming in via the new input channel for the module to add
In the rare case that there are two samples with the same identifier and run accession, one being single-end and the other paired-end, this change makes the prefix and thus the KrakenUniq output filenames unique.
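A minimal Python sketch of that collision and the fix, under stated assumptions: the meta keys `id`, `run_accession` and `single_end`, the `_` and `.se`/`.pe` separators, and the sample values are all hypothetical and chosen for illustration; the pipeline's actual channel logic is written in Nextflow and may build the prefix differently.

```python
from collections import Counter


def make_prefix(meta, with_endedness=True):
    """Build a hypothetical output prefix from a sample's meta map."""
    prefix = f"{meta['id']}_{meta['run_accession']}"
    if with_endedness:
        # Assumed suffix convention: .se for single-end, .pe for paired-end.
        prefix += ".se" if meta["single_end"] else ".pe"
    return prefix


def colliding(prefixes):
    """Return the prefixes that occur more than once, sorted."""
    return sorted(p for p, n in Counter(prefixes).items() if n > 1)


# Hypothetical samples: the first two share identifier and run accession
# but differ in endedness, which is the rare case described above.
samples = [
    {"id": "sampleA", "run_accession": "run1", "single_end": True},
    {"id": "sampleA", "run_accession": "run1", "single_end": False},
    {"id": "sampleB", "run_accession": "run1", "single_end": True},
]

before = colliding(make_prefix(m, with_endedness=False) for m in samples)
after = colliding(make_prefix(m) for m in samples)
```

Here `before` lists the one prefix shared by the SE and PE entries, while `after` is empty because the endedness suffix separates them.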
Nice, OK I will test the following commands:

Normal test, no run merging (about 4-6 profiles):
nextflow run ../main.nf -profile test_krakenuniq,docker --outdir ./results_ku_testinputonly_runmerging --shortread_qc_mergepairs false --save_runmerged_reads

Normal test, with run merging (2 profiles):
nextflow run ../main.nf -profile test_krakenuniq,docker --outdir ./results_ku_testinputonly_norunmerging --shortread_qc_mergepairs false --save_runmerged_reads --perform_runmerging

Then with dots in IDs, no run merging:
nextflow run ../main.nf -profile test_krakenuniq,docker --outdir ./results_ku_dotsinname_norunmerging --input samplesheet_dots_in_ids.csv --shortread_qc_mergepairs false --save_runmerged_reads

Then with dots in IDs, with run merging:
nextflow run ../main.nf -profile test_krakenuniq,docker --outdir ./results_ku_dotsinname_runmerging --input samplesheet_dots_in_ids.csv --shortread_qc_mergepairs false --perform_runmerging --save_runmerged_reads

I'll request you to re-review then. You can just approve when you are satisfied with the tests.
So those commands worked nicely with no crash as you see below. I need to try one more test where I merge pairs so SE/PE can be merged together (and considered SE)
And that looks good too!
nextflow run ../main.nf -profile test_krakenuniq,docker --outdir ./results_ku_testinputonly_runmerging_readmerge --shortread_qc_mergepairs true --save_runmerged_reads
$ ls -l results_ku_testinputonly_runmerging_readmerge/krakenuniq/db6/
total 20
-rw-r--r-- 1 james james 427 Nov 21 12:47 2611.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 421 Nov 21 12:47 2612.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 432 Nov 21 12:47 2613.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 414 Nov 21 12:47 2614.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 445 Nov 21 12:47 ERR3201952.se.krakenuniq.report.txt
Thanks @Midnighter!
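For anyone repeating this check on a larger run, the report names in the listing can be parsed and checked for duplicates with a short Python sketch. The `<id>.<se|pe>.krakenuniq.report.txt` pattern is inferred from the five file names in the listing only and is an assumption, as is the helper name `parse_report_name`. Splitting from the right keeps identifiers that themselves contain dots intact.

```python
REPORT_SUFFIX = ".krakenuniq.report.txt"


def parse_report_name(name):
    """Split '<id>.<se|pe>.krakenuniq.report.txt' into (id, endedness)."""
    if not name.endswith(REPORT_SUFFIX):
        raise ValueError(f"unexpected report name: {name}")
    stem = name[: -len(REPORT_SUFFIX)]
    # rpartition splits at the last dot, so dots inside the id survive.
    sample_id, _, endedness = stem.rpartition(".")
    if not sample_id or endedness not in ("se", "pe"):
        raise ValueError(f"unexpected report name: {name}")
    return sample_id, endedness


# File names copied from the listing above.
names = [
    "2611.se.krakenuniq.report.txt",
    "2612.se.krakenuniq.report.txt",
    "2613.se.krakenuniq.report.txt",
    "2614.se.krakenuniq.report.txt",
    "ERR3201952.se.krakenuniq.report.txt",
]

parsed = [parse_report_name(n) for n in names]
all_unique = len(set(parsed)) == len(parsed)
```

With read merging enabled, every entry parses as single-end and no two reports share the same identifier and endedness.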
Please note that the module changes are currently copied over from a module update that I'm trying to get merged in nf-core/modules#6912. That copy needs to be reverted later and the module updated properly.
This PR changes some of the channel plumbing to introduce a prefix for each sequencing reads file (or pair of files).
Some doubt remains about the use of meta.id as the prefix. I think that refers to the sample identifier, which may be non-unique if multiple runs are not merged. Potentially, that has to be changed to something else.
I have run a local test and it seems to address #533.