Update krakenuniq #550

Merged
merged 6 commits into from
Nov 21, 2024

Midnighter
Collaborator

@Midnighter Midnighter commented Oct 31, 2024

Please note that the module changes are currently copied over from a module update that I'm trying to get merged (nf-core/modules#6912). The copied changes need to be reverted later and the module updated properly.

This PR changes some of the channel plumbing to introduce a prefix for each sequencing reads file (or pairs).

Some doubt remains about the use of meta.id as the prefix. I think that refers to the sample identifier, which may be non-unique if multiple runs are not merged. That may have to be changed to something else.
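
To make the concern concrete, here is a minimal Python sketch (not the pipeline's Nextflow code) of how output prefixes built only from a sample identifier can collide when one sample has several unmerged runs. The field names `id` and `run_accession`, the example values, and the helper `find_collisions` are all hypothetical and only serve as illustration.

```python
from collections import defaultdict


def find_collisions(records, prefix_of):
    """Group records by output prefix and return the prefixes used more than once."""
    by_prefix = defaultdict(list)
    for record in records:
        by_prefix[prefix_of(record)].append(record)
    return {prefix: group for prefix, group in by_prefix.items() if len(group) > 1}


# Hypothetical metadata: sample "2612" was sequenced in two separate runs.
records = [
    {"id": "2611", "run_accession": "run1"},
    {"id": "2612", "run_accession": "run1"},
    {"id": "2612", "run_accession": "run2"},
]

# With only the sample identifier as prefix, both runs of "2612" share one name.
collisions = find_collisions(records, lambda r: r["id"])
print(sorted(collisions))
```

Any prefix reported by such a check would correspond to two tasks writing the same output filename, which is the situation this PR tries to avoid.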

I have run a local test and it seems to address #533

The KrakenUniq module has a breaking change requiring a further input.
@Midnighter Midnighter self-assigned this Oct 31, 2024

github-actions bot commented Oct 31, 2024

nf-core pipelines lint overall result: Passed ✅

Posted for pipeline commit 3022cd2

✅ 281 tests passed

✅ Tests passed:

Run details

  • nf-core/tools version 3.0.2
  • Run at 2024-11-21 11:57:27

@Midnighter Midnighter requested a review from jfy133 October 31, 2024 14:05
@jfy133 jfy133 marked this pull request as draft November 5, 2024 11:29
Member

@jfy133 jfy133 left a comment

LGTM (and tested) once the official module is updated

@Midnighter Midnighter marked this pull request as ready for review November 5, 2024 19:46
@Midnighter
Collaborator Author

@jfy133 did you by any chance test this also without run merging? I'm still a bit in doubt about the meta.id.

Member

@jfy133 jfy133 left a comment

One thing I didn't check is whether Taxpasta is still OK with this (no missing data). I'm assuming that, as it didn't fail, it's OK, but can you confirm from your tests, @Midnighter?

@jfy133
Member

jfy133 commented Nov 6, 2024

Oh haha, just saw your question! will see if I can run again (DB wifi though)

@Midnighter
Collaborator Author

How do you think taxpasta would be affected? I don't see a potential risk right now?

@jfy133
Member

jfy133 commented Nov 6, 2024

Same issue as you were referring to really - file names. I'm stuck on porechop_ABI for some reason, but suspect it's waiting for something to download still :grimace:

@Midnighter
Collaborator Author

The taxpasta module uses a proper prefix definition, so there should be no issues there.

def prefix = task.ext.prefix ?: "${meta.id}"

@jfy133 jfy133 marked this pull request as draft November 11, 2024 08:03
@jfy133
Member

jfy133 commented Nov 11, 2024

This currently doesn't work: when there are SE and PE data in one run and run merging is not performed, the output files collide.

I think it needs a modules.config update so that a KrakenUniq SE run has se appended to the prefix, and a paired-end run has pe (for example).

@Midnighter
Collaborator Author

Unfortunately, can't use module config due to the batching :( but I'll update the pipeline channel logic.

@jfy133
Member

jfy133 commented Nov 11, 2024

Maybe embed it in the module? I think it is reasonable in this case.

@jfy133
Member

jfy133 commented Nov 21, 2024

As discussed on Slack huddle: we don't need to embed in the module, just update the list of prefixes coming in via the new input channel for the module to add .se or .pe accordingly :)
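
As a rough illustration of that idea, the following Python sketch (again not the actual Nextflow channel logic) appends a layout suffix to each prefix. The function name `krakenuniq_prefix`, the `single_end` field, and the exact `<id>.<se|pe>` scheme are assumptions made for this example.

```python
def krakenuniq_prefix(meta):
    """Return an output prefix that encodes the read layout (assumed scheme)."""
    layout = "se" if meta["single_end"] else "pe"
    return f"{meta['id']}.{layout}"


# Hypothetical inputs: the same identifier appears once single-end and once paired-end.
metas = [
    {"id": "2612", "single_end": True},
    {"id": "2612", "single_end": False},
]

# Each prefix now differs by its layout suffix, so output filenames no longer clash.
prefixes = [krakenuniq_prefix(m) for m in metas]
print(prefixes)
```

Because the suffix is computed alongside the reads, the list of prefixes can be passed to the module next to the batched read files without relying on per-module configuration.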

In the rare case that there are two samples with the same identifier and
run accession, one being single-end and the other paired-end, this
change makes the prefix and thus the KrakenUniq output filenames unique.
@Midnighter Midnighter marked this pull request as ready for review November 21, 2024 09:54
@jfy133
Member

jfy133 commented Nov 21, 2024

Nice, OK I will test the following commands:

Normal test, no run merging (about 4-6 profiles)

nextflow run ../main.nf -profile test_krakenuniq,docker --outdir ./results_ku_testinputonly_runmerging --shortread_qc_mergepairs false --save_runmerged_reads

Normal test, with run merging (2 profiles)

nextflow run ../main.nf -profile test_krakenuniq,docker --outdir ./results_ku_testinputonly_norunmerging --shortread_qc_mergepairs false --save_runmerged_reads --perform_runmerging

Then with dots in the IDs, no run merging

nextflow run ../main.nf -profile test_krakenuniq,docker --outdir ./results_ku_dotsinname_norunmerging --input samplesheet_dots_in_ids.csv --shortread_qc_mergepairs false --save_runmerged_reads

Then with dots in the IDs, with run merging

nextflow run ../main.nf -profile test_krakenuniq,docker --outdir ./results_ku_dotsinname_runmerging --input samplesheet_dots_in_ids.csv --shortread_qc_mergepairs false --perform_runmerging --save_runmerged_reads

@Midnighter
Collaborator Author

Midnighter commented Nov 21, 2024

I'll request you to re-review then. You can just approve when you are satisfied with the tests.

@Midnighter Midnighter requested a review from jfy133 November 21, 2024 10:15
@jfy133
Member

jfy133 commented Nov 21, 2024

So those commands worked nicely with no crash as you see below.

I need to try one more test where I merge pairs so SE/PE can be merged together (and considered SE)

(nf-core) james@bionb103:~/git/nf-core/taxprofiler/testing (update-krakenuniq)$ ls -l results_ku_testinputonly_runmerging/krakenuniq/db6/
total 24
-rw-r--r-- 1 james james 427 Nov 21 11:13 2611.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 464 Nov 21 11:13 2612.pe.krakenuniq.report.txt
-rw-r--r-- 1 james james 432 Nov 21 11:12 2612.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 489 Nov 21 11:13 2613.pe.krakenuniq.report.txt
-rw-r--r-- 1 james james 414 Nov 21 11:12 2614.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 445 Nov 21 11:12 ERR3201952.se.krakenuniq.report.txt
(nf-core) james@bionb103:~/git/nf-core/taxprofiler/testing (update-krakenuniq)$ ls -l results_ku_dotsinname_norunmerging/krakenuniq/db6/
total 32
-rw-r--r-- 1 james james 435 Nov 21 11:54 2611.test.in.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 529 Nov 21 11:54 2612.test.in.pe2.pe.krakenuniq.report.txt
-rw-r--r-- 1 james james 456 Nov 21 11:53 2612.test.in.se2.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 454 Nov 21 11:53 2612.test.in.se.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 516 Nov 21 11:54 2612.test.pe1.pe.krakenuniq.report.txt
-rw-r--r-- 1 james james 513 Nov 21 11:54 2613.test.in.pe.krakenuniq.report.txt
-rw-r--r-- 1 james james 414 Nov 21 11:54 2614.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 445 Nov 21 11:53 ERR3201952.se.krakenuniq.report.txt
(nf-core) james@bionb103:~/git/nf-core/taxprofiler/testing (update-krakenuniq)$ ls -l results_ku_dotsinname_runmerging/krakenuniq/db6/
total 32
-rw-r--r-- 1 james james 435 Nov 21 12:04 2611.test.in.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 529 Nov 21 12:05 2612.test.in.pe2.pe.krakenuniq.report.txt
-rw-r--r-- 1 james james 456 Nov 21 12:04 2612.test.in.se2.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 454 Nov 21 12:04 2612.test.in.se.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 516 Nov 21 12:05 2612.test.pe1.pe.krakenuniq.report.txt
-rw-r--r-- 1 james james 513 Nov 21 12:05 2613.test.in.pe.krakenuniq.report.txt
-rw-r--r-- 1 james james 414 Nov 21 12:04 2614.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 445 Nov 21 12:04 ERR3201952.se.krakenuniq.report.txt
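
For anyone wanting to check such listings programmatically, here is a small Python sketch that splits report filenames back into sample prefix and layout. It assumes the `<prefix>.<se|pe>.krakenuniq.report.txt` pattern seen in the listings above; the helper `split_report_name` is hypothetical and not part of the pipeline.

```python
REPORT_SUFFIX = ".krakenuniq.report.txt"


def split_report_name(filename):
    """Split '<prefix>.<se|pe>.krakenuniq.report.txt' into (sample prefix, layout)."""
    if not filename.endswith(REPORT_SUFFIX):
        raise ValueError(f"not a KrakenUniq report: {filename}")
    stem = filename[: -len(REPORT_SUFFIX)]
    # Split on the last dot only, so dots inside the identifier are preserved.
    sample, layout = stem.rsplit(".", 1)
    if layout not in ("se", "pe"):
        raise ValueError(f"unexpected layout suffix: {layout}")
    return sample, layout


# Filenames taken from the listings above.
names = [
    "2612.test.in.se2.se.krakenuniq.report.txt",
    "2612.test.in.se.se.krakenuniq.report.txt",
    "2613.test.in.pe.krakenuniq.report.txt",
]
parsed = [split_report_name(n) for n in names]
print(parsed)
```

Splitting from the right is what keeps identifiers with dots intact, which is the case the `samplesheet_dots_in_ids.csv` runs were meant to exercise.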

@jfy133
Member

jfy133 commented Nov 21, 2024

And that looks good too!

nextflow run ../main.nf -profile test_krakenuniq,docker --outdir ./results_ku_testinputonly_runmerging_readmerge --shortread_qc_mergepairs true --save_runmerged_reads
$ ls -l results_ku_testinputonly_runmerging_readmerge/krakenuniq/db6/
total 20
-rw-r--r-- 1 james james 427 Nov 21 12:47 2611.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 421 Nov 21 12:47 2612.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 432 Nov 21 12:47 2613.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 414 Nov 21 12:47 2614.se.krakenuniq.report.txt
-rw-r--r-- 1 james james 445 Nov 21 12:47 ERR3201952.se.krakenuniq.report.txt

@jfy133 jfy133 merged commit e58a59c into dev Nov 21, 2024
25 checks passed
@jfy133 jfy133 deleted the update-krakenuniq branch November 21, 2024 12:36
@jfy133
Member

jfy133 commented Nov 21, 2024

Thanks @Midnighter !
