Run CAT on unbinned contigs as well as binned contigs #385
Did you test that? Maybe I misunderstood, but doesn't your change mean that all unbinned contigs (potentially millions) are also classified by CAT? If I am right, CAT will be unusable for larger (and not very well assembled and binned) assemblies, because it will take roughly 1000x longer or more. This would need to be optional, or restricted to contigs above an adjustable length threshold, or similar.
On the test data yes, not on real data (I don't have any at hand). Note that 'unbinned' here means the contigs considered 'OK' by split_fasta, so a certain level of filtering is already applied. It's not just any random contig. I could make it optional, but if this is not sufficient, maybe we should discuss it on #125 before I continue?
I see, thanks, I didn't make the connection! Indeed, then "only" contigs that are above
LGTM!
Correct! Basically the same input as for e.g. BUSCO and other downstream tools :)
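The pre-filtering discussed in this thread can be sketched roughly as follows. This is a hypothetical illustration only, not the pipeline's actual split_fasta code: the function names, the `min_length` parameter and its default of 1000 bp are all assumptions, and it assumes the filter is a simple minimum-length cut-off, which the thread suggests but does not spell out.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (contig_id, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_unbinned(
    lines: Iterable[str], min_length: int = 1000  # hypothetical default
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split unbinned contigs into (kept, dropped) by a length cut-off.

    Only the `kept` contigs would be passed on to a classifier such as CAT.
    """
    kept: Dict[str, str] = {}
    dropped: Dict[str, str] = {}
    for name, seq in read_fasta(lines):
        target = kept if len(seq) >= min_length else dropped
        target[name] = seq
    return kept, dropped
```

With a cut-off like this, the number of contigs reaching the classifier scales with the number of sufficiently long unbinned contigs rather than with the total contig count, which is the runtime concern raised above.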
Closes #125
PR checklist
- Make sure your code lints (`nf-core lint`).
- Ensure the test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
- `docs/usage.md` is updated.
- `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).