Add DAS_Tool binning refinement #291
Looks good, but do I understand correctly that the non-refined bins, rather than the refined ones, will be used in downstream analysis (Prokka, QUAST, BUSCO, GTDB-tk, CAT)? I would hope that when I choose to run DAS_Tool, the refined bins are used for downstream analysis. Or would too many bins get lost? Is there a reason not to use the refined bins?
just realised, it's currently not handled when |
Good catch! Updating |
Looks good to me but I have no time at the moment to run it and have a look at output file location/content.
Hi @jfy133 , why did you revert merging |
Not sure, I guess it broke something and I forgot to merge it back in 🤦‍♀️. Will fix it after this meeting! Sorry! |
… input channels accordingly
…_depths_summary.tsv
Hi @jfy133 and @d4straub, I continued work on this PR, it would be great if you could have a look again and let me know what you think. Most important changes:
I hope there wasn't a reason that I forgot and didn't see, e.g. causing issues for downstream processes, why |
So RENAME_PREDASTOOL is not in the modules.config? Seems fine to me, just want to confirm that this is intended.
Looks good to me, but I did not run the code and I lost track of all the changes in this PR, meaning I might have missed something.
Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>
Thanks @d4straub for reviewing! Yes, no files from |
Some comments from me: For the workflow diagram, is QUAST meant to be abundance estimation and visualisation? I would flip it if so. But I would assume QUAST is normally for evaluation, in which case it's not clear what the Estimation/Visualisation label refers to. (I have ideas to make more space if you need.)
This is OK to me, the only real problem there is you would have a file name mismatch between the
Whoops, good catch!
I originally kept that in as IIRC the unbinned files/output files from
I think if MAG_DEPTHS can cope with this now, that's OK. I didn't set the binner as everything for I will test the pipeline now to make sure everything looks OK from my perspective, but otherwise I think everything looks OK from my end now! |
Co-authored-by: James A. Fellows Yates <jfy133@gmail.com>
True, it's a bit confusing. "Abundance estimation and visualisation" is done by custom scripts within the pipeline, i.e. there is no tool name we could list for the figure as for the other parts (same holds for "MAG summary").
Ah true! OK, sorry I missed that you now enumerate the input bin files. So with
For the input for
Good that you are bringing this up, maybe I understood it wrong. Why is this conceptually incorrect? Isn't the |
Ahh I see. Maybe you could wrap in brackets or something? But maybe not so important.
Ah wait, you're right actually. I got confused with the DASTOOL-POSTRENAMING (where I originally did the renaming), so this isn't so bad at that stage. (I forgot that I moved where we did that 😅). Yes, the I think the least damaging compromise is to leave it as you have it now, and we just document it. I doubt many people would look at
Oops missed that
Yes, that is also true... I was thinking more from a file-based perspective, as in they just 'update' the existing FASTA files and retain the same file name. I had thought I hadn't seen any of the FASTAs being entirely removed (for example), but maybe I misremembered here (I will check in the test I'm running now). OK, then overall I think you have corrected my misunderstandings, so I think everything looks OK now 👍 - I'll let you know on Slack if the tests run as I expect, and then you can ... I guess approve the PR on behalf of both of us 😅 👍 |
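One quick way to check whether refinement drops any bin FASTA entirely is to compare file names before and after. The sketch below is only an illustration: the function name, the directory arguments and the `.fa` extension are hypothetical and not taken from the pipeline, and it assumes refined bins keep their input file names, which this thread does not confirm.

```python
from pathlib import Path


def dropped_bins(before_dir, after_dir, ext=".fa"):
    """Return sorted names of bin FASTA files present before refinement but absent after.

    Both directories and the extension are placeholders (assumptions),
    not paths or conventions used by the pipeline itself.
    """
    before = {p.name for p in Path(before_dir).glob(f"*{ext}")}
    after = {p.name for p in Path(after_dir).glob(f"*{ext}")}
    # Set difference: names that existed before refinement but not after.
    return sorted(before - after)
```

An empty list would mean every input bin still has a same-named file after refinement; it says nothing about whether the contents changed.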
This may be because MAXBIN fails:
MaxBin2 wasn't as reliable as MetaBAT2, if I remember correctly what my colleague told me |
just for completeness here: for the |
Co-authored-by: James A. Fellows Yates <jfy133@gmail.com>
PR checklist
nf-core lint
nextflow run . -profile test,docker
docs/usage.md is updated.
docs/output.md is updated.
CHANGELOG.md is updated.
README.md is updated (including new tool citations and authors/contributors).