Binning & bin refinement extension #252
Comments
At the moment I think I would split into two PRs:
binning
Looks good. Some remarks that you might or might not be aware of:
I agree with using nf-core modules as much as possible, and I also agree that splitting up the metabat binning into several processes is a good plan. Currently, nf-core/mag uses local modules almost exclusively because we did the DSL2 conversion before modules were set up in a stable way. Also, please try to implement the function of --binning_map_mode for MaxBin2 as well.
Thanks for the feedback @d4straub, I'll take everything into consideration 👍 A bit crappy about missing the 4 Mbp contig indeed, so I'll definitely look into how to deal with that.
Step one will be in #263
One bin refinement tool to add to the list would be metawrap (the
Update: we discovered checkM doesn't (and most likely won't for a long time) work nicely with containers, because it requires you to modify a system-level file to point the tool to the location of a file [I only just saw the whole checkM -> BUSCO conversation back in the DSL1 days 🤦♀]. Given that the second half of the refinement workflow described in the OP relies heavily on this, as does metawrap, we've decided it's not worth implementing it here, as it doesn't make sense to run polyMut/gunc on unrefined low-quality bins. I'll look back at adding DAS_Tool anyway as an option, but I will stop there.
Completed in #291
In addition to the aDNA-specific extension, we would like to include some expansion of the binning options, as well as bin refinement.
This was originally written as a snakemake workflow by @alexhbnr, but it would fit in quite well here. So I would start writing this proposal already, and probably ship it over a couple of PRs.
I'm leaving the following diagrams here for discussion, and I'll also add personal 'dev notes' as I start preparing the new workflow proposals:
From @alexhbnr
Dev Notes
2021-11-25
depths files for binning only partially implemented. Can't work out how to dynamically get the meta.assembler in there yet due to $suffix behaviour. Gregor/Mahesh/Harshil have posted suggestions on #DSL2-transition on Slack
2021-11-29
2021-12-02
.transpose and groupTuple
unbinned to be exported, but having issues conditionally exporting them
2021-12-07
mag/, not sure why
2021-12-07
mag ...
2022-01-13
mag_depths not publishing in the right place issue (had commented out the addParams DSL2 v1 syntax; reactivating it made it work as expected)
depth.txt.gz file is not being published (and a couple of other files).
unbinned directory is published in SPLIT_FASTQ, so it's overwriting it... 🤦♀ [RESOLVED - all files now published, could consider matching publication directories again]
2022-02-09
2022-02-11
Depths working, both binners running
Problem with publishing discarded/unbinned output from split_fastq back into the respective binner directories
doesn't work
Need to check why MAG_DEPTHS isn't executing [working now with extra assemblers and output looks the same]
Currently testing with BUSCO deactivated, need to test with it on
2022-02-18
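For anyone reading the 2021-12-02 note: it names .transpose and groupTuple without saying how they were used, so the following is only a rough sketch of what those two Nextflow channel operators do in their simplest form. It is a plain Python stand-in, not Nextflow code and not the code in the pipeline. The function names, assembler names and bin file names are all hypothetical. Also note that Nextflow does not guarantee emission order for groupTuple, whereas this sketch keeps first-seen order.

```python
from collections import defaultdict


def group_tuple(items):
    """Collect (key, value) pairs into (key, [values]), keyed on the first element.

    Mimics the basic behaviour of groupTuple; keys keep first-seen order here.
    """
    grouped = defaultdict(list)
    for key, value in items:
        grouped[key].append(value)
    return [(key, values) for key, values in grouped.items()]


def transpose(items):
    """Expand each (key, [values]) back into one (key, value) pair per value.

    Mimics the basic behaviour of transpose on a two-element tuple.
    """
    return [(key, value) for key, values in items for value in values]


# Hypothetical per-assembler bin files (illustration only)
bins = [
    ("MEGAHIT", "bin.1.fa"),
    ("SPAdes", "bin.1.fa"),
    ("MEGAHIT", "bin.2.fa"),
]

grouped = group_tuple(bins)   # one entry per assembler, values collected in a list
flat = transpose(grouped)     # back to one entry per bin file

print(grouped)
print(flat)
```

In this simple case the two operations are inverses up to ordering: grouping collects everything that shares a key, and transposing spreads it back out with the key repeated on every item.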