MIRA v4.0 de novo assembler does not output a collection for collection input #3
Comments
MIRA will happily take multiple FASTQ inputs, e.g. an organism sequenced with multiple libraries or runs, so yes, the tool does deliberately cater to mapping N input files to one assembly (i.e. "reduce" mode). I'd hope the Galaxy GUI would allow you to deliberately choose to run N copies of MIRA instead (i.e. "map" mode), giving N assemblies. Paging @jmchilton as the Galaxy collections expert.
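To make the two modes above concrete, here is a minimal Python sketch. It is not Galaxy or MIRA code: `assemble`, `run_reduce`, `run_map` and the sample file names are all hypothetical stand-ins, and a real assembly job is replaced by a string label.

```python
from typing import Dict, List


def assemble(read_files: List[str]) -> str:
    """Hypothetical stand-in for one assembler job; returns a label, not contigs."""
    return "assembly(" + "+".join(read_files) + ")"


def run_reduce(collection: Dict[str, str]) -> str:
    """'Reduce' mode: all N inputs feed a single job, giving one assembly."""
    return assemble(list(collection.values()))


def run_map(collection: Dict[str, str]) -> Dict[str, str]:
    """'Map' mode: one job per input, giving N assemblies keyed like the inputs."""
    return {name: assemble([path]) for name, path in collection.items()}


# Assumed example collection: two biologically distinct samples.
samples = {"sampleA": "a.fastq", "sampleB": "b.fastq"}

reduced = run_reduce(samples)  # one result for the whole collection
mapped = run_map(samples)      # one result per collection element
```

In this sketch the "reduce" call returns a single value, while the "map" call returns a structure with the same element identifiers as the input, which is the behaviour the original report expected.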
Thank you for the quick response! Let me also use this opportunity to thank you for writing and maintaining MIRA tools, we use them frequently in our Galaxy instance!
The MIRA wrapper is collection aware (via I've not had a chance to play with the interface to confirm how you'd run MIRA to get N jobs from N inputs, but I would expect the collections input control allowed that.
There are two different concepts here that are clashing I think. You are referring probably to the So both approaches are correct but they imply different UX. One solution would be to remove
I think this is a Galaxy UI limitation for tools with I'd like to try this locally with Spades - @tshtatland which Spades wrapper are you using? Can you tell me the Tool Shed URL (as there are at least two different wrappers available)?
Correct - it cannot currently. There was an issue in Trello, but I could not find one on GitHub for this, so I've created galaxyproject/galaxy#4623. I included workarounds you can add to MIRA if you want the tool to support both modes of operation. Certainly some Galaxy developers would discourage those workarounds - but I tend to be a bit more pragmatic I think.
Thanks John - this is a very difficult set of concepts to convey to the user, so I can understand why it isn't in the current Galaxy UI. The suggestions you've given for workarounds make sense, but would I think break backward compatibility with the current versions of the MIRA wrapper. Given that, it would be nice for me to take more direct advantage of the paired collection infrastructure as part of any changes to the input handling.
This is the spades tool that corresponds to the screenshot above:
When used in a workflow, the MIRA v4.0 de novo assembler does not output a collection, which is what I would expect when the input is a collection. I am attaching the workflow screenshot with red and green arrows that highlight the issue. I am using the latest version of the tool on a local Galaxy installation v.17.05:
MIRA v4.0 de novo assember Takes Sanger, Roche 454, Solexa/Illumina, Ion Torrent and PacBio reads (Galaxy Version 0.0.11)
Galaxy Tool Shed - https://toolshed.g2.bx.psu.edu/repository?repository_id=efe8c48b382cb9cc&changeset_revision=1713289d9908
I expected (perhaps incorrectly) most Galaxy tools, such as the MIRA assembler, to be designed so that a collection input (N fastq/fasta files, each with millions of reads) produces a collection output (N fasta files, each with a small number of contigs), as shown in the screenshot. N is the number of biologically distinct samples/libraries. From my point of view, most Galaxy tools should not "reduce" their inputs. The reducing step should probably be done later by a simple, separate reducing tool; otherwise the combined tool is not collection-friendly. I wonder if this naive view makes sense... Thank you!