Add CAT_SUMMARY process and offical_taxonomy param #366
@nf-core-bot fix linting
Finally got a chance to test this on our cluster - all working fine now.
Looks good to me.
I also restarted test runs because they seem to have failed due to connectivity problems. Linting is expected to fail because of the template update.
If all tests pass (except linting) but you cannot merge the PR, ping me and I'll do it.
@d4straub All passing except linting as well - could you merge? Many thanks!
Added a CAT_SUMMARY process to summarise the output of CAT as a single final file, as currently there is a single output per assembly group (a lot if you have many single-sample assemblies!). I wanted to re-use the COMBINE_TSV process (as I essentially re-wrote it before realising it exists), but the output files from CAT are gzipped so I've cloned it and added an ungzipping line, rather than modify either the CAT output or push all the files through the GUNZIP process. I don't like duplicating code but this seemed like the simplest solution.
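For anyone skimming, here is a rough Python sketch of the idea behind the summary step: read each gzipped per-assembly table, keep the header once, and write one combined uncompressed TSV. This is only an illustration and not the code in the PR. The function name `combine_gzipped_tsvs` and the assumption that every input file starts with the same single header line are hypothetical.

```python
import gzip
from pathlib import Path


def combine_gzipped_tsvs(paths, out_path):
    """Concatenate gzipped TSV files into one plain TSV with a single header.

    Assumes (hypothetically) that every file starts with the same header line.
    """
    header_written = False
    with open(out_path, "w") as out:
        # Sort so the combined file is ordered the same way on every run.
        for path in sorted(paths):
            with gzip.open(path, "rt") as handle:
                header = handle.readline()
                if not header:
                    continue  # empty file: nothing to add
                if not header_written:
                    out.write(header if header.endswith("\n") else header + "\n")
                    header_written = True
                for line in handle:
                    # Guard against a missing final newline in an input file.
                    out.write(line if line.endswith("\n") else line + "\n")
    return Path(out_path)
```

Decompressing while reading is what avoids sending every file through a separate gunzip step first, which is the trade-off described above.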
Also added an option to make CAT only output 'official' taxonomy, i.e. Kingdom, Phylum, etc. - I noticed this was available in the CAT documentation and thought it would be useful, as currently munging the taxonomy with many mismatched and empty fields is quite annoying!
Haven't run any tests yet as I've got a pipeline running on our cluster where I have the database downloaded (I don't think CAT will run on Gitpod?), but will un-draft this once I've had a chance to run a test/if anyone else wants to run a quick test.
PR checklist
- Code lints (nf-core lint).
- Test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
- docs/usage.md is updated.
- docs/output.md is updated.
- CHANGELOG.md is updated.
- README.md is updated (including new tool citations and authors/contributors).