Pipeline - draft1 #1

Draft: wants to merge 45 commits into base: dev

Conversation

prototaxites
Contributor

@prototaxites prototaxites commented Nov 26, 2024

Adds:

  • Input from YAML
  • Assembly
    • metaMDBG
    • gene prediction with pyrodigal
    • assembly statistics/num circular
  • Read mapping to assembly - Hi-C and PacBio
  • Binning with
    • metabat2
    • maxbin2
    • bin3c
    • metator
  • Bin refinement with
    • magscot
    • dastool
  • Bin QC
    • Checkm
    • Seqkit statistics
    • prokka - count tRNAs
  • Taxonomy
    • GTDBTk
    • TaxonKit - get NCBI taxids
  • Summary
    • assembly statistics
    • bin statistics and scoring

Jim Downie and others added 9 commits November 14, 2024 14:23
- Add sketch binning subworkflow
- Map reads + hic to reference
- Patch read mapping modules so they output the meta object
from the reference
- Calculate depths
- binning workflow has metabat2, maxbin2 and bin3c
- bin3c modules
- read hic enzymes input from YAML
- move hic cram preprocessing to new PREPARE_DATA workflow

Minor updates:
- Patch MaxBin2 to emit ungzipped fasta
Skeleton of binning with Metator - needs uncontaminated container to
    continue.
    - Bin refinement with Magscot

Fixes:
    - Binning with Metator now functional
@prototaxites prototaxites self-assigned this Nov 26, 2024
@prototaxites prototaxites marked this pull request as ready for review November 26, 2024 15:52
Merge pull request #1 from prototaxites/main
prototaxites added a commit that referenced this pull request Nov 27, 2024
@prototaxites prototaxites changed the title Pipeline skeleton - input, assembly, binning and bin refinement Pipeline - draft1 Nov 28, 2024
@prototaxites prototaxites marked this pull request as draft December 6, 2024 11:43
@sanger-tol sanger-tol deleted a comment from github-actions bot Dec 9, 2024
  - prefix all gawk processes with GAWK_ for clarity
  - Add taxonkit module to convert NCBI names to taxids
  - Operate on amino acid bins in CheckM and GTDB-Tk
    - to facilitate this, contig2bintofasta module now
    works in regex mode when pyrodigal fasta is provided
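
A minimal sketch of what a "regex mode" lookup could look like, not the module's actual code. It assumes protein IDs follow the Prodigal/Pyrodigal convention `<contig_id>_<gene_index>`, so the parent contig can be recovered by stripping the trailing index. The function name and the example IDs are hypothetical.

```python
import re

# Assumption: protein IDs look like "<contig_id>_<gene_index>", e.g. "ctg12_3".
# The greedy ".+" keeps any underscores that belong to the contig name itself.
GENE_SUFFIX = re.compile(r"^(?P<contig>.+)_\d+$")


def assign_proteins_to_bins(protein_ids, contig2bin):
    """Map each protein ID to the bin of its parent contig.

    Proteins without a gene-index suffix, or whose parent contig
    is not in contig2bin, are skipped.
    """
    protein2bin = {}
    for pid in protein_ids:
        match = GENE_SUFFIX.match(pid)
        if match is None:
            continue
        bin_id = contig2bin.get(match.group("contig"))
        if bin_id is not None:
            protein2bin[pid] = bin_id
    return protein2bin
```

Matching on the full contig name plus a numeric suffix avoids confusing `ctg1` with `ctg10`, which a plain prefix match would do.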

github-actions bot commented Dec 11, 2024

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 9d3f933

+| ✅ 193 tests passed       |+
#| ❔  33 tests were ignored |#
!| ❗  11 tests had warnings |!

❗ Test warnings:

  • readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
  • pipeline_todos - TODO string in README.md: Include a figure that guides the user through the major workflow steps. Many nf-core
  • pipeline_todos - TODO string in README.md: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file.
  • pipeline_todos - TODO string in README.md: Add bibliography of tools and data used in your pipeline
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in output.md: Write this documentation describing your workflow's output
  • pipeline_todos - TODO string in usage.md: Add documentation about anything specific to running your pipeline. For general topics, please point to (and add to) the main nf-core website.
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • system_exit - System.exit in WorkflowTreeVal.groovy: System.exit(1) [line 17]

❔ Tests ignored:

  • files_exist - File is ignored: .github/ISSUE_TEMPLATE/config.yml
  • files_exist - File is ignored: conf/igenomes_ignored.config
  • files_exist - File is ignored: conf/igenomes.config
  • files_exist - File is ignored: assets/email_template.html
  • files_exist - File is ignored: assets/sendmail_template.txt
  • files_exist - File is ignored: assets/email_template.txt
  • files_exist - File is ignored: CODE_OF_CONDUCT.md
  • files_exist - File is ignored: assets/nf-core-longreadmag_logo_light.png
  • files_exist - File is ignored: docs/images/nf-core-longreadmag_logo_light.png
  • files_exist - File is ignored: docs/images/nf-core-longreadmag_logo_dark.png
  • files_exist - File is ignored: .github/workflows/awstest.yml
  • files_exist - File is ignored: .github/workflows/awsfulltest.yml
  • nextflow_config - Config variable ignored: manifest.name
  • nextflow_config - Config variable ignored: manifest.homePage
  • nextflow_config - Config variable ignored: process.cpus
  • nextflow_config - Config variable ignored: process.memory
  • nextflow_config - Config variable ignored: process.time
  • nextflow_config - Config variable ignored: validation.help.beforeText
  • nextflow_config - Config variable ignored: validation.help.afterText
  • nextflow_config - Config variable ignored: validation.summary.beforeText
  • nextflow_config - Config variable ignored: validation.summary.afterText
  • files_unchanged - File ignored due to lint config: CODE_OF_CONDUCT.md
  • files_unchanged - File ignored due to lint config: .github/CONTRIBUTING.md
  • files_unchanged - File ignored due to lint config: .github/ISSUE_TEMPLATE/bug_report.yml
  • files_unchanged - File does not exist: .github/ISSUE_TEMPLATE/config.yml
  • files_unchanged - File does not exist: assets/email_template.html
  • files_unchanged - File does not exist: assets/email_template.txt
  • files_unchanged - File does not exist: assets/sendmail_template.txt
  • files_unchanged - File ignored due to lint config: assets/nf-core-longreadmag_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-longreadmag_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-longreadmag_logo_dark.png
  • files_unchanged - File ignored due to lint config: .gitignore or .prettierignore
  • actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/longreadmag/longreadmag/.github/workflows/awstest.yml

✅ Tests passed:

Run details

  • nf-core/tools version 3.0.2
  • Run at 2024-12-20 15:25:15

@sanger-tolsoft

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.0.2.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

Jim Downie added 7 commits December 13, 2024 21:06
    - bin summary script and module
    - Add parameter for GTDBTk mash db
    - remove AA bins as they didn't meaningfully speed up downstream
      processes
    - Fix linting
    - Add descriptions in json schema for all parameters and groups
    - switch DASTOOL fastatocontig2bin to local module (faster)
    - patch checkm2/predict module to rename output tsv file
    - refactor bin refinement subwf to remove duplicated steps
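
For reference, the FASTA-to-contig2bin conversion mentioned above can be sketched in a few lines. This is an illustration under stated assumptions, not the local module itself: it assumes one FASTA file per bin, takes the bin ID from the file name without its final extension, and writes the tab-separated `contig<TAB>bin` table that DAS Tool reads. Both function names are hypothetical.

```python
from pathlib import Path


def fasta_to_contig2bin(bin_fastas):
    """Yield (contig_id, bin_id) pairs from per-bin FASTA files.

    The bin ID is the file name without its final extension; the contig ID
    is the first whitespace-delimited token of each header line.
    """
    for fasta in map(Path, bin_fastas):
        bin_id = fasta.stem
        with fasta.open() as handle:
            for line in handle:
                if line.startswith(">"):
                    tokens = line[1:].split()
                    if tokens:  # ignore empty headers
                        yield tokens[0], bin_id


def write_contig2bin(bin_fastas, out_path):
    """Write a tab-separated contig-to-bin table, one contig per line."""
    with open(out_path, "w") as out:
        for contig_id, bin_id in fasta_to_contig2bin(bin_fastas):
            out.write(f"{contig_id}\t{bin_id}\n")
```

Only header lines are inspected, so the cost is a single pass over each file with no sequence parsing.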
…bin summary process; refactor bin summary code
- Add nf-test skeleton code
- Change Metator input to BAM files
- Update conda-checking code in bin3c modules
- Add parameters to choose minimum contig size and minimum map % identity
  when binning
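
As a rough illustration of what the two thresholds control, the sketch below filters contigs by length and alignments by percent identity. It is not the pipeline's implementation: the function names, the `(read_name, aligned_bases, edit_distance)` tuple layout, and the identity definition (aligned bases minus edit distance, over aligned bases) are all assumptions made for the example, and the binning tools may compute identity differently.

```python
def percent_identity(aligned_bases, edit_distance):
    """Percent identity as (aligned - edits) / aligned * 100.

    Assumption: edit_distance counts mismatches and indels over the
    aligned region, as an NM-style tag would.
    """
    if aligned_bases <= 0:
        return 0.0
    return 100.0 * (aligned_bases - edit_distance) / aligned_bases


def filter_contigs(contig_lengths, min_contig_size):
    """Keep contigs whose length is at least min_contig_size."""
    return {
        name: length
        for name, length in contig_lengths.items()
        if length >= min_contig_size
    }


def filter_alignments(alignments, min_identity):
    """Keep (read_name, aligned_bases, edit_distance) tuples at or above min_identity."""
    return [
        aln for aln in alignments
        if percent_identity(aln[1], aln[2]) >= min_identity
    ]
```

Raising either threshold trades sensitivity for cleaner depth estimates: short contigs and low-identity mappings are the ones most likely to add noise to coverage-based binning.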