metabat2 "[Error!] the order of contigs in abundance file is not the same as the assembly file" #58

Closed
jfourquet2 opened this issue Jun 30, 2020 · 10 comments

Comments

@jfourquet2

Hi,
I tested mag on two samples 3 months ago and the whole pipeline ran fine.
Yesterday I wanted to re-test mag on two other samples and hit this issue, but only for metabat2 with the MEGAHIT assembly (everything is fine for the SPAdes assembly).
It seems that the "first" and "second" files are swapped in the metabat command line, and that there is a further issue with the order of contigs.
I have just re-run mag with the latest Nextflow nf-core version (nfcore-Nextflow-v20.01.0) to see whether the issue persists.
Do you have a solution for this issue?
Thanks a lot in advance!

ERROR ~ Error executing process > 'metabat (MEGAHIT-first)'

Caused by:
  Process `metabat (MEGAHIT-first)` terminated with an error exit status (1)

Command executed:

  jgi_summarize_bam_contig_depths --outputDepth depth.txt MEGAHIT-first-first.bam MEGAHIT-first-second.bam
  metabat2 -t "8" -i "second.contigs.fa" -a depth.txt -o "MetaBAT2/MEGAHIT-first" -m 1500
  
  #if bin folder is empty
  if [ -z "$(ls -A MetaBAT2)" ]; then 
      cp second.contigs.fa MetaBAT2/MEGAHIT-second.contigs.fa
  fi

Command exit status:
  1

Command output:
  MetaBAT 2 (v2.13 (Bioconda)) using minContig 1500, minCV 1.0, minCVSum 1.0, maxP 95%, minS 60, and maxEdges 200. 

Command error:
  Output depth matrix to depth.txt
  jgi_summarize_bam_contig_depths 2.13 (Bioconda) 2019-06-11T06:53:12
  Output matrix to depth.txt
  0: Opening bam: MEGAHIT-first-first.bam
  1: Opening bam: MEGAHIT-first-second.bam
  Processing bam files
  Thread 0 finished: MEGAHIT-first-first.bam with 52840392 reads and 51019403 readsWellMapped
  Thread 1 finished: MEGAHIT-first-second.bam with 55185564 reads and 53195312 readsWellMapped
  Creating depth matrix file: depth.txt
  Closing most bam files
  Closing last bam file
  Finished
  [Error!] the order of contigs in abundance file is not the same as the assembly file: k141_0
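
For anyone landing here with the same error, a quick way to see where the two files diverge is to compare the contig names in the assembly FASTA with the first column of depth.txt. The sketch below is not MetaBAT2's own check. It assumes depth.txt is tab-separated with one header row and the contig name in the first column, and that a contig name is the first whitespace-delimited token of a FASTA header. The function names are made up for illustration.

```python
from typing import Iterable, List, Optional, Tuple


def fasta_contig_names(lines: Iterable[str]) -> List[str]:
    """Return contig names (first token of each '>' header) in file order."""
    names = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.append(fields[0])
    return names


def depth_contig_names(lines: Iterable[str]) -> List[str]:
    """Return the first column of the depth table, skipping the header row.

    Assumption: tab-separated, exactly one header row.
    """
    rows = [line for line in lines if line.strip()]
    return [row.split("\t")[0].strip() for row in rows[1:]]


def first_order_mismatch(
    fasta_names: List[str], depth_names: List[str]
) -> Optional[Tuple[int, Optional[str], Optional[str]]]:
    """Return (index, name_in_fasta, name_in_depth) at the first difference.

    Returns None when both lists are identical; a missing entry in the
    shorter list is reported as None.
    """
    for index in range(max(len(fasta_names), len(depth_names))):
        in_fasta = fasta_names[index] if index < len(fasta_names) else None
        in_depth = depth_names[index] if index < len(depth_names) else None
        if in_fasta != in_depth:
            return index, in_fasta, in_depth
    return None


# Example usage with the file names from the failing command:
# with open("second.contigs.fa") as fasta, open("depth.txt") as depth:
#     print(first_order_mismatch(fasta_contig_names(fasta),
#                                depth_contig_names(depth)))
```

A mismatch at index 0 would fit the log above, where the error is raised for k141_0 and the BAM files named MEGAHIT-first-* are paired with second.contigs.fa.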

@d4straub
Collaborator

Please have a look at #32; that's the identical issue.
The problem can be solved by either (1) resuming the pipeline with -resume, which usually seems to help, or (2) using the dev branch (-r dev), which contains a fix.

I'm closing this because it's a duplicate.

@jfourquet2
Author

Thanks a lot!

@jfourquet2
Author

For information, I re-ran my initial script with mag 1.0.0 (not the dev version) and it works well with nfcore-Nextflow-v20.01.0 (my previous run used nfcore-Nextflow-v19.04.0).
The run with the dev version and nfcore-Nextflow-v19.04.0 finished with an error:

ERROR ~ Error executing process > 'quast_bins (SPAdes-mock_first)'

Caused by:
  Process `quast_bins (SPAdes-mock_first)` terminated with an error exit status (4)

Command executed:

  ASSEMBLIES=$(echo "SPAdes-mock_first.1.fa SPAdes-mock_first.10.fa SPAdes-mock_first.11.fa SPAdes-mock_first.12.fa SPAdes-mock_first.13.fa SPAdes-mock_first.14.fa SPAdes-mock_first.15.fa SPAdes-mock_first.16.fa SPAdes-mock_first.17.fa SPAdes-mock_first.18.fa SPAdes-mock_first.19.fa SPAdes-mock_first.2.fa SPAdes-mock_first.20.fa SPAdes-mock_first.21.fa SPAdes-mock_first.22.fa SPAdes-mock_first.23.fa SPAdes-mock_first.24.fa SPAdes-mock_first.25.fa SPAdes-mock_first.26.fa SPAdes-mock_first.27.fa SPAdes-mock_first.28.fa SPAdes-mock_first.29.fa SPAdes-mock_first.3.fa SPAdes-mock_first.30.fa SPAdes-mock_first.31.fa SPAdes-mock_first.32.fa SPAdes-mock_first.33.fa SPAdes-mock_first.34.fa SPAdes-mock_first.35.fa SPAdes-mock_first.36.fa SPAdes-mock_first.37.fa SPAdes-mock_first.38.fa SPAdes-mock_first.39.fa SPAdes-mock_first.4.fa SPAdes-mock_first.40.fa SPAdes-mock_first.5.fa SPAdes-mock_first.6.fa SPAdes-mock_first.7.fa SPAdes-mock_first.8.fa SPAdes-mock_first.9.fa SPAdes-mock_first.lowDepth.fa SPAdes-mock_first.tooShort.fa SPAdes-mock_first.unbinned.pooled.fa" | sed 's/[][]//g')
  IFS=', ' read -r -a assemblies <<< "$ASSEMBLIES"
  
  for assembly in "${assemblies[@]}"; do
      metaquast.py --threads "1" --max-ref-number 0 --rna-finding --gene-finding -l "${assembly}" "${assembly}" -o "QUAST/${assembly}"
      if ! [ -f "QUAST/SPAdes-mock_first-quast_summary.tsv" ]; then 
          cp "QUAST/${assembly}/transposed_report.tsv" "QUAST/SPAdes-mock_first-quast_summary.tsv"
      else
          tail -n +2 "QUAST/${assembly}/transposed_report.tsv" >> "QUAST/SPAdes-mock_first-quast_summary.tsv"
      fi
  done

Command exit status:
  4

Command output:
  
  NOTICE: Genes are not predicted by default. Use --gene-finding or --glimmer option to enable it.
  
  2020-07-01 09:00:35
  Running Barrnap...
  Logging to QUAST/SPAdes-mock_first.9.fa/predicted_genes/barrnap.log...
      Ribosomal RNA genes = 0
      Predicted genes (GFF): QUAST/SPAdes-mock_first.9.fa/predicted_genes/SPAdes-mock_first-9-fa.rna.gff
  Done.
  
  2020-07-01 09:00:39
  Creating large visual summaries...
  This may take a while: press Ctrl-C to skip this step..
    1 of 2: Creating Icarus viewers...
    2 of 2: Creating PDF with all tables and plots...
  Done
  
  2020-07-01 09:00:39
  RESULTS:
    Text versions of total report are saved to QUAST/SPAdes-mock_first.9.fa/report.txt, report.tsv, and report.tex
    Text versions of transposed total report are saved to QUAST/SPAdes-mock_first.9.fa/transposed_report.txt, transposed_report.tsv, and transposed_report.tex
    HTML version (interactive tables and plots) is saved to QUAST/SPAdes-mock_first.9.fa/report.html
    PDF version (tables and plots) is saved to QUAST/SPAdes-mock_first.9.fa/report.pdf
    Icarus (contig browser) is saved to QUAST/SPAdes-mock_first.9.fa/icarus.html
    Log is saved to QUAST/SPAdes-mock_first.9.fa/quast.log
  
  Finished: 2020-07-01 09:00:39
  Elapsed time: 0:00:05.786084
  NOTICEs: 2; WARNINGs: 2; non-fatal ERRORs: 0
  
  Thank you for using QUAST!
  /opt/conda/envs/nf-core-mag-1.1.0dev/lib/python3.6/site-packages/quast-5.0.2-py3.6.egg-info/scripts/metaquast.py --threads 1 --max-ref-number 0 --rna-finding --gene-finding -l SPAdes-mock_first.lowDepth.fa SPAdes-mock_first.lowDepth.fa -o QUAST/SPAdes-mock_first.lowDepth.fa
  
  Version: 5.0.2
  
  System information:
    OS: Linux-3.10.0-514.26.2.el7.x86_64-x86_64-with-debian-10.1 (linux_64)
    Python version: 3.6.7
    CPUs number: 64
  
  Started: 2020-07-01 09:00:39
  
  Logging to QUAST/SPAdes-mock_first.lowDepth.fa/metaquast.log
  
  Contigs:
    Pre-processing...
  WARNING: Skipping SPAdes-mock_first.lowDepth.fa because it doesn't contain contigs >= 0 bp.
  
  
  ERROR! None of the assembly files contains correct contigs. Please, provide different files or decrease --min-contig threshold.

I will re-run the dev version with nfcore-Nextflow-v20.01.0, and I think it will be OK!

@d4straub
Collaborator

d4straub commented Jul 2, 2020

Thanks a lot for your feedback!

As an immediate workaround, you can try re-running your command with -c quast.config -resume appended, where quast.config contains:

process {
  withName: quast_bins {
    errorStrategy = { task.exitStatus in [143,137] ? 'retry' : 'ignore' }
  }
}
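
To make the effect of that closure explicit, here is a small Python mirror of its logic. It is only an illustration of the decision rule, not Nextflow code, and the function name is made up. Exit statuses 137 and 143 correspond to 128 + SIGKILL (9) and 128 + SIGTERM (15), which is what a task typically returns when it is killed from outside, for example by a scheduler.

```python
def error_strategy(exit_status: int) -> str:
    """Mirror of the errorStrategy closure: retry on 143/137, otherwise ignore."""
    return "retry" if exit_status in (143, 137) else "ignore"


# Nextflow only evaluates the closure for a failed task, so these are
# all non-zero exit statuses.
for status in (4, 137, 143):
    print(status, error_strategy(status))
```

With this rule the quast_bins failure above (exit status 4) maps to 'ignore', so the pipeline carries on without that task's output instead of stopping.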

However, this obviously points to a deeper problem: the files SPAdes-mock_first.lowDepth.fa, SPAdes-mock_first.tooShort.fa and SPAdes-mock_first.unbinned.pooled.fa are not supposed to be handled by this process.
This might be caused by a recent change. Could you please tell me which exact commit you used?
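
The intended behaviour can be sketched as a filter applied before QUAST is called. This is only an illustration of the idea, not the pipeline's actual fix. The suffix list is taken from the three file names above, and both helper names are hypothetical.

```python
from typing import Iterable, List

# Suffixes of the auxiliary files named in this thread (assumed to be complete).
EXCLUDED_SUFFIXES = (".lowDepth.fa", ".tooShort.fa", ".unbinned.pooled.fa")


def bins_for_quast(filenames: Iterable[str]) -> List[str]:
    """Keep *.fa files that are not lowDepth/tooShort/unbinned leftovers."""
    return [
        name
        for name in filenames
        if name.endswith(".fa") and not name.endswith(EXCLUDED_SUFFIXES)
    ]


def has_contigs(lines: Iterable[str]) -> bool:
    """True if a FASTA (given as an iterable of lines) has at least one record."""
    return any(line.startswith(">") for line in lines)


candidates = [
    "SPAdes-mock_first.1.fa",
    "SPAdes-mock_first.9.fa",
    "SPAdes-mock_first.lowDepth.fa",
    "SPAdes-mock_first.tooShort.fa",
    "SPAdes-mock_first.unbinned.pooled.fa",
]
print(bins_for_quast(candidates))
```

The has_contigs helper can be used as an extra guard, for example `with open(path) as handle: has_contigs(handle)`, so that an empty FASTA such as the lowDepth file in the log never reaches metaquast.py.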

@skrakau, might this be an issue that still needs to be resolved, or is it already solved?

@d4straub d4straub reopened this Jul 2, 2020
@jfourquet2
Author

Thanks for your answer!
My command line was:

nextflow run nf-core/mag -profile genotoul --reads '../mag/*_R{1,2}.fastq.gz' --busco_reference '/work2/genphyse/NED_metaG/src/mag/bacteria_odb9.tar.gz' --kraken2_db '/work2/genphyse/NED_metaG/src/mag/minikraken2_v2_8GB_201904_UPDATE.tgz' --centrifuge_db '/work2/genphyse/NED_metaG/src/mag/p_compressed+h+v.tar.gz' --cat_db '/work2/genphyse/NED_metaG/src/mag/CAT_prepare_20190108.tar.gz' -with-timeline -with-trace -with-report -with-dag -r dev

And software_versions.csv lists all the tools and their versions (including the mag dev version):

nf-core/mag	v1.1.0dev
Nextflow	v19.04.0
MultiQC	v1.9
FastQC	v0.11.8
fastp	v0.20.0
MEGAHIT	v1.2.7
Metabat	v2:2.15
NanoPlot	v1.26.3
Filtlong	v0.2.0
Porechop	v0.2.3_seqan2.1.1

@d4straub
Collaborator

d4straub commented Jul 2, 2020

Thanks, and sorry, I was not clear. What I meant was the "revision" info, shown here:

N E X T F L O W  ~  version 20.01.0
Launching `d4straub/mag` [spontaneous_hopper] - revision: f0206789cc [d4straub-v0.2-exportDepth]
----------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/mag v1.0.0
----------------------------------------------------

or the commit hash in results/pipeline_info/execution_report.html:

Workflow repository
    https://github.com/d4straub/mag.git, revision d4straub-v0.2-exportDepth (commit hash f0206789cccf699e69652f0223e2148f7c1f6ea7)

This is to identify which version/revision/commit of dev you used, so that we can judge whether this might have been fixed already.
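
If you keep the console output of a run, the revision can also be pulled out of the launch line programmatically. The sketch below is based only on the single example line shown above, so the pattern is an assumption about that format rather than a documented Nextflow interface, and the function name is made up.

```python
import re
from typing import Dict, Optional

# Pattern derived from the one launch line quoted in this thread (assumption).
LAUNCH_RE = re.compile(
    r"Launching `(?P<project>[^`]+)` "
    r"\[(?P<run_name>[^\]]+)\] - revision: "
    r"(?P<commit>[0-9a-f]+)"
    r"(?: \[(?P<revision>[^\]]+)\])?"
)


def parse_launch_line(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Extract project, run name, short commit hash and revision name."""
    match = LAUNCH_RE.search(line)
    return match.groupdict() if match else None


example = (
    "Launching `d4straub/mag` [spontaneous_hopper] - "
    "revision: f0206789cc [d4straub-v0.2-exportDepth]"
)
print(parse_launch_line(example))
```

The short hash from the launch line is a prefix of the full commit hash reported in execution_report.html, so either one identifies the commit.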

@jfourquet2
Author

It was very clear, but I didn't know how to find the commit hash. Thanks for your answer!
Workflow repository
https://github.com/nf-core/mag.git, revision dev (commit hash 973f8d4)

@skrakau
Member

skrakau commented Jul 2, 2020

Hi, the problem regarding the *.lowDepth.fa file was introduced with the new MetaBAT2 version 2.15, and was solved in commit 60cd556.

@jfourquet2
Author

OK, thanks a lot!

@skrakau
Member

skrakau commented Jul 3, 2020

Thanks for reporting, @jfourquet2. I will close this; let us know if any problem remains.

@skrakau skrakau closed this as completed Jul 3, 2020