Cannot reach https://busco-data.ezlab.org/v5/data/file_versions.tsv #333

Closed
ChristophKnapp opened this issue Aug 25, 2022 · 26 comments
Labels
bug Something isn't working

Comments

Description of the bug

Hello,
When I start nf-core/mag, it runs for some time and then stops with:

ERROR: BUSCO analysis failed for some unknown reason! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err.

See the attached error log. BUSCO already has a fixed issue for this problem (https://gitlab.com/ezlab/busco/-/issues/567), which is why I am posting here first. Tell me to go away if you think they should reopen that issue.
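Since the pipeline only reports "some unknown reason", it can help to sort the `.err` file into a known category first. The following is a minimal sketch, not part of nf-core/mag: `classify_busco_err` is a hypothetical helper, the "Cannot reach" pattern is taken from this issue's title and is assumed to appear in the `.err` file, and the other two patterns are the ones the pipeline script greps for.

```shell
# Hypothetical helper (not part of nf-core/mag or BUSCO):
# print a short category for a BUSCO stderr file based on the first known pattern found.
classify_busco_err() {
    local err_file="$1"
    if [ ! -r "$err_file" ]; then
        # File does not exist or cannot be read
        echo "missing"
    elif grep -q 'Cannot reach' "$err_file"; then
        # Assumed wording, taken from the title of this issue
        echo "network"
    elif grep -q 'No genes were recognized by BUSCO' "$err_file"; then
        echo "no_genes"
    elif grep -q 'Placements failed' "$err_file"; then
        echo "placement"
    else
        echo "unknown"
    fi
}
```

It would be called from the failed task's work directory, for example `classify_busco_err MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err`.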

I tried to access https://busco-data.ezlab.org/v5/data/file_versions.tsv with wget and curl and had no problem downloading it from the machine this runs on. Therefore I don't think this is a firewall issue of some sort, but I could be wrong; after all, I don't know exactly how BUSCO tries to fetch the file.

At first I also thought this might be just an internet hiccup, so I resumed the analysis after confirming that I could download the file. That did not help: the error occurs every time I resume.
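One way to tell a transient hiccup from a persistent failure is to repeat the download check a few times with a pause in between. This is only a sketch under assumptions: `retry` is a hypothetical helper, and the attempt count and delay are arbitrary choices.

```shell
# Hypothetical helper: run a command up to <attempts> times,
# sleeping <delay> seconds between failed attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Only sleep if another attempt follows
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

For example, `retry 5 10 curl --fail --silent --show-error --output /dev/null https://busco-data.ezlab.org/v5/data/file_versions.tsv` would try the download five times, ten seconds apart. If this succeeds every time while BUSCO still fails on resume, the problem is probably not a flaky connection on the host itself.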

Thanks for your help

Christoph

Command used and terminal output

nextflow run nf-core/mag -profile conda --input '../data/*_R{1,2}.fastq.gz' --outdir results -r fix-convert-depths-gzip -resume
N E X T F L O W  ~  version 22.04.5
Launching `https://github.com/nf-core/mag` [elated_stonebraker] DSL2 - revision: 1b4456d542 [fix-convert-depths-gzip]


------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/mag v2.3.0dev
------------------------------------------------------
Core Nextflow options
  revision        : fix-convert-depths-gzip
  runName         : elated_stonebraker
  launchDir       : /media/NGS/nf-core-workflow
  workDir         : /media/NGS/nf-core-workflow/work
  projectDir      : /home/hummelchen/.nextflow/assets/nf-core/mag
  userName        : hummelchen
  profile         : conda
  configFiles     : /home/hummelchen/.nextflow/assets/nf-core/mag/nextflow.config

Input/output options
  input           : ../data/*_R{1,2}.fastq.gz
  outdir          : results

Generic options
  enable_conda    : true

Quality control for short reads options
  phix_reference  : /home/hummelchen/.nextflow/assets/nf-core/mag/assets/data/GCA_002596845.1_ASM259684v1_genomic.fna.gz

Quality control for long reads options
  lambda_reference: /home/hummelchen/.nextflow/assets/nf-core/mag/assets/data/GCA_000840245.1_ViralProj14204_genomic.fna.gz

Taxonomic profiling options
  gtdb            : https://data.ace.uq.edu.au/public/gtdb/data/releases/release202/202.0/auxillary_files/gtdbtk_r202_data.tar.gz

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use nf-core/mag for your analysis please cite:

* The pipeline publication
  https://doi.org/10.1093/nargab/lqac007

* The pipeline
  https://doi.org/10.5281/zenodo.3589527

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

* Software dependencies
  https://github.com/nf-core/mag/blob/master/CITATIONS.md
------------------------------------------------------
executor >  local (7)
[47/9be65c] process > NFCORE_MAG:MAG:FASTQC_RAW (NG-30689_QN1_4_3_lib613328_10075_2)                                                                            [100%] 1 of 1, cached: 1 ✔
[3d/396de6] process > NFCORE_MAG:MAG:FASTP (NG-30689_QN1_4_3_lib613328_10075_2)                                                                                 [100%] 1 of 1, cached: 1 ✔
[ac/adbb55] process > NFCORE_MAG:MAG:BOWTIE2_PHIX_REMOVAL_BUILD (GCA_002596845.1_ASM259684v1_genomic.fna.gz)                                                    [100%] 1 of 1, cached: 1 ✔
[16/88f95a] process > NFCORE_MAG:MAG:BOWTIE2_PHIX_REMOVAL_ALIGN (NG-30689_QN1_4_3_lib613328_10075_2)                                                            [100%] 1 of 1, cached: 1 ✔
[9b/ec6fb9] process > NFCORE_MAG:MAG:FASTQC_TRIMMED (NG-30689_QN1_4_3_lib613328_10075_2)                                                                        [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:NANOPLOT_RAW                                                                                                               -
[-        ] process > NFCORE_MAG:MAG:PORECHOP                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:NANOLYSE                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:FILTLONG                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:NANOPLOT_FILTERED                                                                                                          -
[-        ] process > NFCORE_MAG:MAG:CENTRIFUGE_DB_PREPARATION                                                                                                  -
[-        ] process > NFCORE_MAG:MAG:CENTRIFUGE                                                                                                                 -
[-        ] process > NFCORE_MAG:MAG:KRAKEN2_DB_PREPARATION                                                                                                     -
[-        ] process > NFCORE_MAG:MAG:KRAKEN2                                                                                                                    -
[37/8a2ffc] process > NFCORE_MAG:MAG:MEGAHIT (NG-30689_QN1_4_3_lib613328_10075_2)                                                                               [100%] 1 of 1, cached: 1 ✔
[8a/bf0dd1] process > NFCORE_MAG:MAG:SPADES (NG-30689_QN1_4_3_lib613328_10075_2)                                                                                [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:SPADESHYBRID                                                                                                               -
[3c/1903eb] process > NFCORE_MAG:MAG:QUAST (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                                                         [100%] 2 of 2, cached: 2 ✔
[6b/450699] process > NFCORE_MAG:MAG:PRODIGAL (NG-30689_QN1_4_3_lib613328_10075_2)                                                                              [100%] 2 of 2, cached: 2 ✔
[bd/0fff10] process > NFCORE_MAG:MAG:BINNING_PREPARATION:BOWTIE2_ASSEMBLY_BUILD (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                    [100%] 2 of 2, cached: 2 ✔
[ff/266e2f] process > NFCORE_MAG:MAG:BINNING_PREPARATION:BOWTIE2_ASSEMBLY_ALIGN (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2-NG-30689_QN1_4_3_lib613328_10075_2) [100%] 2 of 2, cached: 2 ✔
[cd/528041] process > NFCORE_MAG:MAG:BINNING:METABAT2_JGISUMMARIZEBAMCONTIGDEPTHS (NG-30689_QN1_4_3_lib613328_10075_2)                                          [100%] 2 of 2, cached: 2 ✔
[e7/d37f31] process > NFCORE_MAG:MAG:BINNING:CONVERT_DEPTHS (NG-30689_QN1_4_3_lib613328_10075_2)                                                                [100%] 2 of 2, cached: 2 ✔
[87/0a8ee1] process > NFCORE_MAG:MAG:BINNING:METABAT2_METABAT2 (NG-30689_QN1_4_3_lib613328_10075_2)                                                             [100%] 2 of 2, cached: 2 ✔
[54/c0b9eb] process > NFCORE_MAG:MAG:BINNING:MAXBIN2 (NG-30689_QN1_4_3_lib613328_10075_2)                                                                       [100%] 2 of 2, cached: 2 ✔
[2f/482f33] process > NFCORE_MAG:MAG:BINNING:ADJUST_MAXBIN2_EXT (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                                    [100%] 2 of 2, cached: 2 ✔
[f9/a7820f] process > NFCORE_MAG:MAG:BINNING:SPLIT_FASTA (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                   [100%] 4 of 4, cached: 4 ✔
[af/95fa5c] process > NFCORE_MAG:MAG:BINNING:GUNZIP_BINS (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.023.fa.gz)                                         [100%] 106 of 106, cached: 106 ✔
[-        ] process > NFCORE_MAG:MAG:BINNING:GUNZIP_UNBINS                                                                                                      -
[b3/6f4c47] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                    [100%] 4 of 4, cached: 4 ✔
[-        ] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS_PLOT                                                                                                    -
[2f/5d9547] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS_SUMMARY                                                                                                 [100%] 1 of 1, cached: 1 ✔
[0c/ee7390] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.13.fa)                                                 [  0%] 0 of 106
[-        ] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO_PLOT                                                                                                        -
[-        ] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO_SUMMARY                                                                                                     -
[1f/0f9fa9] process > NFCORE_MAG:MAG:QUAST_BINS (SPAdes-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                             [100%] 4 of 4, cached: 4 ✔
[04/96c715] process > NFCORE_MAG:MAG:QUAST_BINS_SUMMARY                                                                                                         [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:CAT                                                                                                                        -
[d3/503ebe] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_DB_PREPARATION (gtdbtk_r202_data.tar.gz)                                                                     [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFY                                                                                                     -
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_SUMMARY                                                                                                      -
[-        ] process > NFCORE_MAG:MAG:BIN_SUMMARY                                                                                                                -
[37/3b84e6] process > NFCORE_MAG:MAG:PROKKA (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.017)                                                            [ 94%] 100 of 106, cached: 100
[-        ] process > NFCORE_MAG:MAG:CUSTOM_DUMPSOFTWAREVERSIONS                                                                                                -
[-        ] process > NFCORE_MAG:MAG:MULTIQC                                                                                                                    -
Error executing process > 'NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa)'

Caused by:
  Process `NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa)` terminated with an error exit status (1)

Command executed:

  # ensure augustus has write access to config directory
  if [ N = "Y" ] ; then
      cp -r /usr/local/config/ augustus_config/
      export AUGUSTUS_CONFIG_PATH=augustus_config
  fi
  
  # place db in extra folder to ensure BUSCO recognizes it as path (instead of downloading it)
  if [ N = "Y" ] ; then
      mkdir dataset
      mv  dataset/
  fi
  
  # set nullgob: if pattern matches no files, expand to a null string rather than to itself
  shopt -s nullglob
  
  # only used for saving busco downloads
  most_spec_db="NA"
  
  if busco --auto-lineage         --mode genome         --in MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa         --cpu "8"         --out "BUSCO" > MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log 2> MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err; then
  
      # get name of used specific lineage dataset
      summaries=(BUSCO/short_summary.specific.*.BUSCO.txt)
      if [ ${#summaries[@]} -ne 1 ]; then
          echo "ERROR: none or multiple 'BUSCO/short_summary.specific.*.BUSCO.txt' files found. Expected one."
          exit 1
      fi
      [[ $summaries =~ BUSCO/short_summary.specific.(.*).BUSCO.txt ]];
      db_name_spec="${BASH_REMATCH[1]}"
      most_spec_db=${db_name_spec}
      echo "Used specific lineage dataset: ${db_name_spec}"
  
      if [ N = "Y" ]; then
          cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
          # if lineage dataset is provided, BUSCO analysis does not fail in case no genes can be found as when using the auto selection setting
          # report bin as failed to allow consistent warnings within the pipeline for both settings
          if egrep -q $'WARNING:	BUSCO did not find any match.' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              echo "WARNING: BUSCO could not find any genes for the provided lineage dataset! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log."
              echo -e "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa	No genes" > "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.failed_bin.txt"
          fi
      else
          # auto lineage selection
          if { egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	Lineage \S+ is selected, supported by ' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; } ||                 { egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	The results from the Prodigal gene predictor indicate that your data belongs to the mollicutes clade. Testing subclades...' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	Using local lineages directory ' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; }; then
              # the second statement is necessary, because certain mollicute clades use a different genetic code, are not part of the BUSCO placement tree, are tested separately
              # and cause different log messages
              echo "Domain and specific lineage could be selected by BUSCO."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
              db_name_gen=""
              summaries_gen=(BUSCO/short_summary.generic.*.BUSCO.txt)
              if [ ${#summaries_gen[@]} -lt 1 ]; then
                  echo "No 'BUSCO/short_summary.generic.*.BUSCO.txt' file found. Assuming selected domain and specific lineages are the same."
                  cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.domain.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
                  db_name_gen=${db_name_spec}
              else
                  [[ $summaries_gen =~ BUSCO/short_summary.generic.(.*).BUSCO.txt ]];
                  db_name_gen="${BASH_REMATCH[1]}"
                  echo "Used generic lineage dataset: ${db_name_gen}"
                  cp BUSCO/short_summary.generic.${db_name_gen}.BUSCO.txt short_summary.domain.${db_name_gen}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
              fi
  
              for f in BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa; do
                  cat BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.faa.gz
                  break
              done
              for f in BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna; do
                  cat BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.fna.gz
                  break
              done
  
          elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'INFO:	Not enough markers were placed on the tree \([0-9]*\). Root lineage \S+ is kept' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              echo "Domain could be selected by BUSCO, but no more specific lineage."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.domain.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
          elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'INFO:	Running virus detection pipeline' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              # TODO double-check if selected dataset is not one of bacteria_*, archaea_*, eukaryota_*?
              echo "Domain could not be selected by BUSCO, but virus dataset was selected."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
          else
              echo "ERROR: Some not expected case occurred! See MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log." >&2
              exit 1
          fi
      fi
  
      for f in BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*faa; do
          cat BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*faa | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_spec}.faa.gz
          break
      done
      for f in BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*fna; do
          cat BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*fna | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_spec}.fna.gz
          break
      done
  
  elif egrep -q $'ERROR:	No genes were recognized by BUSCO' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err ; then
      echo "WARNING: BUSCO analysis failed due to no recognized genes! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err."
      echo -e "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa	No genes" > "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.failed_bin.txt"
  
  elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'ERROR:	Placements failed' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err ; then
executor >  local (7)
[47/9be65c] process > NFCORE_MAG:MAG:FASTQC_RAW (NG-30689_QN1_4_3_lib613328_10075_2)                                                                            [100%] 1 of 1, cached: 1 ✔
[3d/396de6] process > NFCORE_MAG:MAG:FASTP (NG-30689_QN1_4_3_lib613328_10075_2)                                                                                 [100%] 1 of 1, cached: 1 ✔
[ac/adbb55] process > NFCORE_MAG:MAG:BOWTIE2_PHIX_REMOVAL_BUILD (GCA_002596845.1_ASM259684v1_genomic.fna.gz)                                                    [100%] 1 of 1, cached: 1 ✔
[16/88f95a] process > NFCORE_MAG:MAG:BOWTIE2_PHIX_REMOVAL_ALIGN (NG-30689_QN1_4_3_lib613328_10075_2)                                                            [100%] 1 of 1, cached: 1 ✔
[9b/ec6fb9] process > NFCORE_MAG:MAG:FASTQC_TRIMMED (NG-30689_QN1_4_3_lib613328_10075_2)                                                                        [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:NANOPLOT_RAW                                                                                                               -
[-        ] process > NFCORE_MAG:MAG:PORECHOP                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:NANOLYSE                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:FILTLONG                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:NANOPLOT_FILTERED                                                                                                          -
[-        ] process > NFCORE_MAG:MAG:CENTRIFUGE_DB_PREPARATION                                                                                                  -
[-        ] process > NFCORE_MAG:MAG:CENTRIFUGE                                                                                                                 -
[-        ] process > NFCORE_MAG:MAG:KRAKEN2_DB_PREPARATION                                                                                                     -
[-        ] process > NFCORE_MAG:MAG:KRAKEN2                                                                                                                    -
[37/8a2ffc] process > NFCORE_MAG:MAG:MEGAHIT (NG-30689_QN1_4_3_lib613328_10075_2)                                                                               [100%] 1 of 1, cached: 1 ✔
[8a/bf0dd1] process > NFCORE_MAG:MAG:SPADES (NG-30689_QN1_4_3_lib613328_10075_2)                                                                                [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:SPADESHYBRID                                                                                                               -
[3c/1903eb] process > NFCORE_MAG:MAG:QUAST (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                                                         [100%] 2 of 2, cached: 2 ✔
[6b/450699] process > NFCORE_MAG:MAG:PRODIGAL (NG-30689_QN1_4_3_lib613328_10075_2)                                                                              [100%] 2 of 2, cached: 2 ✔
[bd/0fff10] process > NFCORE_MAG:MAG:BINNING_PREPARATION:BOWTIE2_ASSEMBLY_BUILD (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                    [100%] 2 of 2, cached: 2 ✔
[ff/266e2f] process > NFCORE_MAG:MAG:BINNING_PREPARATION:BOWTIE2_ASSEMBLY_ALIGN (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2-NG-30689_QN1_4_3_lib613328_10075_2) [100%] 2 of 2, cached: 2 ✔
[cd/528041] process > NFCORE_MAG:MAG:BINNING:METABAT2_JGISUMMARIZEBAMCONTIGDEPTHS (NG-30689_QN1_4_3_lib613328_10075_2)                                          [100%] 2 of 2, cached: 2 ✔
[e7/d37f31] process > NFCORE_MAG:MAG:BINNING:CONVERT_DEPTHS (NG-30689_QN1_4_3_lib613328_10075_2)                                                                [100%] 2 of 2, cached: 2 ✔
[87/0a8ee1] process > NFCORE_MAG:MAG:BINNING:METABAT2_METABAT2 (NG-30689_QN1_4_3_lib613328_10075_2)                                                             [100%] 2 of 2, cached: 2 ✔
[54/c0b9eb] process > NFCORE_MAG:MAG:BINNING:MAXBIN2 (NG-30689_QN1_4_3_lib613328_10075_2)                                                                       [100%] 2 of 2, cached: 2 ✔
[2f/482f33] process > NFCORE_MAG:MAG:BINNING:ADJUST_MAXBIN2_EXT (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                                    [100%] 2 of 2, cached: 2 ✔
[f9/a7820f] process > NFCORE_MAG:MAG:BINNING:SPLIT_FASTA (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                   [100%] 4 of 4, cached: 4 ✔
[af/95fa5c] process > NFCORE_MAG:MAG:BINNING:GUNZIP_BINS (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.023.fa.gz)                                         [100%] 106 of 106, cached: 106 ✔
[-        ] process > NFCORE_MAG:MAG:BINNING:GUNZIP_UNBINS                                                                                                      -
[b3/6f4c47] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                    [100%] 4 of 4, cached: 4 ✔
[-        ] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS_PLOT                                                                                                    -
[2f/5d9547] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS_SUMMARY                                                                                                 [100%] 1 of 1, cached: 1 ✔
[c2/76795c] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa)                                                 [  0%] 1 of 106, failed: 1
[-        ] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO_PLOT                                                                                                        -
[-        ] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO_SUMMARY                                                                                                     -
[1f/0f9fa9] process > NFCORE_MAG:MAG:QUAST_BINS (SPAdes-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                             [100%] 4 of 4, cached: 4 ✔
[04/96c715] process > NFCORE_MAG:MAG:QUAST_BINS_SUMMARY                                                                                                         [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:CAT                                                                                                                        -
[d3/503ebe] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_DB_PREPARATION (gtdbtk_r202_data.tar.gz)                                                                     [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFY                                                                                                     -
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_SUMMARY                                                                                                      -
[-        ] process > NFCORE_MAG:MAG:BIN_SUMMARY                                                                                                                -
[37/3b84e6] process > NFCORE_MAG:MAG:PROKKA (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.017)                                                            [ 94%] 100 of 106, cached: 100
[-        ] process > NFCORE_MAG:MAG:CUSTOM_DUMPSOFTWAREVERSIONS                                                                                                -
[-        ] process > NFCORE_MAG:MAG:MULTIQC                                                                                                                    -
Execution cancelled -- Finishing pending tasks before exit
Error executing process > 'NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa)'

Caused by:
  Process `NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa)` terminated with an error exit status (1)

Command executed:

  # ensure augustus has write access to config directory
  if [ N = "Y" ] ; then
      cp -r /usr/local/config/ augustus_config/
      export AUGUSTUS_CONFIG_PATH=augustus_config
  fi
  
  # place db in extra folder to ensure BUSCO recognizes it as path (instead of downloading it)
  if [ N = "Y" ] ; then
      mkdir dataset
      mv  dataset/
  fi
  
  # set nullgob: if pattern matches no files, expand to a null string rather than to itself
  shopt -s nullglob
  
  # only used for saving busco downloads
  most_spec_db="NA"
  
  if busco --auto-lineage         --mode genome         --in MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa         --cpu "8"         --out "BUSCO" > MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log 2> MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err; then
  
      # get name of used specific lineage dataset
      summaries=(BUSCO/short_summary.specific.*.BUSCO.txt)
      if [ ${#summaries[@]} -ne 1 ]; then
          echo "ERROR: none or multiple 'BUSCO/short_summary.specific.*.BUSCO.txt' files found. Expected one."
          exit 1
      fi
      [[ $summaries =~ BUSCO/short_summary.specific.(.*).BUSCO.txt ]];
      db_name_spec="${BASH_REMATCH[1]}"
      most_spec_db=${db_name_spec}
      echo "Used specific lineage dataset: ${db_name_spec}"
  
      if [ N = "Y" ]; then
          cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
          # if lineage dataset is provided, BUSCO analysis does not fail in case no genes can be found as when using the auto selection setting
          # report bin as failed to allow consistent warnings within the pipeline for both settings
          if egrep -q $'WARNING:	BUSCO did not find any match.' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              echo "WARNING: BUSCO could not find any genes for the provided lineage dataset! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log."
              echo -e "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa	No genes" > "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.failed_bin.txt"
          fi
      else
          # auto lineage selection
          if { egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	Lineage \S+ is selected, supported by ' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; } ||                 { egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	The results from the Prodigal gene predictor indicate that your data belongs to the mollicutes clade. Testing subclades...' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	Using local lineages directory ' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; }; then
              # the second statement is necessary, because certain mollicute clades use a different genetic code, are not part of the BUSCO placement tree, are tested separately
              # and cause different log messages
              echo "Domain and specific lineage could be selected by BUSCO."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
              db_name_gen=""
              summaries_gen=(BUSCO/short_summary.generic.*.BUSCO.txt)
              if [ ${#summaries_gen[@]} -lt 1 ]; then
                  echo "No 'BUSCO/short_summary.generic.*.BUSCO.txt' file found. Assuming selected domain and specific lineages are the same."
                  cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.domain.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
                  db_name_gen=${db_name_spec}
              else
                  [[ $summaries_gen =~ BUSCO/short_summary.generic.(.*).BUSCO.txt ]];
                  db_name_gen="${BASH_REMATCH[1]}"
                  echo "Used generic lineage dataset: ${db_name_gen}"
                  cp BUSCO/short_summary.generic.${db_name_gen}.BUSCO.txt short_summary.domain.${db_name_gen}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
              fi
  
              for f in BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa; do
                  cat BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.faa.gz
                  break
              done
              for f in BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna; do
                  cat BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.fna.gz
                  break
              done
  
          elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'INFO:	Not enough markers were placed on the tree \([0-9]*\). Root lineage \S+ is kept' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              echo "Domain could be selected by BUSCO, but no more specific lineage."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.domain.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
          elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'INFO:	Running virus detection pipeline' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              # TODO double-check if selected dataset is not one of bacteria_*, archaea_*, eukaryota_*?
              echo "Domain could not be selected by BUSCO, but virus dataset was selected."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
          else
              echo "ERROR: Some not expected case occurred! See MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log." >&2
              exit 1
          fi
      fi
  
      for f in BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*faa; do
          cat BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*faa | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_spec}.faa.gz
          break
      done
      for f in BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*fna; do
          cat BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*fna | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_spec}.fna.gz
          break
      done
  
  elif egrep -q $'ERROR:	No genes were recognized by BUSCO' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err ; then
      echo "WARNING: BUSCO analysis failed due to no recognized genes! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err."
      echo -e "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa	No genes" > "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.failed_bin.txt"
  
  elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'ERROR:	Placements failed' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err ; then
      echo "WARNING: BUSCO analysis failed due to failed placements! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err. Still using results for selected generic lineage dataset."
      echo -e "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa	Placements failed" > "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.failed_bin.txt"
  
      message=$(egrep $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log)
      [[ $message =~ INFO:[[:space:]]([_[:alnum:]]+)[[:space:]]selected ]];
      db_name_gen="${BASH_REMATCH[1]}"
      most_spec_db=${db_name_gen}
      echo "Used generic lineage dataset: ${db_name_gen}"
      cp BUSCO/auto_lineage/run_${db_name_gen}/short_summary.txt short_summary.domain.${db_name_gen}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
      for f in BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa; do
          cat BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.faa.gz
          break
      done
      for f in BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna; do
          cat BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.fna.gz
          break
executor >  local (7)
[47/9be65c] process > NFCORE_MAG:MAG:FASTQC_RAW (NG-30689_QN1_4_3_lib613328_10075_2)                                                                            [100%] 1 of 1, cached: 1 ✔
[3d/396de6] process > NFCORE_MAG:MAG:FASTP (NG-30689_QN1_4_3_lib613328_10075_2)                                                                                 [100%] 1 of 1, cached: 1 ✔
[ac/adbb55] process > NFCORE_MAG:MAG:BOWTIE2_PHIX_REMOVAL_BUILD (GCA_002596845.1_ASM259684v1_genomic.fna.gz)                                                    [100%] 1 of 1, cached: 1 ✔
[16/88f95a] process > NFCORE_MAG:MAG:BOWTIE2_PHIX_REMOVAL_ALIGN (NG-30689_QN1_4_3_lib613328_10075_2)                                                            [100%] 1 of 1, cached: 1 ✔
[9b/ec6fb9] process > NFCORE_MAG:MAG:FASTQC_TRIMMED (NG-30689_QN1_4_3_lib613328_10075_2)                                                                        [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:NANOPLOT_RAW                                                                                                               -
[-        ] process > NFCORE_MAG:MAG:PORECHOP                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:NANOLYSE                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:FILTLONG                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:NANOPLOT_FILTERED                                                                                                          -
[-        ] process > NFCORE_MAG:MAG:CENTRIFUGE_DB_PREPARATION                                                                                                  -
[-        ] process > NFCORE_MAG:MAG:CENTRIFUGE                                                                                                                 -
[-        ] process > NFCORE_MAG:MAG:KRAKEN2_DB_PREPARATION                                                                                                     -
[-        ] process > NFCORE_MAG:MAG:KRAKEN2                                                                                                                    -
[37/8a2ffc] process > NFCORE_MAG:MAG:MEGAHIT (NG-30689_QN1_4_3_lib613328_10075_2)                                                                               [100%] 1 of 1, cached: 1 ✔
[8a/bf0dd1] process > NFCORE_MAG:MAG:SPADES (NG-30689_QN1_4_3_lib613328_10075_2)                                                                                [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:SPADESHYBRID                                                                                                               -
[3c/1903eb] process > NFCORE_MAG:MAG:QUAST (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                                                         [100%] 2 of 2, cached: 2 ✔
[6b/450699] process > NFCORE_MAG:MAG:PRODIGAL (NG-30689_QN1_4_3_lib613328_10075_2)                                                                              [100%] 2 of 2, cached: 2 ✔
[bd/0fff10] process > NFCORE_MAG:MAG:BINNING_PREPARATION:BOWTIE2_ASSEMBLY_BUILD (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                    [100%] 2 of 2, cached: 2 ✔
[ff/266e2f] process > NFCORE_MAG:MAG:BINNING_PREPARATION:BOWTIE2_ASSEMBLY_ALIGN (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2-NG-30689_QN1_4_3_lib613328_10075_2) [100%] 2 of 2, cached: 2 ✔
[cd/528041] process > NFCORE_MAG:MAG:BINNING:METABAT2_JGISUMMARIZEBAMCONTIGDEPTHS (NG-30689_QN1_4_3_lib613328_10075_2)                                          [100%] 2 of 2, cached: 2 ✔
[e7/d37f31] process > NFCORE_MAG:MAG:BINNING:CONVERT_DEPTHS (NG-30689_QN1_4_3_lib613328_10075_2)                                                                [100%] 2 of 2, cached: 2 ✔
[87/0a8ee1] process > NFCORE_MAG:MAG:BINNING:METABAT2_METABAT2 (NG-30689_QN1_4_3_lib613328_10075_2)                                                             [100%] 2 of 2, cached: 2 ✔
[54/c0b9eb] process > NFCORE_MAG:MAG:BINNING:MAXBIN2 (NG-30689_QN1_4_3_lib613328_10075_2)                                                                       [100%] 2 of 2, cached: 2 ✔
[2f/482f33] process > NFCORE_MAG:MAG:BINNING:ADJUST_MAXBIN2_EXT (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                                    [100%] 2 of 2, cached: 2 ✔
[f9/a7820f] process > NFCORE_MAG:MAG:BINNING:SPLIT_FASTA (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                   [100%] 4 of 4, cached: 4 ✔
[af/95fa5c] process > NFCORE_MAG:MAG:BINNING:GUNZIP_BINS (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.023.fa.gz)                                         [100%] 106 of 106, cached: 106 ✔
[-        ] process > NFCORE_MAG:MAG:BINNING:GUNZIP_UNBINS                                                                                                      -
[b3/6f4c47] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                    [100%] 4 of 4, cached: 4 ✔
[-        ] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS_PLOT                                                                                                    -
[2f/5d9547] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS_SUMMARY                                                                                                 [100%] 1 of 1, cached: 1 ✔
[0c/ee7390] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.13.fa)                                                 [  1%] 1 of 100, failed: 1
[-        ] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO_PLOT                                                                                                        -
[-        ] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO_SUMMARY                                                                                                     -
[1f/0f9fa9] process > NFCORE_MAG:MAG:QUAST_BINS (SPAdes-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                             [100%] 4 of 4, cached: 4 ✔
[04/96c715] process > NFCORE_MAG:MAG:QUAST_BINS_SUMMARY                                                                                                         [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:CAT                                                                                                                        -
[d3/503ebe] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_DB_PREPARATION (gtdbtk_r202_data.tar.gz)                                                                     [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFY                                                                                                     -
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_SUMMARY                                                                                                      -
[-        ] process > NFCORE_MAG:MAG:BIN_SUMMARY                                                                                                                -
[37/3b84e6] process > NFCORE_MAG:MAG:PROKKA (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.017)                                                            [ 94%] 100 of 106, cached: 100
[-        ] process > NFCORE_MAG:MAG:CUSTOM_DUMPSOFTWAREVERSIONS                                                                                                -
[-        ] process > NFCORE_MAG:MAG:MULTIQC                                                                                                                    -
Execution cancelled -- Finishing pending tasks before exit
-[nf-core/mag] Pipeline completed with errors-
WARN: Killing running tasks (6)
Error executing process > 'NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa)'

Caused by:
  Process `NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa)` terminated with an error exit status (1)

Command executed:

  # ensure augustus has write access to config directory
  if [ N = "Y" ] ; then
      cp -r /usr/local/config/ augustus_config/
      export AUGUSTUS_CONFIG_PATH=augustus_config
  fi
  
  # place db in extra folder to ensure BUSCO recognizes it as path (instead of downloading it)
  if [ N = "Y" ] ; then
      mkdir dataset
      mv  dataset/
  fi
  
  # set nullgob: if pattern matches no files, expand to a null string rather than to itself
  shopt -s nullglob
  
  # only used for saving busco downloads
  most_spec_db="NA"
  
  if busco --auto-lineage         --mode genome         --in MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa         --cpu "8"         --out "BUSCO" > MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log 2> MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err; then
  
      # get name of used specific lineage dataset
      summaries=(BUSCO/short_summary.specific.*.BUSCO.txt)
      if [ ${#summaries[@]} -ne 1 ]; then
          echo "ERROR: none or multiple 'BUSCO/short_summary.specific.*.BUSCO.txt' files found. Expected one."
          exit 1
      fi
      [[ $summaries =~ BUSCO/short_summary.specific.(.*).BUSCO.txt ]];
      db_name_spec="${BASH_REMATCH[1]}"
      most_spec_db=${db_name_spec}
      echo "Used specific lineage dataset: ${db_name_spec}"
  
      if [ N = "Y" ]; then
          cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
          # if lineage dataset is provided, BUSCO analysis does not fail in case no genes can be found as when using the auto selection setting
          # report bin as failed to allow consistent warnings within the pipeline for both settings
          if egrep -q $'WARNING:	BUSCO did not find any match.' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              echo "WARNING: BUSCO could not find any genes for the provided lineage dataset! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log."
              echo -e "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa	No genes" > "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.failed_bin.txt"
          fi
      else
          # auto lineage selection
          if { egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	Lineage \S+ is selected, supported by ' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; } ||                 { egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	The results from the Prodigal gene predictor indicate that your data belongs to the mollicutes clade. Testing subclades...' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	Using local lineages directory ' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; }; then
              # the second statement is necessary, because certain mollicute clades use a different genetic code, are not part of the BUSCO placement tree, are tested separately
              # and cause different log messages
              echo "Domain and specific lineage could be selected by BUSCO."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
              db_name_gen=""
              summaries_gen=(BUSCO/short_summary.generic.*.BUSCO.txt)
              if [ ${#summaries_gen[@]} -lt 1 ]; then
                  echo "No 'BUSCO/short_summary.generic.*.BUSCO.txt' file found. Assuming selected domain and specific lineages are the same."
                  cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.domain.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
                  db_name_gen=${db_name_spec}
              else
                  [[ $summaries_gen =~ BUSCO/short_summary.generic.(.*).BUSCO.txt ]];
                  db_name_gen="${BASH_REMATCH[1]}"
                  echo "Used generic lineage dataset: ${db_name_gen}"
                  cp BUSCO/short_summary.generic.${db_name_gen}.BUSCO.txt short_summary.domain.${db_name_gen}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
              fi
  
              for f in BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa; do
                  cat BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.faa.gz
                  break
              done
              for f in BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna; do
                  cat BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.fna.gz
                  break
              done
  
          elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'INFO:	Not enough markers were placed on the tree \([0-9]*\). Root lineage \S+ is kept' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              echo "Domain could be selected by BUSCO, but no more specific lineage."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.domain.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
          elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'INFO:	Running virus detection pipeline' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              # TODO double-check if selected dataset is not one of bacteria_*, archaea_*, eukaryota_*?
              echo "Domain could not be selected by BUSCO, but virus dataset was selected."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
          else
              echo "ERROR: Some not expected case occurred! See MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log." >&2
              exit 1
          fi
      fi
  
      for f in BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*faa; do
          cat BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*faa | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_spec}.faa.gz
          break
      done
      for f in BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*fna; do
          cat BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*fna | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_spec}.fna.gz
          break
      done
  
  elif egrep -q $'ERROR:	No genes were recognized by BUSCO' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err ; then
      echo "WARNING: BUSCO analysis failed due to no recognized genes! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err."
      echo -e "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa	No genes" > "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.failed_bin.txt"
  
  elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'ERROR:	Placements failed' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err ; then
      echo "WARNING: BUSCO analysis failed due to failed placements! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err. Still using results for selected generic lineage dataset."
      echo -e "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa	Placements failed" > "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.failed_bin.txt"
  
      message=$(egrep $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log)
      [[ $message =~ INFO:[[:space:]]([_[:alnum:]]+)[[:space:]]selected ]];
      db_name_gen="${BASH_REMATCH[1]}"
      most_spec_db=${db_name_gen}
      echo "Used generic lineage dataset: ${db_name_gen}"
      cp BUSCO/auto_lineage/run_${db_name_gen}/short_summary.txt short_summary.domain.${db_name_gen}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
      for f in BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa; do
          cat BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.faa.gz
          break
      done
      for f in BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna; do
          cat BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.fna.gz
          break
      done
  
  else
      echo "ERROR: BUSCO analysis failed for some unknown reason! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err." >&2
      exit 1
  fi
  
  # additionally output genes predicted with Prodigal (GFF3)
  if [ -f BUSCO/logs/prodigal_out.log ]; then
      mv BUSCO/logs/prodigal_out.log "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_prodigal.gff"
  fi
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_MAG:MAG:BUSCO_QC:BUSCO":
      python: $(python --version 2>&1 | sed 's/Python //g')
      R: $(R --version 2>&1 | sed -n 1p | sed 's/R version //' | sed 's/ (.*//')
      busco: $(busco --version 2>&1 | sed 's/BUSCO //g')
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  ERROR: BUSCO analysis failed for some unknown reason! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err.

Work dir:
  /media/NGS/nf-core-workflow/work/c2/76795ccd4c946124b7723c02666717

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`


Join mismatch for the following entries: 
- key=SPAdes-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.19.fa values= 
- key=MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.10.fa values= 
- key=MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.012.fa values= 
- key=SPAdes-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.25.fa values= 
- key=MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa values= 
- key=SPAdes-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.004.fa values= 
- key=MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.9.fa values= 
- key=MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.003.fa values= 
- key=SPAdes-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.7.fa values= 
- key=SPAdes-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.11.fa values=
(more omitted)

Relevant files

MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log
MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err.txt

System information

N E X T F L O W ~ version 22.04.5
nf-core/mag v2.2.0
Container engine: conda
OS:
Distributor ID: Debian
Description: Debian GNU/Linux 10 (buster)
Release: 10
Codename: buster

Hardware: desktop with 128 GB RAM and 32 cores

@ChristophKnapp ChristophKnapp added the bug Something isn't working label Aug 25, 2022
@d4straub
Collaborator

This seems bad. Could you additionally try using --busco_reference or --busco_download_path? That would mean having the files locally and therefore skipping any download step.

@d4straub
Collaborator

Also, please do not use -r fix-convert-depths-gzip but -r 2.2.1 ;)

@jboktor

jboktor commented Aug 25, 2022

I have seen the same error, even when specifying either (--busco_reference "https://busco-data.ezlab.org/v5/data/lineages/bacteria_odb10.2020-03-06.tar.gz") or (--busco_download_path "path/to/bacteria_odb10").

@nayeimkhan

I am facing the same issue as well

@ChristophKnapp
Author

Also, please do not use -r fix-convert-depths-gzip but -r 2.2.1 ;)

Hi, I was told to use this flag by @jfy133 because of issue #327.

I will upgrade to the latest version, resume the analysis with the suggested flags, and report the results.

@ChristophKnapp
Author

ChristophKnapp commented Aug 26, 2022

Like @jboktor, I can confirm that using --busco_reference or --busco_download_path does not change the outcome.

@skrakau
Member

skrakau commented Aug 26, 2022

Hi, I had a similar problem recently. In my case, though, it was solvable by using -resume multiple times: it only occurred in some BUSCO processes, and the download issue did not seem to be reproducible. After a while it worked again, so I didn't dig deeper. However, I am a bit confused why the same problem occurs when using --busco_download_path, since this is used in combination with the --offline parameter.
I can have a look at this next week again.

@ChristophKnapp
Author

@skrakau I think that's because --busco_download_path refers to the directory where the BUSCO lineage files are located. It fails to retrieve https://busco-data.ezlab.org/v5/data/file_versions.tsv, which is not among the lineage files. Please correct me if I'm wrong.

Regards

@skrakau
Member

skrakau commented Aug 26, 2022

Hi @ChristophKnapp, yes, it refers to the directory that contains, among others, a folder with the lineage files, but this should (or could) also contain a file_versions.tsv file. The BUSCO user guide says one should download all files from https://busco-data.ezlab.org/v5/data/, which contains a file_versions.tsv file. (The example 'valid download folder' doesn't contain this file, though, so I guess BUSCO would then need to download it. Maybe this needs a bit more documentation for this pipeline.)

The nf-core/mag parameter --busco_download_path causes BUSCO to be run with the BUSCO parameters --offline --download_path <...>, see

p += " --offline --download_path ${download_folder}"

which should prevent BUSCO from trying to download anything. That's why I was confused that it still tries to download the file_versions.tsv file, but if the file is missing it probably makes sense that BUSCO fails.
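A minimal sketch of how one could check a download folder before an offline run. The helper name `check_busco_download_dir` is hypothetical, and the expected entries (`file_versions.tsv` and a `lineages/` subfolder) are assumptions based on the layout of https://busco-data.ezlab.org/v5/data/, not something the pipeline itself verifies:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nf-core/mag or BUSCO): report whether a
# BUSCO download folder contains the entries an --offline run is assumed
# to need. The names checked here are assumptions taken from the layout of
# https://busco-data.ezlab.org/v5/data/.
check_busco_download_dir() {
    local dir="$1"
    local status=0
    # The version index that BUSCO otherwise tries to fetch from the server
    if [ ! -f "${dir}/file_versions.tsv" ]; then
        echo "missing: ${dir}/file_versions.tsv"
        status=1
    fi
    # The folder holding the unpacked lineage datasets
    if [ ! -d "${dir}/lineages" ]; then
        echo "missing: ${dir}/lineages/"
        status=1
    fi
    return "${status}"
}
```

Calling `check_busco_download_dir /path/to/busco_downloads` prints one line per missing entry and returns non-zero if anything is absent, so it can gate the pipeline launch.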

@skrakau
Member

skrakau commented Aug 26, 2022

The question remains why the download of the file fails, so talking to the BUSCO developers might be good anyway. If you create an issue, could you link it here? Otherwise I could also do it next week.

@ChristophKnapp
Author

ChristophKnapp commented Aug 29, 2022

Otherwise I could also do it next week.

@skrakau, I would prefer it if you did it. You have more insight into what is going on and a better understanding of how BUSCO is integrated.

Thank you

Christoph

@skrakau
Member

skrakau commented Aug 31, 2022

I opened an issue: https://gitlab.com/ezlab/busco/-/issues/593

Feel free to add further details, in case I forgot something.

@skrakau
Member

skrakau commented Sep 1, 2022

Apparently there was a rate limit on the BUSCO server introduced a while ago, which probably caused problems in particular when multiple BUSCO processes were running in parallel and which explains why wget works without problems. This rate limit will be increased.
We need to check if this will be sufficient for now. So @ChristophKnapp and @nayeimkhan, let us know if this helps.

Independently of this, we should update BUSCO to version 5.4.x at some point, which contains a failsafe mechanism that reattempts a connection in case of failure.
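To illustrate the general idea of reattempting a connection after a failure, here is a minimal shell sketch. It is not BUSCO's implementation; the `retry` helper, the attempt count, and the doubling delay are all assumptions made for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not BUSCO code): run a command up to max_attempts
# times, sleeping between failed attempts and doubling the delay each time.
retry() {
    local max_attempts="$1" delay="$2"
    shift 2
    local attempt=1
    until "$@"; do
        if [ "${attempt}" -ge "${max_attempts}" ]; then
            echo "giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        sleep "${delay}"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
}

# Example (not executed here): fetch the version index that the BUSCO jobs
# failed to reach, with up to 5 attempts starting at a 2 second delay.
# retry 5 2 curl --fail --silent --show-error -O \
#     https://busco-data.ezlab.org/v5/data/file_versions.tsv
```

A growing delay is used because a rate limit is more likely to clear if parallel jobs back off rather than retry immediately.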

@nayeimkhan

nayeimkhan commented Sep 7, 2022

hi @skrakau , the fix works. Thanks!

@bmlab-sg

FYI (maybe this will help someone with a similar issue):
I ran into the same problem. I am running the pipeline with AWS Batch. I tried --busco_download_path pointing to the local folder with manually unpacked data (as instructed), and for some reason the pipeline froze (no error, just dead), showing an inactive BUSCO process:

process > NFCORE_MAG:MAG:METABAT2_BINNING:MAG_DEPTHS_SUMMARY                   [100%] 1 of 1, cached: 1✔
process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO (SPAdes-B220601001.49.fa)              -

What helped in my case was a combination of both:

  • changing container to quay.io/biocontainers/busco:5.4.3--pyhdfd78af_0 (in busco.nf)
  • providing reference --busco_reference "https://busco-data.ezlab.org/v5/data/lineages/bacteria_odb10.2020-03-06.tar.gz"

@skrakau
Member

skrakau commented Mar 2, 2023

I will close this issue, as the original download issue due to the rate limit was fixed. Feel free to open a new issue if similar issues occur again.

@bmlab-sg if your issue remains or recurs, please open a new separate issue as well.

@skrakau skrakau closed this as completed Mar 2, 2023
@amizeranschi
Contributor

Hi @skrakau and @jfy133

I've just run into this old issue now, with version 2.5.4 of the pipeline. My nf-core/mag command specifies the BUSCO DB as such:

--busco_db https://busco-data.ezlab.org/v5/data/lineages/bacteria_odb10.2024-01-08.tar.gz

It's probably a similar issue with multiple BUSCO jobs attempting to access the URL, and their server blocking new connections after a while:

[4d/3595c3] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-SRR16971107.64.fa)                          [ 26%] 551 of 2078, failed: 1

@jfy133
Member

jfy133 commented May 28, 2024

I guess the only solution here is to download the database manually :/ and pass that to the pipeline instead.
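A minimal sketch of that workaround, assuming `wget` is available. The `fetch_once` helper is hypothetical; it downloads the archive only if no non-empty local copy exists, so repeated launches do not hit the BUSCO server again:

```shell
#!/usr/bin/env bash
# Hypothetical helper: download a URL into dest_dir only if a non-empty
# file with the same name is not already there, then print the local path.
fetch_once() {
    local url="$1" dest_dir="$2"
    local dest="${dest_dir}/${url##*/}"   # keep the file name from the URL
    if [ ! -s "${dest}" ]; then
        wget --quiet --output-document "${dest}" "${url}" || return 1
    fi
    echo "${dest}"
}

# Example usage (not executed here; <profile> and <samplesheet> are placeholders):
# db=$(fetch_once https://busco-data.ezlab.org/v5/data/lineages/bacteria_odb10.2024-01-08.tar.gz "$PWD")
# nextflow run nf-core/mag -r 2.5.4 -profile <profile> --input <samplesheet> \
#     --outdir results --busco_db "${db}" -resume
```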

@amizeranschi
Contributor

Weirdly enough, I tried this now and STILL get the same error. To be more specific, I downloaded the archive with wget and I am running nf-core/mag with the options --busco_db bacteria_odb10.2024-01-08.tar.gz and -resume. I get the following in the output:

[f3/5ae729] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-SRR16971103.67.fa)                          [  4%] 103 of 2078, failed: 4
[-        ] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO_SUMMARY                                                       -
[8c/69840e] process > NFCORE_MAG:MAG:QUAST_BINS (MEGAHIT-MetaBAT2-unclassified-unrefined-SRR16971104)             [100%] 7 of 7 ✔
[-        ] process > NFCORE_MAG:MAG:QUAST_BINS_SUMMARY                                                           -
[-        ] process > NFCORE_MAG:MAG:CAT                                                                          -
[-        ] process > NFCORE_MAG:MAG:CAT_SUMMARY                                                                  -
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFYWF                                                     -
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_SUMMARY                                                        -
[-        ] process > NFCORE_MAG:MAG:BIN_SUMMARY                                                                  -
[3b/f2dd56] process > NFCORE_MAG:MAG:PROKKA (MEGAHIT-MetaBAT2-SRR16971104.441)                                    [ 99%] 2076 of 2078, cached: 701
[-        ] process > NFCORE_MAG:MAG:CUSTOM_DUMPSOFTWAREVERSIONS                                                  -
[-        ] process > NFCORE_MAG:MAG:MULTIQC                                                                      -
ERROR ~ Error executing process > 'NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-SRR16971103.34.fa)'

Caused by:
  Process `NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-SRR16971103.34.fa)` terminated with an error exit status (1)

Command executed:

  run_busco.sh "--lineage_dataset dataset/bacteria_odb10" "Y" "bacteria_odb10" "MEGAHIT-MetaBAT2-SRR16971103.34.fa" 8 "Y" "N"
  most_spec_db=$(<info_most_spec_db.txt)
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_MAG:MAG:BUSCO_QC:BUSCO":
      python: $(python --version 2>&1 | sed 's/Python //g')
      R: $(R --version 2>&1 | sed -n 1p | sed 's/R version //' | sed 's/ (.*//')
      busco: $(busco --version 2>&1 | sed 's/BUSCO //g')
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  ERROR: BUSCO analysis failed for some unknown reason! See also MEGAHIT-MetaBAT2-SRR16971103.34.fa_busco.err.

Work dir:
  /data/share/horia-banciu/work/f3/602e4961f0a35ee674d094dc7b6626

And these are the contents of MEGAHIT-MetaBAT2-SRR16971103.34.fa_busco.err:

2024-05-28 05:14:55 ERROR:	Cannot reach https://busco-data2.ezlab.org/v5/data/file_versions.tsv
2024-05-28 05:14:55 ERROR:	BUSCO analysis failed!
2024-05-28 05:14:55 ERROR:	Check the logs, read the user guide (https://busco.ezlab.org/busco_userguide.html), and check the BUSCO issue board on https://gitlab.com/ezlab/busco/issues

Why is BUSCO still trying to access https://busco-data2.ezlab.org/v5/data/file_versions.tsv, when I'm running the pipeline with a local database?

I'm attaching the full log file, in case it helps:
nextflow-busco-url-error.log.txt
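
To get a picture of how many tasks hit this, a small sketch like the following could scan the Nextflow work directory for BUSCO error logs containing the "Cannot reach" message. It assumes the logs follow the `*_busco.err` naming and the message format shown above; the function names are made up for illustration.

```python
import re
from pathlib import Path

# Matches lines such as:
#   2024-05-28 05:14:55 ERROR:<tab>Cannot reach https://...
CANNOT_REACH = re.compile(r"ERROR:\s+Cannot reach (\S+)")


def unreachable_urls(err_text):
    """Return every URL that a BUSCO error log reports as unreachable."""
    return [match.group(1) for match in CANNOT_REACH.finditer(err_text)]


def scan_work_dir(work_dir):
    """Map each *_busco.err file under `work_dir` to its unreachable URLs.

    Files without a "Cannot reach" line are left out of the result.
    """
    hits = {}
    for err_file in Path(work_dir).rglob("*_busco.err"):
        urls = unreachable_urls(err_file.read_text(errors="replace"))
        if urls:
            hits[str(err_file)] = urls
    return hits
```

Running `scan_work_dir("work")` from the launch directory would then list the affected tasks and the URL each one tried to reach.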

@jfy133
Member

jfy133 commented May 28, 2024

Ugh, that looks bad... Maybe it always does an internet lookup?

I've not actually used BUSCO manually myself... @skrakau, if you remember, do you have any ideas?

@jfy133 jfy133 reopened this May 28, 2024
@b-kolar

b-kolar commented Jun 14, 2024

Facing the exact same issue, currently testing the --offline flag to see if we can force it to not do an internet lookup.

@jfy133
Member

jfy133 commented Jun 15, 2024

Please let me know if it works, @b-kolar. I started investigating this yesterday at the airport but couldn't finish before I had to fly. Otherwise I'll get back to this on Thursday.

@b-kolar

b-kolar commented Jun 17, 2024

I can confirm that the --offline flag works with Busco!

We are testing a modified version of the mag pipeline now, which has so far passed the Busco steps without issues.
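
For anyone who wants to check the flag outside the pipeline, here is a minimal sketch that only assembles an offline BUSCO command line. It assumes the lineage has already been unpacked locally (the `dataset/bacteria_odb10` path is borrowed from the failing command earlier in this thread). The function name, the output name, and the `genome` mode are illustrative assumptions, and this is not necessarily how the pipeline itself wires the flag in.

```python
import shlex


def busco_offline_command(bin_fasta, lineage_dir, out_name, cpus=8):
    """Build a BUSCO command line that uses a local lineage with --offline.

    The command is returned as an argument list and is not executed here.
    """
    return [
        "busco",
        "--in", str(bin_fasta),
        "--out", str(out_name),
        "--mode", "genome",  # assumption: the bins are genome assemblies
        "--cpu", str(cpus),
        "--lineage_dataset", str(lineage_dir),  # local, already unpacked lineage
        "--offline",  # ask BUSCO not to contact its data server
    ]


cmd = busco_offline_command(
    "MEGAHIT-MetaBAT2-SRR16971103.34.fa",
    "dataset/bacteria_odb10",
    "busco_offline_test",  # hypothetical output name
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell inside the BUSCO container or environment. Returning a list rather than a string also makes it easy to pass straight to `subprocess.run` once the command looks right.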

@jfy133
Member

jfy133 commented Jun 18, 2024

Thank you @b-kolar! I might ping you when my implementation is ready to make sure we added it in roughly the same way, if that's OK?

@b-kolar

b-kolar commented Jun 18, 2024

@jfy133 No problem, feel free to send any questions my way!

@jfy133
Member

jfy133 commented Jun 27, 2024

Should be fixed here! @b-kolar could you test -r busco-offline? #633
