Error running on test data #20

Closed

Rhinogradentia opened this issue Apr 19, 2021 · 16 comments
@Rhinogradentia

Rhinogradentia commented Apr 19, 2021

Hi,

I'm trying to run the provided test data with YAMP on a Slurm-managed HPC with Singularity and I'm running into the following error, which I don't really understand right now.

YAMP]$ nextflow run YAMP.nf -profile test,singularity
N E X T F L O W  ~  version 20.10.0
Launching `YAMP.nf` [ecstatic_curie] - revision: 521633a8a3
---------------------------------------------
YET ANOTHER METAGENOMIC PIPELINE (YAMP) 
---------------------------------------------

Analysis introspection:
Starting time              : Mon Apr 19 17:02:25 CEST 2021
Environment                : 
Pipeline Name              : YAMP
Pipeline Version           : 0.9.5.2
Config Profile             : test,singularity
Resumed                    : false
Nextflow version           : 20.10.0 build 5430 (01-11-2020 15:14 UTC)
Java version               : 11.0.2
Java Virtual Machine       : OpenJDK 64-Bit Server VM(11.0.2+9)
Operating system           : Linux amd64 v3.10.0-1127.19.1.el7.x86_64
User name                  : 
Container Engine           : singularity
Container                  : [:]
BBmap                      : https://depot.galaxyproject.org/singularity/bbmap:38.87--h1296035_0
FastQC                     : https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0
biobakery                  : biobakery/workflows:3.0.0.a.6.metaphlanv3.0.7
qiime                      : qiime2/core:2020.8
MultiQC                    : https://depot.galaxyproject.org/singularity/multiqc:1.9--py_1
Running parameters         : 
Reads                      : [/scratch/.../YAMP/data/test_data/random_ncbi_reads_with_duplicated_and_contaminants_R1.fastq.gz, /scratch/.../YAMP/data/test_data/random_ncbi_reads_with_duplicated_and_contaminants_R2.fastq.gz]
Prefix                     : test
Running mode               : complete
Layout                     : Paired-End
Performing de-duplication  : true
Synthetic contaminants     : 
Artefacts                  : /scratch/.../YAMP/assets/data/sequencing_artifacts.fa.gz
Phix174ill                 : /scratch/.../YAMP/assets/data/phix174_ill.ref.fa.gz
Adapters                   : /scratch/.../YAMP/assets/data/adapters.fa
Trimming parameters        : 
Input quality offset       : ASCII+33
Min phred score            : 10
Min length                 : 60
kmer lenght                : 23
Shorter kmer               : 11
Max Hamming distance       : 1
Decontamination parameters : 
Contaminant (pan)genome    : /scratch/.../YAMP/assets/demo/genome.fa
Min alignment identity     : 0.95
Max indel length           : 3
executor >  local (3)
[-        ] process > get_software_versions         [  0%] 0 of 1
[d3/963299] process > dedup (test)                  [100%] 1 of 1, failed: 1 ✘
[-        ] process > remove_synthetic_contaminants -
[-        ] process > trim                          -
[-        ] process > index_foreign_genome (1)      -
[-        ] process > decontaminate                 -
[-        ] process > quality_assessment (test)     -
[-        ] process > merge_paired_end_cleaned      -
[-        ] process > profile_taxa                  -
[-        ] process > profile_function              -
[-        ] process > alpha_diversity               -
[-        ] process > log                           -
Error executing process > 'dedup (test)'

Caused by:
  Process `dedup (test)` terminated with an error exit status (255)

Command executed:

  #Sets the maximum memory to the value requested in the config file
  maxmem=$(echo "6 GB" | sed 's/ //g' | sed 's/B//g')
  echo "random_ncbi_reads_with_duplicated_and_contaminants_R1.fastq.gz random_ncbi_reads_with_duplicated_and_contaminants_R2.fastq.gz"
     clumpify.sh -Xmx"$maxmem" in1="random_ncbi_reads_with_duplicated_and_contaminants_R1.fastq.gz" in2="random_ncbi_reads_with_duplicated_and_contaminants_R2.fastq.gz" out1="test_dedup_R1.fq.gz" out2="test_dedup_R2.fq.gz" qin=33 dedupe subs=0 threads=4 &> dedup_mqc.txt
  
  # MultiQC doesn't have a module for clumpify yet. As a consequence, I
  # had to create a YAML file with all the info I need via a bash script
  bash scrape_dedup_log.sh > dedup_mqc.yaml

Command exit status:
  255

Command output:
  112.4MiB / 290.9MiB [=============>---------------------] 39 % 63.9 KiB/s 47m39s

Command error:
  INFO:    Downloading network image
  INFO:    Cleaning up incomplete download: /users/.../.singularity/cache/net/tmp_793170624
  FATAL:   context deadline exceeded (Client.Timeout or context cancellation while reading body)

Work dir:
  /scratch/.../YAMP/work/d3/963299d445d79b5575900006a2e695

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

command.log

INFO:    Downloading network image
31.7KiB / 290.9MiB [------------------------------------] 0 % 263.9 KiB/s 18m48s
47.7KiB / 290.9MiB [------------------------------------] 0 % 198.8 KiB/s 24m57s
[...]
112.4MiB / 290.9MiB [=============>---------------------] 39 % 63.9 KiB/s 47m39s
INFO:    Cleaning up incomplete download: /users/.../.singularity/cache/net/tmp_793170624
FATAL:   context deadline exceeded (Client.Timeout or context cancellation while reading body)

command.err

INFO:    Downloading network image
INFO:    Cleaning up incomplete download: /users/.../.singularity/cache/net/tmp_793170624
FATAL:   context deadline exceeded (Client.Timeout or context cancellation while reading body)

command.sh

#!/bin/bash -ue
#Sets the maximum memory to the value requested in the config file
maxmem=$(echo "6 GB" | sed 's/ //g' | sed 's/B//g')
echo "random_ncbi_reads_with_duplicated_and_contaminants_R1.fastq.gz random_ncbi_reads_with_duplicated_and_contaminants_R2.fastq.gz"
   clumpify.sh -Xmx"$maxmem" in1="random_ncbi_reads_with_duplicated_and_contaminants_R1.fastq.gz" in2="random_ncbi_reads_with_duplicated_and_contaminants_R2.fastq.gz" out1="test_dedup_R1.fq.gz" out2="test_dedup_R2.fq.gz" qin=33 dedupe subs=0 threads=4 &> dedup_mqc.txt

# MultiQC doesn't have a module for clumpify yet. As a consequence, I
# had to create a YAML file with all the info I need via a bash script
bash scrape_dedup_log.sh > dedup_mqc.yaml

I've already increased the run time to 90 min because the pipeline had a timeout on the dedup process at 15 min.
Maybe there is an issue with the downloads? I could use some Singularity containers or modules of my own for most of the tools you use; would this be possible? Or is it possible to preload the images manually?
When running this with -resume, I already managed to finish the first step (software versions).

Any help highly appreciated.
Best,
Nadine

EDIT:

Also, I'm not sure whether the qiime link above is correct?

@alesssia
Owner

Hi @Rhinogradentia,

This looks to me like a problem with downloading the Singularity image rather than a problem with YAMP.
To solve it, I would proceed as you suggested, that is, download the Singularity images specified in the nextflow.config file into a folder of your choice. Then tell Nextflow to look in that folder by setting the NXF_SINGULARITY_CACHEDIR variable before running YAMP, as in the following:

export NXF_SINGULARITY_CACHEDIR=/path/to/folder/with/singularity/images
nextflow run YAMP.nf -profile test,singularity

Let me know if this works!

@Rhinogradentia
Author

Rhinogradentia commented Apr 19, 2021

Hi @alesssia,

Thanks a lot for the fast reply. I will try this and let you know if it was the solution. I think the download speed is quite slow at the moment on my cluster.

One other thing I'd like to ask: for qiime2 the entry is 'qiime : qiime2/core:2020.8'. Where can I download the correct Singularity image for qiime?
Would this be the correct link? singularity pull library://zhifeng/default/qiime2-2020.2:v0605

I think I found it: singularity pull docker://qiime2/core:2020.8

Best,
Nadine

@alesssia
Owner

Correct! I am closing this issue, but please feel free to re-open it if needed!

@Rhinogradentia
Author

Rhinogradentia commented Apr 21, 2021

Hi @alesssia ,
I'm sorry, but I have to reopen this.

I've downloaded/pulled the docker and singularity images like this:

singularity pull https://depot.galaxyproject.org/singularity/bbmap:38.87--h1296035_0
singularity pull https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0
singularity pull  https://depot.galaxyproject.org/singularity/multiqc:1.9--py_1
singularity pull docker://biobakery/workflows:3.0.0.a.6.metaphlanv3.0.7
singularity pull docker://qiime2/core:2020.8

stored them in a subfolder 'images' and set NXF_SINGULARITY_CACHEDIR:

$ export NXF_SINGULARITY_CACHEDIR=/scratch/.../YAMP/images/

$ echo $NXF_SINGULARITY_CACHEDIR
/scratch/.../YAMP/images/

$ ls $NXF_SINGULARITY_CACHEDIR
bbmap_38.87--h1296035_0      bbmap-38.87--h1296035_0.sif                        core_2020.8.sif       fastqc-0.11.9--0.sif   multiqc_1.9--py_1.sif   qiime2-core_2020.8.sif
bbmap:38.87--h1296035_0      bbmap:38.87--h1296035_0.sif                        fastqc_0.11.9--0      multiqc_1.9--py_1      multiqc-1.9--py_1.sif   workflows_3.0.0.a.6.metaphlanv3.0.7.sif
bbmap:38.87--h1296035_0.img  biobakery-workflows-3.0.0.a.6.metaphlanv3.0.7.img  fastqc_0.11.9--0.img  multiqc:1.9--py_1      multiqc:1.9--py_1.sif
bbmap_38.87--h1296035_0.sif  biobakery-workflows_3.0.0.a.6.metaphlanv3.0.7.sif  fastqc_0.11.9--0.sif  multiqc_1.9--py_1.img  qiime2-core_2020.8.img

As you can see, I have already tried several naming conventions for the images (by creating links).
But apart from the biobakery image, nothing is found by the pipeline:

.nextflow.log

[...]
Apr-21 12:39:58.772 [Actor Thread 6] DEBUG nextflow.container.SingularityCache - Singularity found local store for image=docker://biobakery/workflows:3.0.0.a.6.metaphlanv3.0.7; path=/scratch/.../YAMP/images/biobakery-workflows-3.0.0.a.6.metaphlanv3.0.7.img
[...]
Apr-21 13:10:13.636 [Task monitor] DEBUG nextflow.file.FileHelper - NFS path (true): /scratch/.../YAMP/work
Apr-21 13:10:13.659 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[jobId: 21528714; id: 3; name: index_foreign_genome (1); status: COMPLETED; exit: 255; error: -; workDir: /scratch/.../YAMP/work/f2/69a44f86bc4b944241852cdeab1834 started: 1619001603471; exited: 2021-04-21T11:10:00.983972Z; ]
Apr-21 13:10:13.718 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'index_foreign_genome (1)'

Caused by:
  Process `index_foreign_genome (1)` terminated with an error exit status (255)

Command executed:

  #Sets the maximum memory to the value requested in the config file
  maxmem=$(echo 6 GB | sed 's/ //g' | sed 's/B//g')
  
  # This step will have a boilerplate log because the information saved by bbmap are not relevant
  bbmap.sh -Xmx"$maxmem" ref=genome.fa &> foreign_genome_index_mqc.txt

Command exit status:
  255

Command output:
  116.8MiB / 290.9MiB [=============>---------------------] 40 % 66.7 KiB/s 44m31s
  [...]
  117.2MiB / 290.9MiB [=============>---------------------] 40 % 66.7 KiB/s 44m24s

Command error:
  INFO:    Downloading network image
  INFO:    Cleaning up incomplete download: /users/.../.singularity/cache/net/tmp_293246274
  FATAL:   context deadline exceeded (Client.Timeout or context cancellation while reading body)

Work dir:
  /scratch/.../YAMP/work/f2/69a44f86bc4b944241852cdeab1834

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
Apr-21 13:10:13.728 [Task monitor] DEBUG nextflow.Session - Session aborted -- Cause: Process `index_foreign_genome (1)` terminated with an error exit status (255)
Apr-21 13:10:13.760 [Task monitor] DEBUG nextflow.Session - The following nodes are still active:
[process] trim
  status=ACTIVE
  port 0: (queue) OPEN  ; channel: -
  port 1: (cntrl) -     ; channel: $

[process] decontaminate
  status=ACTIVE
  port 0: (queue) OPEN  ; channel: -
  port 1: (cntrl) -     ; channel: $

[process] quality_assessment
  status=ACTIVE
  port 0: (queue) OPEN  ; channel: -
  port 1: (cntrl) -     ; channel: $

[process] profile_taxa
  status=ACTIVE
  port 0: (queue) OPEN  ; channel: -
  port 1: (queue) OPEN  ; channel: bowtie2db
  port 2: (cntrl) -     ; channel: $

[process] profile_function
  status=ACTIVE
  port 0: (queue) OPEN  ; channel: -
  port 1: (queue) OPEN  ; channel: -
  port 2: (queue) OPEN  ; channel: chocophlan
  port 3: (queue) OPEN  ; channel: uniref
  port 4: (cntrl) -     ; channel: $

[process] alpha_diversity
  status=ACTIVE
  port 0: (queue) OPEN  ; channel: -
  port 1: (cntrl) -     ; channel: $
[process] log
  status=ACTIVE
  port 0: (value) bound ; channel: multiqc_config
  port 1: (value) bound ; channel: workflow_summary
  port 2: (value) bound ; channel: software_versions_mqc.yaml
  port 3: (value) OPEN  ; channel: fastqc/*
  port 4: (queue) OPEN  ; channel: dedup_mqc.yaml
  port 5: (queue) OPEN  ; channel: synthetic_contaminants_mqc.yaml
  port 6: (queue) OPEN  ; channel: trimming_mqc.yaml
  port 7: (queue) OPEN  ; channel: foreign_genome_indexing_mqc.yaml
  port 8: (queue) OPEN  ; channel: decontamination_mqc.yaml
  port 9: (queue) OPEN  ; channel: merge_paired_end_cleaned_mqc.yaml
  port 10: (queue) OPEN  ; channel: profile_taxa_mqc.yaml
  port 11: (queue) OPEN  ; channel: profile_functions_mqc.yaml
  port 12: (queue) OPEN  ; channel: alpha_diversity_mqc.yaml
  port 13: (cntrl) -     ; channel: $

Apr-21 13:10:13.763 [main] DEBUG nextflow.Session - Session await > all process finished
Apr-21 13:10:13.764 [main] DEBUG nextflow.Session - Session await > all barriers passed
Apr-21 13:10:13.781 [main] WARN  n.processor.TaskPollingMonitor - Killing pending tasks (2)
Apr-21 13:10:13.796 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[jobId: 21528716; id: 4; name: quality_assessment (test); status: COMPLETED; exit: 255; error: -; workDir: /scratch/.../YAMP/work/e5/30dfad5c056c5ad8e0c2491a2dcf47 started: 1619001603494; exited: 2021-04-21T11:10:00.42031Z; ]
Apr-21 13:10:13.809 [main] DEBUG nextflow.trace.WorkflowStatsObserver - Workflow completed > WorkflowStats[succeededCount=2; failedCount=2; ignoredCount=0; cachedCount=0; pendingCount=0; submittedCount=0; runningCount=-1; retriesCount=0; abortedCount=2; succeedDuration=4m 40s; failedDuration=8h 2m 44s; cachedDuration=0ms;loadCpus=-8; loadMemory=0; peakRunning=3; peakCpus=24; peakMemory=20 GB; ]
Apr-21 13:10:13.982 [main] DEBUG nextflow.CacheDB - Closing CacheDB done
Apr-21 13:10:14.015 [main] DEBUG nextflow.util.SpuriousDeps - AWS S3 uploader shutdown
Apr-21 13:10:14.091 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye

The path seems to work, so I think the naming is the problem: which naming convention is the pipeline looking for?

Thanks again for any help!

Best,
Nadine

@alesssia
Owner

I am not sure of the naming convention, but my images are called:

bbmap:38.87--h1296035_0   fastqc:0.11.9--0   multiqc:1.9--py_1 
biobakery-workflows-3.0.0.a.6.metaphlanv3.0.7.img   qiime2-core-2020.8.img
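Judging from the debug line in your log (Nextflow resolved docker://biobakery/workflows:3.0.0.a.6.metaphlanv3.0.7 to biobakery-workflows-3.0.0.a.6.metaphlanv3.0.7.img), Nextflow seems to replace '/' and ':' with '-' and append '.img' when building the cache file name, so renaming the pulled docker images along these lines might help (an untested sketch):

    cd $NXF_SINGULARITY_CACHEDIR
    # hypothetical renames, following the scheme observed in the debug log above
    mv workflows_3.0.0.a.6.metaphlanv3.0.7.sif biobakery-workflows-3.0.0.a.6.metaphlanv3.0.7.img
    mv core_2020.8.sif qiime2-core-2020.8.img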

I cannot spot any error in your commands. Are you submitting YAMP to your job scheduler? Is this folder accessible to all computing nodes?

You can also modify the nextflow.config file (singularity profile) as follows:

 singularity {
    singularity.enabled = true
    singularity.cacheDir = '/path/to/folder/with/singularity/images'
  }

Please let me know if this works!

alesssia reopened this Apr 21, 2021
@Rhinogradentia
Author

Rhinogradentia commented Apr 22, 2021

Hi @alesssia,
Unfortunately, this didn't work either. I'm sure the directories are all readable from the cluster nodes.

I tried to increase the singularity pullTimeout, but somehow this didn't have any effect.
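For reference, the pullTimeout change I tried looked roughly like this in the singularity profile (the value is just a placeholder):

    singularity {
        singularity.enabled = true
        singularity.pullTimeout = '90 min'
    }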

What brought me a step further was adding the container paths to the processes in the test.config file.
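Roughly along these lines (a sketch of what I mean, using Nextflow's per-process container directive; the process names are taken from the run log and the paths are truncated as above):

    process {
        withName: profile_taxa {
            container = '/scratch/.../YAMP/images/biobakery-workflows-3.0.0.a.6.metaphlanv3.0.7.sif'
        }
        withName: alpha_diversity {
            container = '/scratch/.../YAMP/images/qiime2-core-2020.8.sif'
        }
    }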
Now I get another error, but I will first try to resolve it myself.

executor >  slurm (9)
[ba/3ef63b] process > get_software_versions          [100%] 1 of 1 ✔
[b6/bef0ec] process > dedup (test)                   [100%] 1 of 1 ✔
[1a/7144b2] process > remove_synthetic_contaminan... [100%] 1 of 1 ✔
[81/5df126] process > trim (test)                    [100%] 1 of 1 ✔
[48/bc1a62] process > index_foreign_genome (1)       [100%] 1 of 1 ✔
[bf/b25515] process > decontaminate (test)           [100%] 1 of 1 ✔
[42/56d310] process > quality_assessment (test)      [100%] 1 of 1
[-        ] process > merge_paired_end_cleaned       -
[26/cb0680] process > profile_taxa (test)            [100%] 1 of 1, failed: 1 ✘
[-        ] process > profile_function               -
[-        ] process > alpha_diversity                -
[-        ] process > log                            -
Error executing process > 'profile_taxa (test)'

Caused by:
  Process `profile_taxa (test)` terminated with an error exit status (1)

Command executed:

  #If a file with the same name is already present, Metaphlan2 used to crash, leaving this here just in case
  rm -rf test_bt2out.txt
  
  metaphlan --input_type fastq --tmp_dir=. --biom test.biom --bowtie2out=test_bt2out.txt --bowtie2db metaphlan_databases --bt2_ps very-sensitive --add_viruses --sample_id test --nproc 8 test_QCd.fq.gz test_metaphlan_bugs_list.tsv &> profile_taxa_mqc.txt
  
  # MultiQC doesn't have a module for Metaphlan yet. As a consequence, I
  # had to create a YAML file with all the info I need via a bash script
  bash scrape_profile_taxa_log.sh test_metaphlan_bugs_list.tsv > profile_taxa_mqc.yaml

Command exit status:
  1

Command output:
  (empty)

Work dir:
  /scratch/.../YAMP/work/26/cb0680a91e8c8cbc25b11a81b2e51d

Thanks a lot for your help
Best,
Nadine

@alesssia
Owner

Let me know how it goes!

@Rhinogradentia
Author

Hi @alesssia

The last problem came from the MetaPhlAn database versions; after downloading them again, it vanished.

But now I get this:

executor >  slurm (2)
[ba/3ef63b] process > get_software_versions          [100%] 1 of 1, cached: 1 ✔
[b6/bef0ec] process > dedup (test)                   [100%] 1 of 1, cached: 1 ✔
[1a/7144b2] process > remove_synthetic_contaminan... [100%] 1 of 1, cached: 1 ✔
[81/5df126] process > trim (test)                    [100%] 1 of 1, cached: 1 ✔
[48/bc1a62] process > index_foreign_genome (1)       [100%] 1 of 1, cached: 1 ✔
[bf/b25515] process > decontaminate (test)           [100%] 1 of 1, cached: 1 ✔
[90/c02a56] process > quality_assessment (test)      [100%] 2 of 2, cached: 2 ✔
[-        ] process > merge_paired_end_cleaned       -
[93/49c401] process > profile_taxa (test)            [100%] 1 of 1, cached: 1 ✔
[-        ] process > profile_function (test)        -
[ed/4489f1] process > alpha_diversity (test)         [100%] 1 of 1, failed: 1 ✘
[-        ] process > log                            -
Error executing process > 'alpha_diversity (test)'

Caused by:
  Process `alpha_diversity (test)` terminated with an error exit status (1)

Command executed:

  #It checks if the profiling was successful, that is if identifies at least three species
  n=$(grep -o s__ test.biom | wc -l  | cut -d" " -f 1)
  if (( n <= 3 )); then
  	#The file should be created in order to be returned
  	touch test_alpha_diversity.tsv 
  else
  	echo test > test_alpha_diversity.tsv
  	qiime tools import --input-path test.biom --type 'FeatureTable[Frequency]' --input-format BIOMV100Format --output-path test_abundance_table.qza
  	echo "2" > test_alpha_diversity.tsv 
  	for alpha in ace berger_parker_d brillouin_d chao1 chao1_ci dominance doubles enspie esty_ci fisher_alpha gini_index goods_coverage heip_e kempton_taylor_q lladser_pe margalef mcintosh_d mcintosh_e menhinick michaelis_menten_fit osd pielou_e robbins shannon simpson simpson_e singles strong
  	do
  		qiime diversity alpha --i-table test_abundance_table.qza --p-metric $alpha --output-dir $alpha &> /dev/null
  		echo "3.1" >> test_alpha_diversity.tsv 
  		qiime tools export --input-path $alpha/alpha_diversity.qza --output-path ${alpha} &> /dev/null
  		echo "3.2" >> test_alpha_diversity.tsv 
  		value=$(sed -n '2p' ${alpha}/alpha-diversity.tsv | cut -f 2)
  		echo "3.3." >> test_alpha_diversity.tsv 
  	    	echo -e  $alpha'	'$value 
  		echo "3.4" >> test_alpha_diversity.tsv 
  	done >> test_alpha_diversity.tsv  
  	echo "loop finished" >> test_alpha_diversity.tsv 
  fi
  
  # MultiQC doesn't have a module for qiime yet. As a consequence, I
  # had to create a YAML file with all the info I need via a bash script
  bash generate_alpha_diversity_log.sh ${n} > alpha_diversity_mqc.yaml

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2cli/builtin/tools.py", line 158, in import_data
      view_type=input_format)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/sdk/result.py", line 241, in import_data
      validate_level='max')
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/sdk/result.py", line 273, in _from_view
      provenance_capture=provenance_capture)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/core/archive/archiver.py", line 316, in from_data
      Format.write(rec, type, format, data_initializer, provenance_capture)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/core/archive/format/v5.py", line 21, in write
      provenance_capture)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/core/archive/format/v1.py", line 19, in write
      provenance_capture)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/core/archive/format/v0.py", line 62, in write
      data_initializer(data_dir)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/core/path.py", line 37, in _move_or_copy
      return _ConcretePath.rename(self, other)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/pathlib.py", line 1309, in rename
      self._accessor.rename(self, target)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/pathlib.py", line 393, in wrapped
      return strfunc(str(pathobjA), str(pathobjB), *args)
  FileExistsError: [Errno 17] File exists: '/tmp/q2-BIOMV210DirFmt-_5c_f1px' -> '/tmp/qiime2-archive-zn0r4jma/0bf95333-bd7f-4bd0-8726-645c0016e339/data'
  
  An unexpected error has occurred:
  
    [Errno 17] File exists: '/tmp/q2-BIOMV210DirFmt-_5c_f1px' -> '/tmp/qiime2-archive-zn0r4jma/0bf95333-bd7f-4bd0-8726-645c0016e339/data'
  
  See above for debug info.

Work dir:
  /scratch/.../YAMP/work/ed/4489f1bb767b4759e6e3009acbfe8f

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

I found this https://forum.qiime2.org/t/errno-17-file-exists-during-classify/6554 where they assume it has something to do with access/write permissions, and I think this concerns the container.
Did you ever get this or a similar error?

After adding some debugging lines, I assume it is the first qiime line (tools import) that throws this error.

Thank you a lot
Best,
Nadine

@alesssia
Owner

alesssia commented May 6, 2021

Hi @Rhinogradentia,

I never had this error. I have also asked other users I know, and this is a very new thing.

I see that you are using a Slurm executor; are you using the Singularity container(s)?
What happens if you go into the reported work dir (/scratch/.../YAMP/work/ed/4489f1bb767b4759e6e3009acbfe8f) and execute the commands within .command.run? Please be aware that the qiime2 call should be executed within the qiime2 container (singularity run qiime2/core:2020.8 ...).

@Rhinogradentia
Author

Rhinogradentia commented May 7, 2021

Yes, I'm using the Singularity containers, pre-downloaded.

I started over again without -resume and got the same error as above. If I afterwards shell into the container and execute the commands from .command.run, everything seems to work fine. Executing .command.sh alone inside the container also works:

abe7051ac8ae936a1283bb2d022745]$ singularity shell ../../../images/qiime2-core-2020.8.img 
Singularity> bash .command.sh 
Imported test.biom as BIOMV100Format to test_abundance_table.qza

Executing .command.sh via exec with the container works except for the last shell script:

abe7051ac8ae936a1283bb2d022745]$ singularity exec ../../../images/qiime2-core-2020.8.img bash .command.sh 
Imported test.biom as BIOMV100Format to test_abundance_table.qza
bash: generate_alpha_diversity_log.sh: No such file or directory

Even outside the container, running .command.run seems to work:

abe7051ac8ae936a1283bb2d022745]$ bash .command.run 
Imported test.biom as BIOMV100Format to test_abundance_table.qza

I searched for this error again and found another person with this problem with stand-alone qiime (without any workflow): https://forum.qiime2.org/t/error-when-executing-qiime-tools-import-script-on-a-server/7790

As suggested there, I tried it with a manually defined tmpdir, and this worked. So if anyone else stumbles upon this: it seemingly has something to do with the server/cluster network and inconsistencies there. The workaround is to create a local tmpdir and export it accordingly before running qiime2:

export TMPDIR=/path/to/your/desired/local/tmp
mkdir -p "$TMPDIR"

After this, the test-data workflow ran smoothly.
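In case the variable is not propagated into the container automatically, something along these lines in the singularity profile of nextflow.config might also work (an untested sketch using Nextflow's envWhitelist and runOptions settings; the path is a placeholder):

    singularity {
        singularity.envWhitelist = 'TMPDIR'
        singularity.runOptions = '-B /path/to/your/desired/local/tmp:/tmp'
    }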

Thank you a lot for your support and help.

Best,
Nadine

@alesssia
Owner

Hi @Rhinogradentia ,
sorry for the extremely, unforgivably late feedback.

I wanted to include this on the wiki troubleshooting page (linking to this issue and acknowledging the fact that you found the solution, of course!), and I just wanted to confirm: you simply exported the tmpdir before running YAMP? No luck on my side in reproducing this issue...

Thanks a lot,
Alessia

@Rhinogradentia
Author

Rhinogradentia commented Aug 16, 2021

Hi @alesssia,

no problem :-)

And yes, I can confirm that I just created a local tmp directory and exported it; this solved the qiime error.

I hope this can be helpful for someone else.

Best,
Nadine

@alesssia
Owner

Great, thanks a lot!
Alessia

@Rhinogradentia
Author

Rhinogradentia commented Mar 6, 2023

Hi Hannah,

I think this is one process earlier in the pipeline and not the same error. Take a look in /nfs/users/rg/hbenisty/obbs_yamp/work/85/bfa6cda091b13a195246185ae93b4f/.command.log - maybe there is a better explanation for this error.

Regarding the tmp directory: I just created it in the same location where I initiated the pipeline.
Best,
Nadine

@HannahBenisty

HannahBenisty commented Mar 7, 2023 via email

@Rhinogradentia
Author

Rhinogradentia commented Mar 7, 2023

Hi Hannah,

I never had this error. But from looking at the error message, I would say the biobakery image can't be executed or is not recognized as a valid image. Check whether it is actually available, readable, and executable.

[...]
Command error:
ERROR : Unknown image format/type: /nfs/users/rg/hbenisty/obbs_yamp_copy/work/singularity/biobakery-workflows-3.0.0.a.6.metaphlanv3.0.7.img
ABORT : Retval = 255

Work dir:
/nfs/users/rg/hbenisty/obbs_yamp_copy/work/93/75e6b47b4fed6d864e04a7ef37795f

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`


Execution cancelled -- Finishing pending tasks before exit
[c9/2aa2da] NOTE: Process `dedup (paired_end_complete)` terminated with an error exit status (140) -- Execution is retried (1)

Again, take a look at the .command files in /nfs/users/rg/hbenisty/obbs_yamp_copy/work/93/75e6b47b4fed6d864e04a7ef37795f to get more information on the error. You can also try to manually execute the command that threw the error (.command.sh) inside the container to see what happens.
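For example (a rough, untested sketch; adjust the paths to your setup):

    # check that the cached image exists, is readable, and is a valid image
    ls -l /nfs/users/rg/hbenisty/obbs_yamp_copy/work/singularity/biobakery-workflows-3.0.0.a.6.metaphlanv3.0.7.img
    singularity inspect /nfs/users/rg/hbenisty/obbs_yamp_copy/work/singularity/biobakery-workflows-3.0.0.a.6.metaphlanv3.0.7.img

    # if it looks corrupt (e.g. a truncated download), remove it and pull it again
    singularity pull docker://biobakery/workflows:3.0.0.a.6.metaphlanv3.0.7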

Best,
Nadine
