docker run fail for demo data #7

Closed
wangdatou2009 opened this issue Mar 13, 2018 · 7 comments

wangdatou2009 commented Mar 13, 2018

Hi Alessia,
I am trying to run YAMP locally with Docker on Linux 16.04, following the steps below. The Docker image was pulled completely, but Docker still reports that it is unable to find the image 'yampdocker:latest' locally. What is causing this error?

  1. No tools installed and no resources downloaded
  2. git clone https://github.com/alesssia/YAMP.git
  3. Install Nextflow
  4. Download ERR011089_1.fastq.gz and ERR011089_2.fastq.gz
  5. docker pull alesssia/yampdocker
  6. Annotate executor and queue in nextflow.config
  7. nextflow run YAMP.nf --reads1 ./data/ERR011089_1.fastq.gz --reads2 ./data/ERR011089_2.fastq.gz --prefix Meta_HIT_ERR011089 --outdir ./data --mode complete -with-docker yampdocker

Error
N E X T F L O W ~ version 0.28.0
Launching YAMP.nf [angry_shockley] - revision: 8ed2c9d795
[warm up] executor > local
[ff/30bff9] Submitted process > dedup
[78/3e0203] Submitted process > qualityAssessment (1)
[51/e88aa9] Submitted process > qualityAssessment (2)
ERROR ~ Error executing process > 'dedup'

Caused by:
Process dedup terminated with an error exit status (125)

Command executed:

#Measures execution time
sysdate=$(date)
starttime=$(date +%s.%N)
echo "Performing Quality Control. STEP 1 [De-duplication] at $sysdate" > .log.2
echo " " >> .log.2

#Sets the maximum memory to the value requested in the config file
maxmem=$(echo "32 GB" | sed 's/ //g' | sed 's/B//g')

#Defines command for de-duplication
if [ "paired" = "paired" ]; then
CMD="clumpify.sh -Xmx"$maxmem" in1=ERR011089_1.fastq.gz in2=ERR011089_2.fastq.gz out1=Meta_HIT_ERR011089_dedupe_R1.fq.gz out2=Meta_HIT_ERR011089_dedupe_R2.fq.gz qin=33 dedupe subs=0 threads=4"
else
CMD="clumpify.sh -Xmx"$maxmem" in=ERR011089_1.fastq.gz out=Meta_HIT_ERR011089_dedupe.fq.gz qin=33 dedupe subs=0 threads=4"
fi

#Logs version of the software and executed command (BBmap prints on stderr)
version=$(clumpify.sh --version 2>&1 >/dev/null | grep "BBMap version")
echo "Using clumpify.sh in $version " >> .log.2
echo "Executing command: $CMD " >> .log.2
echo " " >> .log.2

#De-duplicates
exec $CMD 2>&1 | tee tmp.log

#Logs some figures about sequences passing de-duplication
echo "Clumpify's de-duplication stats: " >> .log.2
echo " " >> .log.2
sed -n '/Reads In:/,/Duplicates Found:/p' tmp.log >> .log.2
echo " " >> .log.2
totR=$(grep "Reads In:" tmp.log | cut -f 1 | cut -d: -f 2 | sed 's/ //g')
remR=$(grep "Duplicates Found:" tmp.log | cut -f 1 | cut -d: -f 2 | sed 's/ //g')
survivedR=$(($totR-$remR))
percentage=$(echo $survivedR $totR | awk '{print $1/$2*100}' )
echo "$survivedR out of $totR paired reads survived de-duplication ($percentage%, $remR reads removed)" >> .log.2
echo " " >> .log.2

#Measures and logs execution time
endtime=$(date +%s.%N)
exectime=$(echo "$endtime $starttime" | awk '{print $1-$2}')
sysdate=$(date)
echo "STEP 1 (Quality control) terminated at $sysdate ($exectime seconds)" >> .log.2
echo " " >> .log.2
echo "++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++" >> .log.2
echo " " >> .log.2

Command exit status:
125

Command output:
(empty)

Command error:
Unable to find image 'yampdocker:latest' locally
docker: Error response from daemon: repository yampdocker not found: does not exist or no pull access.
See 'docker run --help'.

Work dir:
.....................YAMP/work/ff/30bff9a2c9c06f891f279d055ca0b3

Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out

-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (2)
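
Note for readers hitting the same error: Docker uses exit status 125 when `docker run` itself fails, before the container command starts. The message suggests Docker looked for an image named `yampdocker`, whereas step 5 pulled `alesssia/yampdocker`. The following is a minimal sketch for checking which name is available locally. `has_image` is a hypothetical helper, not part of Docker, Nextflow, or YAMP.

```shell
# has_image: hypothetical helper for illustration only.
# Reads "repository:tag" lines on stdin and succeeds if the requested
# image is among them. A missing tag is treated as ":latest", which is
# also Docker's default. References that include a registry port
# (host:5000/name) are not handled by this sketch.
has_image() {
    wanted="$1"
    case "$wanted" in
        *:*) ;;                          # tag already given
        *)   wanted="$wanted:latest" ;;  # fall back to :latest
    esac
    grep -Fxq -- "$wanted"
}

# With Docker installed, feed it the local image list:
#   docker images --format '{{.Repository}}:{{.Tag}}' | has_image alesssia/yampdocker
```

If only `alesssia/yampdocker:latest` is listed, then `-with-docker alesssia/yampdocker` is probably the form to pass to Nextflow.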

@wangdatou2009
Author

I also tried the following command:
nextflow run YAMP.nf --reads1 ./data/ERR011089_1.fastq.gz --reads2 ./data/ERR011089_2.fastq.gz --prefix Meta_HIT_ERR011089 --outdir ./data --mode complete -with-docker docker://alesssia/yampdocker

The result is basically the same:

N E X T F L O W ~ version 0.28.0
Launching YAMP.nf [cocky_lovelace] - revision: 8ed2c9d795
[warm up] executor > local
[5d/aef143] Submitted process > dedup
[6c/5db1ee] Submitted process > qualityAssessment (1)
[51/90f83d] Submitted process > qualityAssessment (2)
ERROR ~ Error executing process > 'dedup'

Caused by:
Process dedup terminated with an error exit status (125)

Command executed:

#Measures execution time
sysdate=$(date)
starttime=$(date +%s.%N)
echo "Performing Quality Control. STEP 1 [De-duplication] at $sysdate" > .log.2
echo " " >> .log.2

#Sets the maximum memory to the value requested in the config file
maxmem=$(echo "32 GB" | sed 's/ //g' | sed 's/B//g')

#Defines command for de-duplication
if [ "paired" = "paired" ]; then
CMD="clumpify.sh -Xmx"$maxmem" in1=ERR011089_1.fastq.gz in2=ERR011089_2.fastq.gz out1=Meta_HIT_ERR011089_dedupe_R1.fq.gz out2=Meta_HIT_ERR011089_dedupe_R2.fq.gz qin=33 dedupe subs=0 threads=4"
else
CMD="clumpify.sh -Xmx"$maxmem" in=ERR011089_1.fastq.gz out=Meta_HIT_ERR011089_dedupe.fq.gz qin=33 dedupe subs=0 threads=4"
fi

#Logs version of the software and executed command (BBmap prints on stderr)
version=$(clumpify.sh --version 2>&1 >/dev/null | grep "BBMap version")
echo "Using clumpify.sh in $version " >> .log.2
echo "Executing command: $CMD " >> .log.2
echo " " >> .log.2

#De-duplicates
exec $CMD 2>&1 | tee tmp.log

#Logs some figures about sequences passing de-duplication
echo "Clumpify's de-duplication stats: " >> .log.2
echo " " >> .log.2
sed -n '/Reads In:/,/Duplicates Found:/p' tmp.log >> .log.2
echo " " >> .log.2
totR=$(grep "Reads In:" tmp.log | cut -f 1 | cut -d: -f 2 | sed 's/ //g')
remR=$(grep "Duplicates Found:" tmp.log | cut -f 1 | cut -d: -f 2 | sed 's/ //g')
survivedR=$(($totR-$remR))
percentage=$(echo $survivedR $totR | awk '{print $1/$2*100}' )
echo "$survivedR out of $totR paired reads survived de-duplication ($percentage%, $remR reads removed)" >> .log.2
echo " " >> .log.2

#Measures and logs execution time
endtime=$(date +%s.%N)
exectime=$(echo "$endtime $starttime" | awk '{print $1-$2}')
sysdate=$(date)
echo "STEP 1 (Quality control) terminated at $sysdate ($exectime seconds)" >> .log.2
echo " " >> .log.2
echo "++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++" >> .log.2
echo " " >> .log.2

Command exit status:
125

Command output:
(empty)

Command error:
docker: Error parsing reference: "docker://alesssia/yampdocker" is not a valid repository/tag: invalid reference format.
See 'docker run --help'.

Work dir:
......................YAMP/work/5d/aef143354299874734a019ccc6a418

Tip: when you have fixed the problem you can continue the execution appending to the nextflow command line the option -resume

-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (1)
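
Note: the `docker://` prefix is a URI scheme used by Singularity to pull images from Docker Hub. `docker run` expects a bare reference such as `alesssia/yampdocker`, which would explain the "invalid reference format" message above. A small sketch follows, where `strip_scheme` is a hypothetical helper that drops the prefix if it is present.

```shell
# strip_scheme: hypothetical helper for illustration only.
# Prints the image reference with a leading "docker://" removed;
# references without that prefix are printed unchanged.
strip_scheme() {
    printf '%s\n' "${1#docker://}"
}

# Possible use on the command line (other arguments as in the run above):
#   nextflow run YAMP.nf ... -with-docker "$(strip_scheme docker://alesssia/yampdocker)"
```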

@wangdatou2009
Author

Then I tried this:
nextflow run YAMP.nf --reads1 ./data/ERR011089_1.fastq.gz --reads2 ./data/ERR011089_2.fastq.gz --prefix Meta_HIT_ERR011089 --outdir ./data --mode QC -with-docker alesssia/yampdocker

The error is now:
N E X T F L O W ~ version 0.28.0
Launching YAMP.nf [suspicious_hodgkin] - revision: 8ed2c9d795
[warm up] executor > local
[3d/e2c7b0] Submitted process > dedup
[00/2cf546] Submitted process > qualityAssessment (2)
[0d/e89764] Submitted process > qualityAssessment (1)
ERROR ~ Error executing process > 'qualityAssessment (2)'

Caused by:
Process qualityAssessment (2) terminated with an error exit status (1)

Command executed:

#Measures execution time
sysdate=$(date)
starttime=$(date +%s.%N)
echo "Performing Quality Control. [Assessment of read quality] at $sysdate" > .log.1_R2
echo "File being analysed: ERR011089_2.fastq.gz" >> .log.1_R2
echo " " >> .log.1_R2

#Logs version of the software and executed command
version=$(fastqc --version)
CMD="fastqc --quiet --noextract --format fastq --outdir=. --threads 4 ERR011089_2.fastq.gz"

echo "Using $version " >> .log.1_R2
echo "Executing command $CMD " >> .log.1_R2
echo " " >> .log.1_R2

#Does QC, extracts relevant information, and removes temporary files
bash fastQC.sh ERR011089_2.fastq.gz Meta_HIT_ERR011089_rawreads_R2 4 ERR011089_2.fastq.gz

#Logging QC statistics (number of sequences, Pass/warning/fail, basic statistics, duplication level, kmers)
base=$(basename ERR011089_2.fastq.gz)
bash logQC.sh $base Meta_HIT_ERR011089_rawreads_R2_fastqc_data.txt .log.1_R2

#Measures and log execution time
endtime=$(date +%s.%N)
exectime=$(echo "$endtime $starttime" | awk '{print $1-$2}')
sysdate=$(date)
echo "Quality assessment on ERR011089_2.fastq.gz terminated at $sysdate ($exectime seconds)" >> .log.1_R2
echo " " >> .log.1_R2
echo "++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++" >> .log.1_R2
echo " " >> .log.1_R2

Command exit status:
1

Command output:
(empty)

Command error:
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
touch: cannot touch ‘.command.trace’: Permission denied

Work dir:
.......................YAMP/work/00/2cf54638379a3c17505901c72d838c

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (1)

@wangdatou2009
Author

Fixed by adding the following line to nextflow.config:
docker.runOptions = '-u $(id -u):$(id -g)'
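
A short sketch of why this option may help: `touch: cannot touch '.command.trace': Permission denied` suggests that the process inside the container could not write to the Nextflow work directory, and `-u` tells `docker run` to use the given numeric user and group IDs, here those of the host user. The helper name `host_user_option` is hypothetical; it only prints what the option expands to on the current machine.

```shell
# host_user_option: hypothetical helper for illustration only.
# Prints the "-u UID:GID" option for the user running this shell,
# i.e. what '-u $(id -u):$(id -g)' becomes once bash expands it.
host_user_option() {
    printf '%s %s:%s\n' -u "$(id -u)" "$(id -g)"
}

# Example: show the option that would be handed to `docker run`
#   host_user_option
```

The single quotes in nextflow.config matter, as far as I understand: they keep `$(id -u)` and `$(id -g)` literal in the config, so that bash expands them only when the task is launched.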

@alesssia
Owner

Thanks for letting me know. I will add this to the troubleshooting page.
Hope everything is all right now?

@wangdatou2009
Author

Thanks for your reply. The only remaining issue is that the folder structure for uniref90 does not match the one in the config file.
One more question: do Docker users still need to set up the resources folder themselves? The image you created does not include any resources, right?

@alesssia
Owner

I will have a look at the config file.

Correct. The image does not contain any resources but you can download them either from Zenodo (https://zenodo.org/record/1068229#.Wh7a3rTQqL4), or by using the following command:

wget https://zenodo.org/record/1068229/files/YAMP_resources_20171128.tar.gz

If you use this data file, please note that, before running YAMP, the FASTA file describing the human (contaminating) genome should be indexed with the following command:

bbmap.sh -Xmx24G ref=hg19_main_mask_ribo_animal_allplant_allfungus.fa.gz

Hope this helps!
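
The two commands above can be combined into a small script, sketched here under some assumptions. The URL, the `-Xmx24G` setting, and the FASTA file name come from this comment; the function names are hypothetical, and the location of the FASTA file inside the extracted archive is not stated in this thread, so the caller has to supply it.

```shell
# URL and file names as given in the comment above.
RESOURCES_URL="https://zenodo.org/record/1068229/files/YAMP_resources_20171128.tar.gz"
ARCHIVE="$(basename "$RESOURCES_URL")"
HUMAN_FASTA="hg19_main_mask_ribo_animal_allplant_allfungus.fa.gz"

# fetch_resources (hypothetical name): download the archive, resuming a
# partial download if one exists, then unpack it in the current directory.
fetch_resources() {
    wget -c "$RESOURCES_URL" && tar -xzf "$ARCHIVE"
}

# index_human_genome (hypothetical name): index the human genome with BBMap.
# $1 is the directory that holds the FASTA file after extraction.
index_human_genome() {
    ( cd "$1" && bbmap.sh -Xmx24G ref="$HUMAN_FASTA" )
}

# Example, once the directory containing the FASTA file is known:
#   fetch_resources && index_human_genome path/to/fasta/dir
```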

olgabot added a commit to olgabot/nextflow that referenced this issue Apr 6, 2019
As mentioned [here](alesssia/YAMP#7), this error has been plaguing a few people:

```
Command error:
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
```

I could find the solution in the [YAMP wiki](https://github.com/alesssia/YAMP/wiki/How-to-use-Docker) but not in the Nextflow documentation so here it is.