singularity parallel pull error of docker images #1210

Closed
tbugfinder opened this issue Jul 1, 2019 · 8 comments

Comments

@tbugfinder
Contributor

tbugfinder commented Jul 1, 2019

Bug report

Running several singularity pull commands in parallel against Docker Hub sometimes fails.
The root cause appears to be a race condition (or a design limitation?) within Singularity. The error is intermittent and does not occur on every run.

Expected behavior and actual behavior

When a parallel singularity pull fails within Nextflow, the pull should be retried for the failing image. Perhaps pulls should not run in parallel at all.
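
One way to avoid parallel pulls altogether is to fetch the images one at a time before launching the pipeline. The following is a minimal sketch, not something Nextflow does itself: the helper names `image_file_name` and `prepull` and the cache location are hypothetical, the file-name mapping is inferred from the log in this report (ubuntu:18.04 becomes ubuntu-18.04.img), and treating `/` the same way as `:` is an assumption.

```shell
#!/bin/sh
# Sketch: pull each image sequentially into a shared cache before the run.

# Map a Docker image reference to a cache file name, following the pattern
# seen in the log (ubuntu:18.04 -> ubuntu-18.04.img).
# Replacing '/' as well as ':' is an assumption.
image_file_name() {
    printf '%s.img\n' "$(printf '%s\n' "$1" | sed 's|[:/]|-|g')"
}

# Pull one image unless its file is already present in the cache directory.
prepull() {
    image=$1
    cache_dir=$2
    target=$(image_file_name "$image")
    mkdir -p "$cache_dir" || return 1
    if [ -e "$cache_dir/$target" ]; then
        return 0
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$cache_dir" && singularity pull --name "$target" "docker://$image" )
}

# Usage (commented out so that sourcing this file has no side effects):
# CACHE_DIR=$PWD/singularity-cache
# for image in ubuntu:18.10 ubuntu:18.04; do
#     prepull "$image" "$CACHE_DIR" || exit 1
# done
# NXF_SINGULARITY_CACHEDIR=$CACHE_DIR nextflow run test.nf -with-singularity
```

With the cache populated up front, Nextflow should find the images already in place and have no reason to start concurrent pulls.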

Steps to reproduce the problem

process test1810 {
    container = 'ubuntu:18.10'
    script:
    """
    cat /etc/os-release
    """
}

process test1804 {
    container = 'ubuntu:18.04'
    script:
    """
    cat /etc/os-release
    """
}
#
#while true ; do date; rm -Rf work/singularity ; ~/nextflow/bin/nextflow run test.nf -with-singularity ; sleep 5 ; rm -Rf work/singularity ; sleep 6 ; done


Program output

Mon Jul  1 20:02:51 CEST 2019
N E X T F L O W  ~  version 19.04.1
Launching `test.nf` [extravagant_sammet] - revision: 3660c710ee
WARN: There's no process matching config selector: trim_galore
[warm up] executor > local
WARN: Singularity cache directory has not been defined -- Remote image will be stored in the path: /tmp/psimg_k/work/singularity
Pulling Singularity image docker://ubuntu:18.10 [cache /tmp/psimg_k/work/singularity/ubuntu-18.10.img]
Pulling Singularity image docker://ubuntu:18.04 [cache /tmp/psimg_k/work/singularity/ubuntu-18.04.img]
ERROR ~ Error executing process > 'test1804'

Caused by:
  Failed to pull singularity image
  command: singularity pull  --name ubuntu-18.04.img docker://ubuntu:18.04 > /dev/null
  status : 255
  message:
    INFO:    Starting build...
    Getting image source signatures
    Skipping fetch of repeat blob sha256:5b7339215d1d5f8e68622d584a224f60339f5bef41dbd74330d081e912f0cddd
    Skipping fetch of repeat blob sha256:14ca88e9f6723ce82bc14b241cda8634f6d19677184691d086662641ab96fe68
    Skipping fetch of repeat blob sha256:a31c3b1caad473a474d574283741f880e37c708cc06ee620d3e93fa602125ee0
    Skipping fetch of repeat blob sha256:b054a26005b7f3b032577f811421fab5ec3b42ce45a4012dfa00cf6ed6191b0f
    Copying config sha256:84c9d0762469176a58d9c54375a1e9a4dcdc0045e3e14e191d97bd12cd5b23a1

     0 B / 2.42 KiB [--------------------------------------------------------------]
     2.42 KiB / 2.42 KiB [======================================================] 0s
    Writing manifest to image destination
    Storing signatures
    INFO:    Creating SIF file...
    FATAL:   Unable to pull docker://ubuntu:18.04: While creating SIF: while creating container: container file creation failed: open ubuntu-18.04.img: no such file or directory



 -- Check '.nextflow.log' file for details


Environment

  • Nextflow version: 19.04.1
  • Java version: "1.8.0_161"
  • Operating system: Linux

Additional context

There is also an open, more or less stale issue in the Singularity repository:
https://github.com/sylabs/singularity/issues/3634

FYI: @apeltzer

@pditommaso
Member

I think this has to be managed by Singularity.

@tbugfinder
Contributor Author

Well, yes and no.
Certainly, parallel pulls have to be fixed within Singularity. However, Nextflow should also evaluate the return code, so that a failed pull could be retried.
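
To make the retry idea concrete, here is a minimal shell sketch of what "evaluate the return code and retry" could look like around the pull command from the log. The function name `retry`, the attempt count, and the delay are illustrative assumptions; this is not how Nextflow handles pulls.

```shell
#!/bin/sh
# Sketch: retry a command a bounded number of times, based on its exit status.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while true; do
        # Success ends the loop immediately.
        "$@" && return 0
        status=$?
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts (last status $status)" >&2
            return "$status"
        fi
        echo "attempt $attempt failed with status $status, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Usage (commented out; values are examples only):
# retry 3 5 singularity pull --name ubuntu-18.04.img docker://ubuntu:18.04
```

The wrapper returns the exit status of the last failed attempt, so a caller can still see the original 255.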

@piotr-faba-ardigen

I am also experiencing this at the moment. Retrying after a 255 exit status would be intuitive.

@piotr-faba-ardigen

However, the same exit status is returned when the image does not actually exist, so I wonder whether there is a way to verify this before attempting a retry.
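
One possible way to tell the two cases apart is to ask the registry whether the image exists before retrying. The sketch below assumes the `skopeo` CLI is installed, which nobody in this thread has mentioned; the function names `image_exists` and `pull_checked` are hypothetical.

```shell
#!/bin/sh
# Sketch: retry a failed pull only when the registry confirms the image exists.

# Return 0 if the registry reports the image, non-zero otherwise.
# Assumes the skopeo CLI is available (an assumption, not from this thread).
image_exists() {
    skopeo inspect "docker://$1" > /dev/null 2>&1
}

# Pull once; on failure, retry a single time only if the image exists.
pull_checked() {
    image=$1
    target=$2
    singularity pull --name "$target" "docker://$image" && return 0
    if image_exists "$image"; then
        echo "pull of $image failed but the image exists, retrying" >&2
        singularity pull --name "$target" "docker://$image"
    else
        echo "image $image not found in registry, not retrying" >&2
        return 1
    fi
}

# Usage (commented out; values are examples only):
# pull_checked ubuntu:18.04 ubuntu-18.04.img
```

Note that the existence check can itself fail for unrelated reasons such as a network outage or rate limiting, so a negative answer is a hint rather than proof that the image is missing.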

rsuchecki added a commit to plantinformatics/pretzel-input-generator that referenced this issue Jan 3, 2020
rsuchecki added a commit to plantinformatics/pretzel-input-generator that referenced this issue Jan 8, 2020
* working around https://github.com/sylabs/singularity/issues/3634

* test singularity pull form docker

* explicit use of gawk - may matter on alpine

* workaround for

nextflow-io/nextflow#1210
sylabs/singularity#3634

@stale

stale bot commented Apr 27, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Apr 27, 2020
@tbugfinder
Contributor Author

Still an issue.

@toniher
Contributor

toniher commented Feb 10, 2022

In case it is useful, here is a workaround script for dealing with this problem: https://gist.github.com/toniher/65ccf76a8903bf432435d490b2025fab

