overview of available options
- for ad hoc workflows and tools: Docker and Singularity
- for lightweight applications: Jupyter notebooks
- for pipelines: shell scripts on GitHub
- for software tools: Bioconda/BioContainers
- software development + software operations
- automate and monitor
https://www.zdnet.com/article/what-is-docker-and-why-is-it-so-darn-popular/
pros and cons of virtual machines
- ++ very similar to a full OS
- ++ high OS diversity
- -- need more space and resources
- -- slower than containers
- -- harder to automate
pros and cons of containers
- ++ faster
- ++ no need for a full OS
- ++ easy distribution of recipes, high portability
- ++ easy to automate
- -- still OS-dependent solutions
- -- not a real OS in some cases
- platform for developing, shipping, and running applications
- infrastructure as application/code
- Open Container Initiative
- Docker community edition
- read-only templates
- containers are run from them
- images are not run
- images have several layers
- can be built from existing images
- ubuntu, alpine
- any modification of the base image is a new layer (tip: chain commands with && to limit the number of layers)
- base images can be created with tools such as Debootstrap
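A minimal sketch of the `&&` tip, assuming an `ubuntu:22.04` base image and `curl` as a placeholder package (neither comes from the original slides). Chaining the commands in one RUN instruction produces a single layer, and the cleanup happens inside that same layer, so the package index is never stored in the image.

```dockerfile
# Assumed base image and package, for illustration only
FROM ubuntu:22.04

# One RUN instruction, one layer: update, install, and clean up together
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*
```

Written as three separate RUN lines, the same commands would create three layers, and the files deleted in the last one would still take up space in the earlier ones.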
- Recipe: Dockerfile
- Instructions
- FROM
- ADD, COPY
- RUN
- ENV PATH, ARG
- USER, WORKDIR, LABEL
- VOLUME, EXPOSE
- CMD, (ENTRYPOINT)
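The instructions listed above could fit together roughly as in the following sketch. The base image tag, the `run.sh` script, the `appuser` account, the `/opt/tool` path, and the port number are all hypothetical placeholders, not taken from the slides.

```dockerfile
# Hypothetical recipe: names, paths, and versions are placeholders
FROM alpine:3.19

# ARG exists only at build time; ENV persists in the running container
ARG TOOL_VERSION=1.0
ENV PATH=/opt/tool/bin:$PATH

# Metadata that describes the image
LABEL software="tool" software.version="${TOOL_VERSION}"

# COPY takes files from the build context
# (ADD can additionally fetch URLs and unpack local tar archives)
COPY run.sh /opt/tool/bin/run.sh
RUN chmod 755 /opt/tool/bin/run.sh && adduser -D appuser

# Run as an unprivileged user, in a directory meant to hold mounted data
USER appuser
WORKDIR /data
VOLUME /data

# Documents the port a service would listen on; it does not publish it
EXPOSE 8080

# Default command, overridden by any command given to docker run
CMD ["run.sh"]
```

With a `run.sh` present in the build context, this should build with `docker build -t tool .`; the script needs its own shebang line because CMD uses the exec form.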
- start from packages e.g. pip/PyPI, CPAN, or CRAN
- use versions for tools and containers
- use ENV PATH instead of ENTRYPOINT
- reduce size as much as possible
- keep data outside the container
- check the license
- make your container discoverable e.g. BioContainers, Quay.io, Docker Hub
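Several of these recommendations can be combined in one short recipe. The sketch below assumes a `python:3.11-slim` base image and uses MultiQC 1.14 from PyPI purely as an example tool; both choices are illustrative assumptions, not part of the original slides.

```dockerfile
# Pinned base image tag (assumed, for illustration only)
FROM python:3.11-slim

# Install from a package repository (PyPI) with an explicit version;
# --no-cache-dir keeps pip's download cache out of the image
RUN pip install --no-cache-dir multiqc==1.14

# Data stays outside the image: mount a host directory here at run time
WORKDIR /data

# No ENTRYPOINT is set: the tool is on PATH and is called by name,
# so a workflow engine can supply its own command line
```

Under these assumptions the tool would be called by name with the data mounted from the host, for example `docker run -v "$PWD":/data <image> multiqc .`, where `<image>` stands for whatever tag the image was built with.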
FROM biocontainers/biocontainers:v1.0.0_cv4
LABEL base_image="biocontainers:v1.0.0_cv4"
LABEL version="3"
LABEL software="Comet"
LABEL software.version="2016012"
LABEL about.summary="an open source tandem mass spectrometry sequence database search tool"
LABEL about.home=http://comet-ms.sourceforge.net
LABEL about.documentation=http://comet-ms.sourceforge.net/parameters/parameters_2016010
LABEL about.license_file=http://comet-ms.sourceforge.net
LABEL about.license="SPDX:Apache-2.0"
LABEL extra.identifiers.biotools="comet"
LABEL about.tags="Proteomics"
LABEL maintainer="Felipe da Veiga Leprevost <felipe@leprevost.com.br>"
USER biodocker
RUN ZIP=comet_binaries_2016012.zip && \
    wget https://github.com/BioDocker/software-archive/releases/download/Comet/$ZIP -O /tmp/$ZIP && \
    unzip /tmp/$ZIP -d /home/biodocker/bin/Comet/ && \
    chmod -R 755 /home/biodocker/bin/Comet/* && \
    rm /tmp/$ZIP
RUN mv /home/biodocker/bin/Comet/comet_binaries_2016012/comet.2016012.linux.exe /home/biodocker/bin/Comet/comet
ENV PATH /home/biodocker/bin/Comet:$PATH
WORKDIR /data/
- impact of docker containers on performance
- container-based virtualization for HPC environments
- article recommendations containers
- article Grüning on virtualization
- Bioinfo Core at CRG slides