From b98174a6269c190d4455c0036f9960f23f5b9ab5 Mon Sep 17 00:00:00 2001
From: Phil Ewels
Date: Sat, 28 Sep 2024 23:16:46 +0200
Subject: [PATCH] Minor tweak to text based on some Slack feedback + self-review

---
 .../blog/2024/seqera-containers-part-2.mdx | 104 +++++++++---------
 1 file changed, 53 insertions(+), 51 deletions(-)

diff --git a/sites/main-site/src/content/blog/2024/seqera-containers-part-2.mdx b/sites/main-site/src/content/blog/2024/seqera-containers-part-2.mdx
index b0ba82cf5e..89c8ddc2a6 100644
--- a/sites/main-site/src/content/blog/2024/seqera-containers-part-2.mdx
+++ b/sites/main-site/src/content/blog/2024/seqera-containers-part-2.mdx
@@ -45,9 +45,7 @@ It's mostly to serve as an architectural plan for the nf-core maintainers and in

 Before we dig into how the details of how the automation will work, let's summarise the end goal of this migration:

-> "What we call the beginning is often the end. And to make an end is to make a beginning. The end is where to start from."
->
-> _T.S. Eliot_
+> "What we call the beginning is often the end. And to make an end is to make a beginning. The end is where to start from."
_T.S. Eliot_

 ## Usage summary

@@ -82,6 +80,37 @@ Glossary:
 - [Apptainer](https://apptainer.org/): Alternative to Singularity, uses same image format
 - [Mamba](https://mamba.readthedocs.io): Alternative to Conda, uses same conda environment files

+:::tip{.fa-comment-question title="Singularity: oras or https?" collapse}
+
+Unfamiliar with `oras://`? Don't worry, it's relatively new in the field.
+It's a new protocol to reference container images, similar to `docker://` or `shub://`.
+It allows Singularity to interact with any OCI ([Open Container Initiative](https://opencontainers.org/))
+compliant registry to pull images.
+
+Using `oras` has some advantages:
+
+- Singularity handles pulls in the process task, rather than it happening in the Nextflow head job
+  - This means less resource usage on the head node, and more parallelisation
+- Singularity can use authentication to pull from private registries
+  (see [Singularity docs](https://docs.sylabs.io/guides/main/user-guide/cli/singularity_registry.html)).
+
+However, there are some downsides:
+
+- Shared cache Nextflow options such as `$NXF_SINGULARITY_CACHEDIR` and `$NXF_SINGULARITY_LIBRARYDIR` do not work
+- Singularity must be installed to download images for offline use
+- It's not supported by all versions of Singularity / Apptainer.
+
+As such, we will continue to use `https` downloads for Singularity `SIF` images for now.
+However, we will start to provide new `-profile singularity_oras` profiles for anyone who
+would prefer to fetch images using the newer `oras` protocol.
+
+If you'd like to know more, check out the amazing [bytesize talk](https://nf-co.re/events/2024/bytesize_singularity_containers_hpc)
+by Marco Claudio De La Pierre ([@marcodelapierre](https://github.com/marcodelapierre/)) from June 2024:
+
+
+
+:::
+
 ## Modules

 All nf-core pipelines use a single container per process, and the majority of processes are
@@ -199,16 +228,16 @@ https://conda.anaconda.org/conda-forge/linux-64/bzip2-1.0.8-hd590300_5.conda#69b

 ## Pipelines

-Containers at module-level are great, but we need to tie these into the pipelines where they will run.
-Our guiding principles were to try to avoid changing current usage behaviour,
-with full automation for pipeline maintainers.
+Container information at shared module-level is great, but it's not enough.
+Nextflow doesn't know about module `meta.yml` files (they're an nf-core invention),
+so we need to somehow tie these into the pipeline code where they will run.

-The heart of the solution is that each container / conda environment will have a config file
-auto-generated by the nf-core/tools CLI.
-These files will not be edited by hand, so no manual merging will be required.
+The heart of the solution is to auto-generate a config file for each software packaging type (Docker, Singularity, Conda)
+and platform (`linux/amd64` and `linux/arm64`).
+These will be created by the nf-core/tools CLI and never be edited by hand, so no manual merging will be required.
 They'll simply be regenerated and overwritten every time a module is changed.

-The config files will specify the `container` or `conda` directive for every process in the pipeline:
+Each config file will specify the `container` or `conda` directive for every process in the pipeline:

 ```groovy title="config/containers_docker_amd64.config"
 // AUTOGENERATED CONFIG FILE - DO NOT EDIT
@@ -228,54 +257,27 @@ process { withName: 'NF_PIPELINE:ANALYSIS_PLOTS' { conda = 'https://wave.seqera.
 //..
 and so on, for each process in the pipeline
 ```

-Singularity will have config files for both `oras` and `https` containers, so that users can choose which to use.
-
-:::tip{.fa-comment-question title="Singularity: oras or https?" collapse}
-
-Unfamiliar with `oras`? Don't worry, it's relatively new in the field.
-It's a new protocol to reference container images, similar to `docker://` or `shub://`.
-It allows Singularity to interact with any OCI ([Open Container Initiative](https://opencontainers.org/))
-compliant registry to pull images.
-
-Using `oras` has some advantages:
-
-- Singularity handles pulls in the process task, rather than it happening in the Nextflow head job
-  - This means less resource usage on the head node, and more parallelisation
-- Singularity can use authentication to pull from private registries
-  (see [Singularity docs](https://docs.sylabs.io/guides/main/user-guide/cli/singularity_registry.html)).
-
-However, there are some downsides:
+The main `nextflow.config` file will import these config files, depending on the [profile selected](#usage-summary)
+by the person running the pipeline.

-- Shared cache Nextflow options such as `$NXF_SINGULARITY_CACHEDIR` and `$NXF_SINGULARITY_LIBRARYDIR` do not work
-- Singularity must be installed to download images for offline use
-- It's not yet supported by all Singularity versions.
-
-As such, we will continue to use `https` downloads for Singularity `SIF` images for now.
-However, we will start to provide new `-profile singularity_oras` profiles for anyone who
-would prefer to fetch images using the newer `oras` protocol.
-
-If you'd like to know more, check out the amazing [bytesize talk](https://nf-co.re/events/2024/bytesize_singularity_containers_hpc)
-by Marco Claudio De La Pierre ([@marcodelapierre](https://github.com/marcodelapierre/)) from June 2024:
-
-
-
-:::
-
-The main `nextflow.config` file will import these depending on the profile selected.
+Singularity will have separate config files and associated `-profile`s for both `oras` and `https` containers,
+so that users can choose which to use.

 We're taking this opportunity to update the `apptainer` and `mamba` profiles too, they will import the exact same config files as the `singularity` and `conda` profiles.

+Here's roughly how the `nextflow.config` file with the `-profile` config includes will look:
+
+::::info{.fa-code title="nextflow.config" collapse}
+
 :::note
 Boilerplate code (eg. disabling other container engines) has been removed from this `nextflow.config` code snippet for clarity.
-We may move this code into it's own separate config file with `includeConfig`
+We may move this whole code block into its own separate config file with `includeConfig`
 so that the main `nextflow.config` file is easier to read.
 :::

-:::info{.fa-code title="nextflow.config (collapsed as it's quite long)" collapse}
-
-```groovy
+```groovy title="nextflow.config"
 // Set container for docker amd64 by default
 includeConfig 'config/containers_docker_amd64.config'

@@ -352,7 +354,7 @@ apptainer.registry = 'oras://community.wave.seqera.io/library'
 singularity.registry = 'oras://community.wave.seqera.io/library'
 ```

-:::
+::::

 Note that there are a few changes here:

@@ -362,10 +364,10 @@ Note that there are a few changes here:
 - The `conda` profiles now use Conda lockfiles instead of `environment.yml` files
 - New `conda_local` profiles for those wanting to keep the old behaviour
 - New `mamba` profiles, using the `conda` config files
-- Base registry set to Seqera Containers
+- Base registries set to Seqera Containers

-Because we're setting the base registry still, it should still be simple
-to mirror containers to custom Docker registries and overwrite only
+Because we're only defining the image name and making use of the base container registry config option,
+it should still be simple to mirror containers to custom Docker registries and overwrite only
 `docker.registry` as
before.

 # Automation - Modules