Minor tweak to text based on some Slack feedback + self-review
ewels committed Sep 28, 2024
1 parent cbe7538 commit b98174a
Showing 1 changed file with 53 additions and 51 deletions.
104 changes: 53 additions & 51 deletions sites/main-site/src/content/blog/2024/seqera-containers-part-2.mdx
@@ -45,9 +45,7 @@ It's mostly to serve as an architectural plan for the nf-core maintainers and in

Before we dig into the details of how the automation will work, let's summarise the end goal of this migration:

> "What we call the beginning is often the end. And to make an end is to make a beginning. The end is where to start from."
>
> _T.S. Eliot_
> "What we call the beginning is often the end. And to make an end is to make a beginning. The end is where to start from."<br /><small>_T.S. Eliot_</small>
## Usage summary

@@ -82,6 +80,37 @@ Glossary:
- [Apptainer](https://apptainer.org/): Alternative to Singularity, uses same image format
- [Mamba](https://mamba.readthedocs.io): Alternative to Conda, uses same conda environment files

:::tip{.fa-comment-question title="Singularity: oras or https?" collapse}

Unfamiliar with `oras://`? Don't worry, it's relatively new in the field.
It's a new protocol to reference container images, similar to `docker://` or `shub://`.
It allows Singularity to interact with any OCI ([Open Container Initiative](https://opencontainers.org/))
compliant registry to pull images.

Using `oras` has some advantages:

- Singularity handles pulls in the process task, rather than it happening in the Nextflow head job
- This means less resource usage on the head node, and more parallelisation
- Singularity can use authentication to pull from private registries
(see [Singularity docs](https://docs.sylabs.io/guides/main/user-guide/cli/singularity_registry.html)).

However, there are some downsides:

- Shared cache Nextflow options such as `$NXF_SINGULARITY_CACHEDIR` and `$NXF_SINGULARITY_LIBRARYDIR` do not work
- Singularity must be installed to download images for offline use
- It's not supported by all versions of Singularity / Apptainer.

As such, we will continue to use `https` downloads for Singularity `SIF` images for now.
However, we will start to provide new `-profile singularity_oras` profiles for anyone who
would prefer to fetch images using the newer `oras` protocol.

If you'd like to know more, check out the amazing [bytesize talk](https://nf-co.re/events/2024/bytesize_singularity_containers_hpc)
by Marco Claudio De La Pierre ([@marcodelapierre](https://github.com/marcodelapierre/)) from June 2024:

<YouTube id="https://www.youtube.com/watch?v=zoCC_dkhjD0" poster="http://i3.ytimg.com/vi/zoCC_dkhjD0/hqdefault.jpg" />

:::
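
As a rough illustration of the `oras://` reference style described above, here is a minimal config sketch. The process name and image name/tag are made-up placeholders, not real nf-core modules or Seqera Containers images; only the `oras://` prefix and the registry host are taken from this post.

```groovy
// Hypothetical sketch: 'EXAMPLE_TOOL' and the image name/tag are placeholders
singularity.enabled = true

process {
    withName: 'EXAMPLE_TOOL' {
        // With an oras:// reference, Singularity pulls the image itself inside the task
        container = 'oras://community.wave.seqera.io/library/example_tool:1.0.0--0000000000000000'
    }
}
```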

## Modules

All nf-core pipelines use a single container per process, and the majority of processes are
@@ -199,16 +228,16 @@ https://conda.anaconda.org/conda-forge/linux-64/bzip2-1.0.8-hd590300_5.conda#69b

## Pipelines

Containers at module-level are great, but we need to tie these into the pipelines where they will run.
Our guiding principles were to try to avoid changing current usage behaviour,
with full automation for pipeline maintainers.
Container information at shared module-level is great, but it's not enough.
Nextflow doesn't know about module `meta.yml` files (they're an nf-core invention),
so we need to somehow tie these into the pipeline code where they will run.

The heart of the solution is that each container / conda environment will have a config file
auto-generated by the nf-core/tools CLI.
These files will not be edited by hand, so no manual merging will be required.
The heart of the solution is to auto-generate a config file for each software packaging type (Docker, Singularity, Conda)
and platform (`linux/amd64` and `linux/arm64`).
These will be created by the nf-core/tools CLI and never be edited by hand, so no manual merging will be required.
They'll simply be regenerated and overwritten every time a module is changed.

The config files will specify the `container` or `conda` directive for every process in the pipeline:
Each config file will specify the `container` or `conda` directive for every process in the pipeline:

```groovy title="config/containers_docker_amd64.config"
// AUTOGENERATED CONFIG FILE - DO NOT EDIT
@@ -228,54 +257,27 @@ process { withName: 'NF_PIPELINE:ANALYSIS_PLOTS' { conda = 'https://wave.seqera.
//.. and so on, for each process in the pipeline
```

Singularity will have config files for both `oras` and `https` containers, so that users can choose which to use.

:::tip{.fa-comment-question title="Singularity: oras or https?" collapse}

Unfamiliar with `oras`? Don't worry, it's relatively new in the field.
It's a new protocol to reference container images, similar to `docker://` or `shub://`.
It allows Singularity to interact with any OCI ([Open Container Initiative](https://opencontainers.org/))
compliant registry to pull images.

Using `oras` has some advantages:

- Singularity handles pulls in the process task, rather than it happening in the Nextflow head job
- This means less resource usage on the head node, and more parallelisation
- Singularity can use authentication to pull from private registries
(see [Singularity docs](https://docs.sylabs.io/guides/main/user-guide/cli/singularity_registry.html)).

However, there are some downsides:
The main `nextflow.config` file will import these config files, depending on the [profile selected](#usage-summary)
by the person running the pipeline.

- Shared cache Nextflow options such as `$NXF_SINGULARITY_CACHEDIR` and `$NXF_SINGULARITY_LIBRARYDIR` do not work
- Singularity must be installed to download images for offline use
- It's not yet supported by all Singularity versions.

As such, we will continue to use `https` downloads for Singularity `SIF` images for now.
However, we will start to provide new `-profile singularity_oras` profiles for anyone who
would prefer to fetch images using the newer `oras` protocol.

If you'd like to know more, check out the amazing [bytesize talk](https://nf-co.re/events/2024/bytesize_singularity_containers_hpc)
by Marco Claudio De La Pierre ([@marcodelapierre](https://github.com/marcodelapierre/)) from June 2024:

<YouTube id="https://www.youtube.com/watch?v=zoCC_dkhjD0" poster="http://i3.ytimg.com/vi/zoCC_dkhjD0/hqdefault.jpg" />

:::

The main `nextflow.config` file will import these depending on the profile selected.
Singularity will have separate config files and associated `-profile`s for both `oras` and `https` containers,
so that users can choose which to use.

We're taking this opportunity to update the `apptainer` and `mamba` profiles too:
they will import the exact same config files as the `singularity` and `conda` profiles.

Here's roughly how the `nextflow.config` file with the `-profile` config includes will look:

::::info{.fa-code title="nextflow.config" collapse}

:::note
Boilerplate code (eg. disabling other container engines) has been removed from this
`nextflow.config` code snippet for clarity.
We may move this code into its own separate config file with `includeConfig`
We may move this whole code block into its own separate config file with `includeConfig`
so that the main `nextflow.config` file is easier to read.
:::

:::info{.fa-code title="nextflow.config (collapsed as it's quite long)" collapse}

```groovy
```groovy title="nextflow.config"
// Set container for docker amd64 by default
includeConfig 'config/containers_docker_amd64.config'
@@ -352,7 +354,7 @@ apptainer.registry = 'oras://community.wave.seqera.io/library'
singularity.registry = 'oras://community.wave.seqera.io/library'
```

:::
::::

Note that there are a few changes here:

@@ -362,10 +364,10 @@ Note that there are a few changes here:
- The `conda` profiles now use Conda lockfiles instead of `environment.yml` files
- New `conda_local` profiles for those wanting to keep the old behaviour
- New `mamba` profiles, using the `conda` config files
- Base registry set to Seqera Containers
- Base registries set to Seqera Containers

Because we're setting the base registry still, it should still be simple
to mirror containers to custom Docker registries and overwrite only
Because we're only defining the image name and making use of the base container registry config option,
it should still be simple to mirror containers to custom Docker registries and overwrite only
`docker.registry` as before.
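
A minimal sketch of such an override, assuming a custom config file passed to Nextflow with `-c`. The file name and registry host below are placeholders, not real nf-core or Seqera values:

```groovy
// custom.config (hypothetical file name)
// Pull Docker images from a mirror instead of the default base registry.
// 'registry.example.org' is a placeholder host, not a real mirror.
docker.registry = 'registry.example.org'
```

This relies on the generated config files defining image names without a registry prefix, so that the base registry option applies to them.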

# Automation - Modules