This documentation has been adapted from the nf-core guidelines (see https://nf-co.re/usage/installation for details).
To start using this pipeline, follow the steps below:
Nextflow runs on most POSIX systems (Linux, macOS, etc.). It can be installed by running the following commands:
# Make sure that Java v8+ is installed:
java -version
# Install Nextflow
curl -fsSL get.nextflow.io | bash
# Add the Nextflow binary to a directory on your PATH:
mkdir -p ~/bin
mv nextflow ~/bin/
# OR system-wide installation:
# sudo mv nextflow /usr/local/bin
See nextflow.io for further instructions on how to install and configure Nextflow.
To install the pipeline itself, you just need to download or clone the source code and transfer the pipeline files manually:
wget https://mypipeline/archive/master.zip
mkdir -p ~/mypipelines
unzip master.zip -d ~/mypipelines/
cd ~/mypipelines/mypipeline-master   # GitHub archives unpack into a <repo>-<branch> subfolder
nextflow run main.nf
If you would like to make changes to the pipeline, it is better to fork the repository on your GitHub account and then clone the pipeline from your personal repository. Once cloned, you can run the pipeline directly as above.
The current version of the pipeline is compatible with the geniac utility for automatic production deployment. See the docs/geniac.md page and the geniac documentation for details.
The default conf/base.config and conf/process.config configuration files are loaded by the pipeline. They define some default parameters, in particular the minimal computing resource requirements to run the processes, and are suitable for running the pipeline on a simple local server. Note that a few variables related to software dependencies can be changed in these configuration files.
Be aware of two important points about this default configuration:
- The default profile uses the local executor.
  - All jobs run in the login session. If you use a simple server, this may be fine. If you use a computing cluster, take care not to run all the jobs on the submission node.
  - See the Nextflow docs for information about running with other hardware backends; most job scheduler systems are natively supported (see the sketch after this list).
- Nextflow will expect all software to be installed and available on the PATH environment variable.
  - An additional config profile is expected for conda, singularity and Docker (see below).
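For example, a minimal sketch of how a scheduler backend is typically selected in a custom Nextflow configuration file; the executor and queue names are placeholders and not part of this pipeline's shipped configuration:

// Hypothetical custom config: send jobs to SLURM instead of running locally
process {
  executor = 'slurm'
  queue = 'myqueue'
}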
The required resources for each process are defined in the conf/process.config configuration file. These resources are usually defined using the label directive, which associates each job with the amount of memory and the number of CPUs it can use. This file can be edited if one wants to change the resources allocated to a specific process.
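As an illustration, a label-based resource definition in conf/process.config typically looks like the following sketch (the label name and values are hypothetical, not this pipeline's actual settings):

// Hypothetical entry in conf/process.config
process {
  withLabel: lowMem {
    cpus = 2
    memory = '8 GB'
  }
}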
Software dependencies can be managed in several ways. We provide an overview here, and we recommend reading the Profiles section carefully to make proper use of the different possibilities.
By default, Nextflow expects all the tools to be installed and available in the PATH environment variable. This path can be set using the -profile path option combined with the --globalPath /my/path option, which specifies where the tools are installed. In addition, the -profile multipath option is available to specify the PATH for each tool instead of a global one.
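For example (the tool path is a placeholder):

# All tools available from a single folder:
nextflow run main.nf -profile path --globalPath /my/path

# One PATH defined per tool:
nextflow run main.nf -profile multipath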
If you are not able to use singularity or Docker, you can instead use conda to manage the software requirements. This is slower and less reproducible than the above, but it is still better than having to install all requirements yourself! The pipeline ships with a conda environment file, and Nextflow has built-in support for it.
To use it, first ensure that you have conda installed (we recommend miniconda), then follow the same pattern as above with the flag -profile conda. Note that in this case, the conda environment will be created in the $HOME/conda-cache-nextflow folder by default. This folder can be changed using the --condaCacheDir option.
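For example (the cache folder is a placeholder):

# Run with the conda profile, overriding the default environment cache:
nextflow run main.nf -profile conda --condaCacheDir /my/conda/cache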
In addition to a general conda environment, this pipeline also comes with a -profile multiconda setting. In this case, one conda environment per tool will be created (note that each process is assigned a tool with the label directive). This configuration is useful when different processes require different tool versions, leading to potential conda conflicts that make it impossible to use the -profile conda option.
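For example:

# One conda environment per tool:
nextflow run main.nf -profile multiconda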
Using singularity is in general a great way to manage environments and ensure reproducibility. The process is very similar: running the pipeline with the option -profile singularity tells Nextflow to use singularity containers to run the pipeline. Containers with all of the software requirements can be automatically generated using the recipes information. Once available, the user can specify where to look for the images with the --singularityImagePath /my/path/to/singularity/containers option.
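For example (the image folder is a placeholder):

# Run with singularity containers built from the recipes:
nextflow run main.nf -profile singularity --singularityImagePath /my/path/to/singularity/containers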
A generic configuration profile is available with -profile docker. In this case, the pipeline will look for Docker containers as defined in conf/docker.config.
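For example:

# Run with Docker containers as defined in conf/docker.config:
nextflow run main.nf -profile docker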
We also set up a cluster profile to execute the pipeline on a computational cluster. Please edit the cluster.config file to set up your own cluster configuration.
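Nextflow allows several profiles to be combined with a comma, so a typical invocation could look like the following (the combination and paths shown are illustrative):

# Run on the cluster with singularity containers:
nextflow run main.nf -profile cluster,singularity --singularityImagePath /my/path/to/singularity/containers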