This pipeline performs INDEL calling on isotype strains versus the reference strain. The VCFs output from this pipeline are used within the lab and also released to the world via CaeNDR.
"""
______ _______ _ _ _ _______
( __ \ ( ____ \( \ ( \ |\ /| ( ( /|( ____ \
| ( \ )| ( \/| ( | ( ( \ / ) | \ ( || ( \/
| | ) || (__ | | | | \ (_) /_____ | \ | || (__
| | | || __) | | | | \ /(_____)| (\ \) || __)
| | ) || ( | | | | ) ( | | \ || (
| (__/ )| (____/\| (____/\| (____/\| | | ) \ || )
(______/ (_______/(_______/(_______/\_/ |/ )_)|/
nextflow main.nf --help
nextflow main.nf -profile rockfish --debug
nextflow main.nf -profile rockfish --sample_sheet=isotype_groups.tsv --species=c_elegans
parameters description Set/Default
========== =========== ========================
--debug Set to 'true' to test (optional)
--sample_sheet TSV with column isotype (needs header) (required)
--masking BED file containing regions to skip during indel calling (optional)
--minsize The minimum size in bp to report for INDELs (optional, default: 50bp)
--maxsize The maximum size in bp to report for INDELs (optional, default: 1000bp)
--output Output folder name (optional) (optional)
--species Species: 'c_elegans', 'c_tropicalis' or 'c_briggsae' (required/optional)
or
--bam_dir Path to folder containing bams (optional/required)
--reference Path to reference genome fasta (optional/required)
HELP: https://andersenlab.org/dry-guide/latest/pipelines/pipeline-delly
----------------------------------------------------------------------------------------------
"""
- The latest update requires Nextflow version 23+. On Rockfish, you can access this version by loading the
nf23_env
conda environment prior to running the pipeline command:
module load python/anaconda
source activate /data/eande106/software/conda_envs/nf24_env
Note: if you are having issues running Nextflow or need reminders, check out the Nextflow page.
This command uses a test dataset
nextflow run -latest andersenlab/delly-nf --debug
You should run this in a screen or tmux session.
nextflow run -latest andersenlab/delly-nf --sample_sheet <path_to_sample_sheet> --species <species>
or
nextflow run -latest andersenlab/delly-nf --sample_sheet <path_to_sample_sheet> --bam_dir <path_to_bam_folder> --reference <path_to_reference>
There are three configuration profiles for this pipeline.
rockfish
- Used for running on Rockfish (default).quest
- Used for running on Quest.local
- Used for local development.
Note
If you forget to add a -profile
, the rockfish
profile will be chosen as default
You should use --debug
for testing/debugging purposes. This will run the debug test set (located in the test_data
folder).
For example:
nextflow run -latest andersenlab/delly-nf --debug
Using --debug
will automatically set the sample sheet to test_data/sample_sheet.tsv
A sample sheet produced by the concordance pipeline with a column specifying isotype reference strains with the column header "isotype".
Options: c_elegans, c_briggsae, or c_tropicalis
Path to the folder containing species strain bams.
Path to the reference strain fasta file.
Path to bed file containing regions to be skipped during INDEL calling. For C. elegans this defaults to HVR calls in test_data/c_elegans_mask.bed.
The minimum size cutoff for reporting an insertion or deletion (default: 50bp)
The maximum size cutoff for reporting an insertion or deletion (default: 1000bp)
default - delly-YYYYMMDD
A directory in which to output results. If you have set --debug
, the default output directory will be delly-YYYYMMDD-debug
.
└── ANNOTATE_VCF
├── AB1_indels_filtered.vcf.gz
├── AB1_indels_filtered.vcf.gz.tbi
├── AB1_indels_unfiltered.vcf.gz
├── AB1_indels_unfiltered.vcf.gz.tbi
├── MY23_indels_filtered.vcf.gz
└── MY23_indels_filtered.vcf.gz.tbi
├── MY23_indels_unfiltered.vcf.gz
└── MY23_indels_unfiltered.vcf.gz.tbi
andersenlab/delly
(link): Docker image is created within this pipeline using GitHub actions. Whenever a change is made toenv/Dockerfile
or.github/workflows/build_docker.yml
GitHub actions will create a new docker image and push if successfulandersenlab/annotation
(link): Docker image is created within this pipeline using GitHub actions. Whenever a change is made toenv/annotation.Dockerfile
or.github/workflows/build_docker.yml
GitHub actions will create a new docker image and push if successful
Make sure that you add the following code to your ~/.bash_profile
. This line makes sure that any singularity images you download will go to a shared location on /vast/eande106
for other users to take advantage of (without them also having to download the same image).
# add singularity cache
export SINGULARITY_CACHEDIR='/vast/eande106/singularity/'