a basic workflow for running seekdeep pipelines in snakemake

seekdeep_illumina_snakemake

A basic workflow for running Nick Hathaway's seekdeep on Illumina data. This version splits jobs up into individual snakemake submissions.

Installation:

sudo apt install /full/path/to/deb/file
mamba create -c conda-forge -c bioconda -n snakemake snakemake
mamba activate snakemake
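
After installation, a quick check that the needed tools are on your PATH can save a failed run later. The sketch below is not part of this repository: the function name missing_tools is hypothetical, and it assumes that the .deb package installed above provides the singularity command used in the Help section.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with this repo): print the name of every
# requested tool that cannot be found on PATH. Prints nothing if all exist.
missing_tools() {
    local tool
    for tool in "$@"; do
        # command -v exits non-zero when the name cannot be resolved
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}
```

With the snakemake environment active, running missing_tools snakemake singularity should print nothing; any name it prints still needs to be installed or activated.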

Set up your environment:

  • Change directory to a folder where you want to run the analysis
  • Clone this repository with:
git clone https://github.com/bailey-lab/seekdeep_illumina_snakemake.git
  • Change directory to the cloned repo (seekdeep_illumina_snakemake)
  • Download the elucidator.sif file and the tutorial dataset with:
bash download_example_dataset.sh

Usage:

  • Edit the seekdeep_illumina_general.yaml file, following the instructions in its comments. Use a text editor that outputs unix line endings (e.g. vscode, notepad++, gedit, micro, emacs, vim, or vi).
  • If snakemake is not your active conda environment, activate snakemake with:
mamba activate snakemake
  • Run all steps in order with the commands below (this example assumes 8 cores are available on your machine):
snakemake -s setup_run.smk --cores 8
snakemake -s run_extractor.smk --cores 8
snakemake -s finish_process.smk --cores 8
  • You can also run all steps with:
bash run_all_steps.sh
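
Because the yaml file needs unix line endings, it can be worth checking the edited file before running the steps above. This is a minimal sketch rather than part of the workflow: the function names has_crlf and strip_crlf are hypothetical, and it assumes standard grep and tr are available.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with this repo).

# Succeed (exit 0) if the file contains at least one carriage return,
# which indicates Windows-style (CRLF) line endings.
has_crlf() {
    local cr
    cr="$(printf '\r')"
    grep -q "$cr" "$1"
}

# Write a copy of the source file with every carriage return removed.
strip_crlf() {
    local src="$1" dest="$2"
    tr -d '\r' < "$src" > "$dest"
}
```

For example, has_crlf seekdeep_illumina_general.yaml && echo "convert line endings first" prints the warning only when carriage returns are present. strip_crlf writes to a separate file so the original is left untouched.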

Help:

You can read Nick Hathaway's manual here: https://seekdeep.brown.edu/

If you're in the folder where you downloaded the elucidator.sif file, you can get help on any seekdeep command with:

singularity exec elucidator.sif SeekDeep [cmd] -h

Three main commands in the snakefile:

  • The first command gets info about the genome (genTargetInfoFromGenomes).
  • The second command sets up an analysis run (setupTarAmpAnalysis).
  • The third command runs three seekdeep programs (runAnalysis.sh, which has no help files).

Here are some example help commands to learn more about these commands:

  • singularity exec elucidator.sif SeekDeep -h
  • singularity exec elucidator.sif SeekDeep genTargetInfoFromGenomes -h
  • singularity exec elucidator.sif SeekDeep setupTarAmpAnalysis -h

Three sub-steps of running seekdeep:

Each of these steps can be tweaked for sensitivity and specificity (via the extra_[step]_cmds settings at the bottom of the yaml file):

  • The first command extracts amplicon reads (extractor)
  • The second command clusters together similar reads (qluster)
  • The third command processes clusters into haplotypes (processClusters)

Here are some example help commands to learn more about these programs:

  • singularity exec elucidator.sif SeekDeep extractor -h
  • singularity exec elucidator.sif SeekDeep qluster -h
  • singularity exec elucidator.sif SeekDeep processClusters -h
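
The help commands in this README all follow the same pattern, so they can be generated for any list of subcommands. The sketch below is only an illustration: the function name help_commands and the SIF variable are hypothetical, and it prints the commands rather than running them, so nothing is executed until you choose to.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with this repo): print one help command
# per SeekDeep subcommand name passed as an argument.
# SIF is an assumed variable name; it defaults to elucidator.sif in the
# current folder, matching the commands shown in this README.
help_commands() {
    local sif="${SIF:-elucidator.sif}" cmd
    for cmd in "$@"; do
        printf 'singularity exec %s SeekDeep %s -h\n' "$sif" "$cmd"
    done
}
```

For example, help_commands extractor qluster processClusters prints the three commands listed above. Piping that output into bash would run them one after another.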
