DeSeq-Free

DeSeq-Free (Whole genome Deep Sequencing analysis of Cell Free tumor DNA) is a Snakemake workflow for analyzing whole-genome sequencing (WGS) of circulating cell-free DNA (cfDNA) from the plasma of cancer patients, together with their matched germline and tumour samples, in a reproducible, automated, and partially contained manner. It is modular, so alternative or similar analyses can be added or removed.

Using the DeSeq-Free workflow

The workflow assumes that conda is already installed. If it is not, follow the installation guide: https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html

  • Input:

    A metafile (.xlsx or .csv) listing the raw fastq.gz files, laid out as follows:

    sample, lane, fq1, fq2, type
    
    Sample1, lane1, S1_L001_R1_001.fastq.gz, S1_L001_R2_001.fastq.gz, 0
    Sample1, lane2, S1_L002_R1_001.fastq.gz, S1_L002_R2_001.fastq.gz, 0
    

    Each row describes one sequencing lane of a sample, with its FASTQ files given in fq1 and fq2 (R1 and R2 in the example above). Rows with the same sample identifier are treated as technical replicates and are merged automatically. type is the sample type (0 = buffy coat, 1 = plasma, 2 = tumor).

    • Reference genome

      Before starting, download a reference genome from NCBI, Ensembl, or another authoritative source, for example:

      wget https://ftp.ensembl.org/pub/release-100/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.toplevel.fa.gz
      
      • Index reference genome for bwa-mem2

        Index the reference genome with bwa-mem2 ahead of mapping. See the bwa-mem2 documentation for details.

        Example code:

        ./bwa-mem2 index <in.fasta>
        where <in.fasta> is the path to the reference sequence FASTA file.
        
  • Code:

    git clone https://github.com/mdelcorvo/DeSeq-Free.git
    cd DeSeq-Free && conda env create -f envs/workflow.yaml
    conda activate DeSeq-Free_workflow
    
    snakemake --use-conda \
    --config \
    input=inputfile.xlsx \
    output=output_directory \
    genome=genome.fasta
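
Before launching the workflow, it can help to check that the metafile follows the layout described under Input. The following is a minimal sketch that is not part of DeSeq-Free: the function name `read_metafile` and the constants are hypothetical, only the .csv variant is handled, and the workflow's own parsing may differ. It reads the five documented columns, checks that type is one of the documented codes, and groups lanes by sample, mirroring the rule that rows sharing a sample identifier are merged.

```python
import csv
import io
from collections import defaultdict

# Sample-type codes as documented in the Input section.
SAMPLE_TYPES = {0: "buffy coat", 1: "plasma", 2: "tumor"}
REQUIRED_COLUMNS = ["sample", "lane", "fq1", "fq2", "type"]


def read_metafile(handle):
    """Parse a CSV metafile and group its records by sample identifier.

    Hypothetical helper for illustration; not the workflow's own parser.
    """
    # Strip whitespace around every cell and skip completely empty lines.
    rows = [
        [cell.strip() for cell in row]
        for row in csv.reader(handle)
        if any(cell.strip() for cell in row)
    ]
    if not rows:
        raise ValueError("metafile is empty")

    header, records = rows[0], rows[1:]
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"metafile is missing columns: {missing}")

    samples = defaultdict(list)
    for index, cells in enumerate(records, start=1):
        if len(cells) != len(header):
            raise ValueError(
                f"record {index}: expected {len(header)} fields, got {len(cells)}"
            )
        record = dict(zip(header, cells))
        sample_type = int(record["type"])
        if sample_type not in SAMPLE_TYPES:
            raise ValueError(f"record {index}: unknown sample type {sample_type}")
        record["type"] = sample_type
        samples[record["sample"]].append(record)
    return dict(samples)


# The example metafile from the Input section.
EXAMPLE = """sample, lane, fq1, fq2, type

Sample1, lane1, S1_L001_R1_001.fastq.gz, S1_L001_R2_001.fastq.gz, 0
Sample1, lane2, S1_L002_R1_001.fastq.gz, S1_L002_R2_001.fastq.gz, 0
"""

if __name__ == "__main__":
    for sample, lanes in read_metafile(io.StringIO(EXAMPLE)).items():
        label = SAMPLE_TYPES[lanes[0]["type"]]
        print(f"{sample}: {len(lanes)} lane(s), type {label}")
```

To check a file on disk, pass an open handle instead, for example `read_metafile(open("inputfile.csv", newline=""))`, where `inputfile.csv` is a placeholder name.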
    

Output files

  • Somatic variant analysis
  • Variant allele frequency
  • Annotation of somatic variants
  • Somatic signatures
  • Analysis of somatic CNAs
  • Fragment size analysis
