Skip to content

avdanurag/variant-calling-bwa-gatk4

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 

Repository files navigation

Quality Filtration, Mapping, and Variant Calling Pipeline

This repository contains a set of shell scripts for performing quality filtration of Illumina raw reads, mapping with BWA, and variant calling with GATK HaplotypeCaller. The pipeline includes the following scripts:

Scripts

fastp.sh

Quality filtration of Illumina raw reads using the fastp tool.

Usage:

bash fastp.sh <input_dir> <output_dir>

bwa.sh

Mapping of quality-filtered reads to a reference genome using BWA.

Usage:

bash bwa.sh <reference_dir> <input_dir> <output_dir>

sortSAM.sh

Sorting and indexing SAM files using GATK's SortSam.

Usage:

bash sortSAM.sh <input_dir> <output_dir>

dedupBAM.sh

Deduplication of sorted BAM files using GATK's MarkDuplicates.

Usage:

bash dedupBAM.sh <input_dir> <output_dir>

haplotypecaller.sh Variant calling using GATK HaplotypeCaller and generating gVCF files.

Usage:

bash haplotypecaller.sh <reference_genome> <input_dir> <output_dir> <tmp_dir> <num_runs>

Usage

  • install fastp, bwa and gatk4

  • Clone this repository to your local machine.

  • Ensure you have the necessary dependencies installed (e.g., fastp, BWA, GATK).

  • Create a directory structure for your data, reference, and output.

  • Run each script with the required arguments according to their respective usage instructions.

Output

The output of each script includes quality-filtered reads, sorted and indexed BAM files, deduplicated BAM files, and gVCF files for variant calling.

About

Reads to variants workflow involving fastp-BWA-GATK

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages