Binomial SNP-caller from Pileup (Biscap) and comparison of FDR scripts
rhysf/Biscap
2021 Note: Biscap_legacy.pl is described in Farrer et al. 2013, Sci Rep. It has NOT been actively developed, and other variant callers are recommended. However, some of the utility scripts, including GBiD, IRMS, and CFDR, are still useful for assessing variant-caller quality. Furthermore, the code for Biscap has been tidied up and may be developed and tested in the future.

NAME

Binomial SNP Caller from Pileup and Comparison of false discovery rates

SYNOPSIS

GBiD.pl -x 200 -p 0.5
IRMS.pl -g ref.fasta -n 10000
BiSCaP.pl -p aln.pileup -r ref.fasta
CFDR.pl -i IntroducedMutations -m FoundMutations -s SAMfile -p Pileup

DESCRIPTION

This is a collection of scripts that aim to perform variant calling and to assess the quality of alignment and variant calling.

Introduce Random Mutations into a Sequence (IRMS) simulates any number of random single nucleotide polymorphisms or indels, which are placed randomly within the genome or randomly within a specific feature type (specified by a feature file).

After aligning reads to the modified FASTA sequence generated by IRMS and calling SNPs with any method that reports its final calls in Variant Call Format (VCF), Comparison of False Discovery Rates (CFDR) can then be used to assess the most suitable method of alignment and SNP calling.

Binomial SNP Caller from Pileup (BiSCaP) is a method of calling both homozygous and heterozygous SNPs and indels using a look-up table of cumulative binomial probabilities.

COMMANDS AND OPTIONS

GBiD.pl

perl GBiD.pl -x MaxDepth -p ProbOfError

If a binomial look-up table is wanted for an error probability other than the 0.1 and 0.01 provided, this script will produce those probabilities using R. It requires Statistics::R and has needed to be run in steps (e.g. -x 50 -> -x 100 -> -x 150, etc.).

IRMS.pl

perl IRMS.pl -g ref.fasta -n NumOfMutations <optional params>

Introduces random mutations into a sequence, which can then be used to assess the best alignment and SNP-calling parameters. CNV duplicates random genes. HETTRIP is currently untested.
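The idea behind IRMS, placing a known number of random SNPs into a reference and recording them for later comparison, can be sketched in a few lines. This is a minimal Python illustration, not the Perl implementation: the function name `introduce_snps`, the `seed` argument, and the tuple layout of the recorded mutations are all assumptions made for this example, and indels, feature files, and heterozygous mutations are left out.

```python
import random


def introduce_snps(seq, n, seed=None):
    """Return (mutated_seq, mutations) with n SNPs at distinct random positions.

    Hypothetical helper for illustration only; it does not mirror IRMS.pl's
    actual options or output format.
    """
    rng = random.Random(seed)
    bases = "ACGT"
    # Only positions holding an unambiguous base are eligible for mutation.
    candidates = [i for i, b in enumerate(seq) if b.upper() in bases]
    if n > len(candidates):
        raise ValueError("more mutations requested than mutable positions")

    mutated = list(seq)
    mutations = []
    for pos in sorted(rng.sample(candidates, n)):
        ref = seq[pos].upper()
        # Pick a replacement base that differs from the reference base.
        alt = rng.choice([b for b in bases if b != ref])
        mutated[pos] = alt
        mutations.append((pos + 1, ref, alt))  # positions reported 1-based
    return "".join(mutated), mutations


reference = "ACGTACGTACGTACGTACGT"
mutated, introduced = introduce_snps(reference, 3, seed=1)
for position, ref_base, alt_base in introduced:
    print(position, ref_base, alt_base)
```

Recording each introduced mutation alongside the modified sequence is what makes a later comparison against called variants possible.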
OPTIONS:
-t  Type of mutation to introduce (SNP/DEL/INS/HET/CNV/HETTRIP). INS/DEL not supported when using -c/-f [SNP]
-c  A feature file (GFF/GTF) that specifies where the mutations are placed in the genome [anywhere]
-f  Feature in the GFF/GTF to mutate (CDS, exon, mRNA etc.)

Biscap.pl

perl Biscap.pl -p aln.pileup -r ref.fasta <optional params>

Binomial SNP Caller from Pileup uses a look-up table of binomial probabilities to calculate the consensus sequence from a pileup. The default output of the program is a file named <pileup>-All-mutations-Samtools_format-<settings>.tab. Its columns describe, in order: contig, position, reference base, consensus base, average mapping quality, average base quality, maximum mapping, depth, aligned bases and read qualities. A separate file contains positions in the genome that fall outside the binomial distribution look-up table and so have not been categorised, a summary of the mutations found, and a tally of the different read depths throughout the alignment.

OPTIONS:
-m  Minimum read depth to be considered a mutation [4]
-e  Probability of error [0.1] (the accuracy of using the non-default error rate of 0.01 is currently untested)
-g  If depth > max depth in look-up table, analyse up to max depth (y/n) [y]
-q  Read quality minimum cut-off for SNPs (e.g. 10, 20...) [0]
-n  Sample name for VCF [WGS]

CFDR.pl

perl CFDR.pl -i IntroducedMutations -m FoundMutations -p Pileup <optional params>

Comparison of False Discovery Rates (CFDR) should be used on an alignment to the output FASTA file of IRMS that has had SNPs called (optionally by BiSCaP).

OPTIONS:
-i  Full list of introduced mutations (details output of IRMS)
-m  SNP calls in Variant Call Format (VCF)
-s  SAM file of alignment
-p  Pileup of alignment
-c  GFF/GTF (requires -s)
-f  Feature in the GFF/GTF to CFDR (CDS, exon, mRNA etc.)
-o  Output folder location [folder of found mutations]

RELEASES

v0.12
* Code tidy up and refinement.
v0.111 (2013)
* Fixed error printing homozygous agree lines.
* Fixed error finding long indels when depth > max depth and optg is used.

v0.11 (9 October 2012)
* Massive update. Single subroutine now calls probabilities and tests for triallelic positions.
* CFDR can now introduce random heterozygous mutations (either bi- or tri-alleles) for testing with simulated reads.
* test.pileup and test.reference.fasta are included for testing Biscap. VCFs from Biscap can then be checked with check_test_VCF.pl.

v0.1 (14 February 2012)
* First upload.
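As an illustration of the look-up table of cumulative binomial probabilities mentioned for GBiD.pl and BiSCaP, the sketch below tabulates, for each read depth up to a maximum, the probability of seeing at least k erroneous reads at a given per-read error probability. It is written in Python rather than via Statistics::R, the names `binom_upper_tail` and `build_lookup` are hypothetical, and how BiSCaP actually indexes and thresholds its table is not described here, so treat this only as a picture of the underlying quantity.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def build_lookup(max_depth, p_error):
    """Map depth -> list where entry k is P(at least k errors in depth reads).

    Assumed layout for illustration; not the format written by GBiD.pl.
    """
    return {
        depth: [binom_upper_tail(k, depth, p_error) for k in range(depth + 1)]
        for depth in range(1, max_depth + 1)
    }


table = build_lookup(20, 0.1)
# Probability that 4 or more of 10 reads disagree purely through error.
print(table[10][4])
```

A small tail probability for the observed number of non-reference reads suggests that sequencing error alone is an unlikely explanation at that depth.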
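The comparison that CFDR is described as performing, introduced mutations against called mutations, can be sketched as a set comparison. This Python example assumes each mutation is keyed by (contig, position, reference base, alternate base) and uses the common definition FDR = false positives / all calls; the function `compare_calls` and its output fields are hypothetical and are not CFDR.pl's actual report.

```python
def compare_calls(introduced, found):
    """Summarise agreement between introduced and called mutations.

    Both arguments are iterables of (contig, position, ref, alt) tuples.
    """
    introduced, found = set(introduced), set(found)
    true_pos = len(introduced & found)   # introduced and called
    false_pos = len(found - introduced)  # called but never introduced
    false_neg = len(introduced - found)  # introduced but missed
    calls = true_pos + false_pos
    truth = true_pos + false_neg
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "fdr": false_pos / calls if calls else 0.0,
        "recall": true_pos / truth if truth else 0.0,
    }


introduced = [("chr1", 10, "A", "G"), ("chr1", 50, "C", "T"), ("chr2", 7, "G", "A")]
found = [("chr1", 10, "A", "G"), ("chr2", 7, "G", "A"), ("chr2", 99, "T", "C")]
summary = compare_calls(introduced, found)
print(summary)
```

Running the same comparison across several aligner and caller settings gives a like-for-like basis for choosing between them.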