rhysf/Biscap

2021 Note: Biscap_legacy.pl is described in Farrer et al 2013 Sci Rep.
It has NOT been actively developed, and other variant callers are
recommended. However, some of the utility scripts, including GBiD, IRMS,
and CFDR, are still useful for assessing variant caller quality. Furthermore,
the code for Biscap has been tidied up, and may be developed and tested
in the future.

NAME
      
       Binomial SNP Caller from Pileup and Comparison of false discovery rates

SYNOPSIS

       GBiD.pl -x 200 -p 0.5 
       
       IRMS.pl -g ref.fasta -n 10000 

       BiSCaP.pl -p aln.pileup -r ref.fasta 

       CFDR.pl -i IntroducedMutations -m FoundMutations -s SAMfile -p Pileup
       

DESCRIPTION

       This is a collection of scripts that aim to perform variant calling and
       assess the quality of alignment and variant calling. Introduce Random
       Mutations into a Sequence (IRMS) simulates any number of random single
       nucleotide polymorphisms or indels, which are placed randomly within the
       genome or randomly within a specific feature type (specified by a feature
       file). After aligning reads to the modified fasta sequence generated by
       IRMS and calling SNPs with any method that reports its final calls in
       Variant Call Format (VCF), Comparison of False Discovery Rates (CFDR)
       can be used to assess the most suitable method of alignment and
       SNP calling.
       
       Binomial SNP Caller from Pileup (BiSCaP) is a method of calling both 
       homozygous and heterozygous SNPs and Indels using a look up table of 
       cumulative binomial probabilities.


COMMANDS AND OPTIONS

       GBiD.pl    perl GBiD.pl -x MaxDepth -p ProbOfError
       
                  If a binomial look-up table is wanted for an error
                  probability other than the 0.1 and 0.01 provided, this
                  script will produce those probabilities using R. Requires
                  Statistics::R, and has needed to be run in steps
                  (e.g. -x 50 -> -x 100 -> -x 150 etc.).
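
To make the idea of a binomial look-up table concrete, here is a minimal Python sketch. It is not GBiD.pl itself (which uses R through Statistics::R). The function name cumulative_binomial_table and the choice to store P(X <= k) are assumptions made for illustration; the layout of the table GBiD.pl actually writes may differ.

```python
from math import comb


def cumulative_binomial_table(max_depth, p_error):
    """Hypothetical helper: table[n][k] = P(X <= k) for X ~ Binomial(n, p_error).

    n is the read depth (1..max_depth) and k is the number of reads
    showing a given base at that depth.
    """
    table = {}
    for n in range(1, max_depth + 1):
        running = 0.0
        row = []
        for k in range(n + 1):
            # Add the probability of exactly k successes in n trials.
            running += comb(n, k) * p_error**k * (1 - p_error) ** (n - k)
            # Guard against floating-point drift above 1.
            row.append(min(running, 1.0))
        table[n] = row
    return table


# Mirrors the -x 200 -p 0.5 values from the SYNOPSIS.
table = cumulative_binomial_table(max_depth=200, p_error=0.5)
print(table[4][2])  # 0.6875
```

Precomputing the rows once means a caller only needs an index by depth and count for each site, which is presumably why a table is shipped for the common error rates.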

                  
       IRMS.pl    perl IRMS.pl -g ref.fasta -n NumOfMutations <optional params>
       
                  Introducing Random Mutations into a sequence, which can then
                  be used to assess the best alignment and SNP calling params.
                  CNV duplicates random genes. HETTRIP is currently untested.
                 
                  OPTIONS:
                 
                  -t      Type of mutation to introduce (SNP/DEL/INS/HET/CNV/HETTRIP) 
                          INS/DEL not supported when using -c/-f [SNP]
                         
                  -c      A feature file (GFF/GTF) that specifies where the 
                          mutations are placed in the genome. [anywhere]
                         
                  -f      Feature in the GFF/GTF to mutate (CDS, exon, mRNA etc.)
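
The SNP case of IRMS can be illustrated with a short Python sketch. This is an assumption-laden illustration, not the IRMS.pl implementation: the function introduce_snps, its seed parameter, and the (position, ref, alt) record layout are all hypothetical, and indels, heterozygous sites, CNVs and feature files are left out.

```python
import random


def introduce_snps(sequence, n_mutations, seed=None):
    """Hypothetical sketch: place n_mutations random SNPs in a sequence.

    Returns the mutated sequence and a list of (1-based position, ref, alt)
    records describing each introduced change.
    """
    rng = random.Random(seed)
    bases = "ACGT"
    seq = list(sequence.upper())
    # Only unambiguous bases are eligible, so runs of N are never mutated.
    candidates = [i for i, base in enumerate(seq) if base in bases]
    if n_mutations > len(candidates):
        raise ValueError("more mutations requested than mutable positions")
    details = []
    for pos in sorted(rng.sample(candidates, n_mutations)):
        ref = seq[pos]
        # Choose an alternative base that differs from the reference.
        alt = rng.choice([b for b in bases if b != ref])
        seq[pos] = alt
        details.append((pos + 1, ref, alt))
    return "".join(seq), details


original = "ACGTACGTACGTACGTACGT"
mutated, details = introduce_snps(original, 5, seed=1)
for position, ref, alt in details:
    print(position, ref, alt)
```

Restricting mutations to a feature type, as -c and -f do, would amount to building the candidate list only from positions covered by the chosen features.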


        Biscap.pl perl Biscap.pl -p aln.pileup -r ref.fasta <optional params>
        
                  Binomial SNP Caller from Pileup uses a look-up table of
                  binomial probabilities to calculate the consensus sequence
                  from a pileup. The default output of the program is a file
                  named <pileup>-All-mutations-Samtools_format-<settings>.tab.
                  Columns describe in order: contig, position, reference base,
                  consensus base, average mapping quality, average base quality,
                  maximum mapping quality, depth, aligned bases and read
                  qualities. A separate file lists the positions in the genome
                  that fall outside the binomial distribution look-up table
                  and have not been categorised, a summary of the mutations
                  found, and a tally of the different read depths across the
                  alignment.
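
For downstream scripting, the ten columns listed above could be read with something like the Python sketch below. It assumes the file is tab-delimited with exactly those ten columns in that order, that the seventh column is a maximum mapping quality, and that the quality columns are numeric. The class name, field names and the example line are invented for illustration and are not taken from real Biscap output.

```python
from typing import NamedTuple


class PileupCall(NamedTuple):
    # Field names are hypothetical labels for the ten described columns.
    contig: str
    position: int
    ref_base: str
    consensus_base: str
    avg_mapping_quality: float
    avg_base_quality: float
    max_mapping_quality: float
    depth: int
    aligned_bases: str
    read_qualities: str


def parse_call_line(line):
    """Split one tab-delimited line into a typed PileupCall record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 10:
        raise ValueError(f"expected 10 columns, found {len(fields)}")
    return PileupCall(
        fields[0], int(fields[1]), fields[2], fields[3],
        float(fields[4]), float(fields[5]), float(fields[6]),
        int(fields[7]), fields[8], fields[9],
    )


# A made-up line used only to exercise the parser.
example = "contig1\t1042\tA\tG\t58\t35\t60\t12\tGGGGGGGGGG.,\tIIIIIIIIIIII"
call = parse_call_line(example)
print(call.contig, call.position, call.ref_base, call.consensus_base, call.depth)
```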
          
                  OPTIONS:
                  
                  -m      Minimum read depth to be considered a mutation [4]
                  
                  -e      Probability of error [0.1] (the accuracy of using the
                          non-default error rate of 0.01 is currently untested)
                  
                  -g      If depth > max depth in look-up table, analyse up to 
                          max depth (y/n) [y]
                  
                  -q      Read Quality minimum cut-off for SNPs (e.g. 10, 20...) 
                          [0]
                          
                  -n      Sample name for VCF [WGS]

        CFDR.pl   perl CFDR.pl -i IntroducedMutations -m FoundMutations 
                  -p Pileup <optional params>
                          
                  Comparison of False Discovery Rates (CFDR) should be run on
                  an alignment to the fasta file output by IRMS, after SNPs
                  have been called on that alignment (optionally by BiSCaP).
                  
                  OPTIONS:
                  
                  -i      Full list of introduced mutations (details output of 
                          IRMS)
                          
                  -m      SNP-calls in Variant Call Format (VCF)
                  
                  -s      SAM file of alignment
                  
                  -p      Pileup of alignment
                  
                  -c      GFF/GTF (requires -s)
                  
                  -f      Feature in the GFF/GTF to CFDR (CDS, exon, mRNA etc.)
                  
                  -o      Output folder location. [folder of found mutations]
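
The core comparison behind a false discovery rate can be sketched in a few lines of Python. This is only an illustration of the general calculation, not the logic of CFDR.pl: the function compare_calls and the (contig, position, alt) keys are hypothetical, and the sketch ignores the pileup, SAM and feature inputs that the real script accepts.

```python
def compare_calls(introduced, found):
    """Hypothetical sketch: compare introduced mutations with called ones.

    Both arguments are iterables of (contig, position, alt_base) tuples.
    """
    introduced = set(introduced)
    found = set(found)
    true_pos = introduced & found    # introduced and called
    false_pos = found - introduced   # called but never introduced
    false_neg = introduced - found   # introduced but missed
    n_called = len(true_pos) + len(false_pos)
    return {
        "true_positives": len(true_pos),
        "false_positives": len(false_pos),
        "false_negatives": len(false_neg),
        # FDR is the fraction of calls that are wrong.
        "fdr": len(false_pos) / n_called if n_called else 0.0,
        # Sensitivity is the fraction of introduced mutations recovered.
        "sensitivity": len(true_pos) / len(introduced) if introduced else 0.0,
    }


introduced = [("c1", 10, "A"), ("c1", 50, "T"), ("c2", 7, "G")]
found = [("c1", 10, "A"), ("c2", 7, "G"), ("c2", 99, "C")]
stats = compare_calls(introduced, found)
print(stats)
```

Running the same comparison for several aligner and caller settings gives a simple basis for choosing between them.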
                  
RELEASES

v0.12

* Code tidy up and refinement.

v0.111 (2013)

* Fixed error printing homozygous agree lines
* Fixed error finding long indels when depth > max depth and optg is used.

v0.11 (9 October 2012)

* Massive update. Single subroutine now calls probabilities and tests for 
  triallelic positions.

* CFDR can now introduce random heterozygous mutations (either bi- or 
  tri-alleles) for testing with simulated reads. 
  
* test.pileup and test.reference.fasta are included for testing Biscap. VCFs 
  from Biscap can then be checked with check_test_VCF.pl. 
  
v0.1 (14 February 2012)

* First upload.
