PROSIC has been succeeded by Varlociraptor. Please use Varlociraptor instead.
PROSIC is a caller for somatic variants in tumor-normal sample pairs, sequenced with any next-generation sequencing technology. It provides a novel latent variable model that integrates various levels of uncertainty and thereby makes it possible to properly assess the probability of a somatic variant while controlling the false discovery rate (FDR).
PROSIC is available via Bioconda, a distribution of bioinformatics software for the conda package manager. Bioconda can be set up in any Linux environment, even without admin rights. With Bioconda set up, PROSIC can be installed via
$ conda install prosic
The purpose of PROSIC is to call somatic insertions and deletions (indels) on tumor/normal sample pairs. For this, PROSIC requires a VCF file with preliminary indel calls, e.g. obtained with Delly or Lancet. Then, calling with PROSIC consists of two steps.
Step 1: Variants are called by applying PROSIC to the preliminary calls, i.e.
$ prosic call-tumor-normal --flat-priors tumor.bam normal.bam < pre-calls.vcf > prosic-calls.bcf
PROSIC then annotates the initial calls with probabilities for the events somatic, germline and absent (PROB_SOMATIC, PROB_GERMLINE, PROB_ABSENT).
Run prosic call-tumor-normal --help for information about additional parameters.
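To make the annotated probabilities concrete, the following minimal Python sketch picks the most likely event for a single record. It assumes that the PROB_* values are PHRED-scaled (that is, -10 * log10(p)); this is an assumption, so check the header of prosic-calls.bcf before relying on it. The INFO values and the helper names phred_to_prob and most_likely_event are hypothetical and only serve as an illustration.

```python
def phred_to_prob(phred: float) -> float:
    """Convert a PHRED-scaled value (-10 * log10(p)) to a linear probability."""
    return 10.0 ** (-phred / 10.0)


def most_likely_event(info: dict) -> tuple:
    """Return (field name, probability) of the most probable PROB_* event.

    ASSUMPTION: the PROB_* fields are PHRED-scaled; verify this in the BCF header.
    """
    probs = {
        key: phred_to_prob(value)
        for key, value in info.items()
        if key.startswith("PROB_")
    }
    event = max(probs, key=probs.get)
    return event, probs[event]


# Hypothetical INFO values for one record (not real PROSIC output).
record_info = {"PROB_SOMATIC": 0.5, "PROB_GERMLINE": 13.0, "PROB_ABSENT": 20.0}
event, prob = most_likely_event(record_info)
print(event, round(prob, 3))
```

In practice the INFO dictionary would come from a BCF reader instead of a literal; the conversion itself stays the same.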
Step 2: To control the FDR, you first have to create a null model by swapping the tumor and normal BAM files.
For a general-purpose caller like Delly, you can reuse the same VCF of preliminary calls. For callers like Lancet, you have to create a VCF with swapped samples.
Then, apply prosic call-tumor-normal (Step 1) with the tumor and normal BAM files swapped, i.e.
$ prosic call-tumor-normal --flat-priors normal.bam tumor.bam < null-pre-calls.vcf > null-calls.bcf
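The only difference between the real call and the null-model call is the order of the two BAM files (and, for some callers, the VCF of preliminary calls). The following Python sketch makes that explicit by building both command lines from one function. The helper names call_tumor_normal_argv and run_call are hypothetical; the command, flag and file names are the ones used above.

```python
import subprocess


def call_tumor_normal_argv(tumor_bam: str, normal_bam: str) -> list:
    """Build the argument list for `prosic call-tumor-normal` as used in this README."""
    return ["prosic", "call-tumor-normal", "--flat-priors", tumor_bam, normal_bam]


def run_call(argv: list, pre_calls_vcf: str, out_bcf: str) -> None:
    """Run PROSIC, feeding the preliminary calls on stdin and writing BCF from stdout."""
    with open(pre_calls_vcf, "rb") as vcf_in, open(out_bcf, "wb") as bcf_out:
        subprocess.run(argv, stdin=vcf_in, stdout=bcf_out, check=True)


# Real calls: tumor first, normal second.
real_argv = call_tumor_normal_argv("tumor.bam", "normal.bam")
# Null model: same command with the two BAM files swapped.
null_argv = call_tumor_normal_argv("normal.bam", "tumor.bam")

# Example invocation (requires PROSIC to be installed and the files to exist):
# run_call(real_argv, "pre-calls.vcf", "prosic-calls.bcf")
# run_call(null_argv, "null-pre-calls.vcf", "null-calls.bcf")
```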
Finally, the FDR (here for somatic deletions) can be controlled by
$ prosic control-fdr --event SOMATIC --var DEL < null-calls.bcf > thresholds.tsv
The resulting tab-separated table thresholds.tsv contains thresholds that can be applied to the corresponding PROB_SOMATIC field in prosic-calls.bcf in order to control the FDR at different levels.
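As an illustration of applying such a threshold, here is a minimal Python sketch. It rests on several assumptions that are not confirmed by this README: the column names alpha (FDR level) and gamma (threshold) are hypothetical, the thresholds are taken to be minimum linear probabilities, and the calls are represented as plain (identifier, probability) pairs rather than BCF records. If the values in your files are PHRED-scaled, convert them first and note that the direction of the comparison changes.

```python
import csv
import io


def load_thresholds(tsv_text: str) -> dict:
    """Parse a tab-separated table into {FDR level: threshold}.

    ASSUMPTION: the columns are named 'alpha' and 'gamma'; adapt the names
    to the actual header of thresholds.tsv.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {float(row["alpha"]): float(row["gamma"]) for row in reader}


def filter_calls(calls: list, threshold: float) -> list:
    """Keep calls whose (linear) somatic probability reaches the threshold."""
    return [call for call in calls if call[1] >= threshold]


# Hypothetical table contents and calls, for illustration only.
thresholds = load_thresholds("alpha\tgamma\n0.01\t0.99\n0.05\t0.95\n")
calls = [("del1", 0.999), ("del2", 0.96), ("del3", 0.5)]

kept = filter_calls(calls, thresholds[0.05])
print([name for name, _ in kept])
```

A stricter FDR level selects a higher threshold from the table and therefore keeps fewer calls.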
- Original model: Louis Dijkstra
- Extended model and implementation: Johannes Köster