02 Quantification

Ryan Schubert edited this page Oct 4, 2018 · 3 revisions

Quantification from salmon pseudoalignment

If you ran salmon to pseudoalign your fastq files in the previous step, you have already completed the steps needed to quantify your RNA-Seq data. Check that the salmon output files are not empty, and that the total_populations folder is not empty either. This folder is where the parsed and filtered salmon output goes. You may notice that some SNPs are missing. This is because parsing automatically removes SNPs that do not meet the mean and variance thresholds across samples.

To create population files for a smaller subset of your samples, run Salmon_parser.R again. The script automatically searches a directory and its subdirectories for all relevant quant files, so to run on a subset, first remove from that directory every sample you do not want included. See the salmon_loop page for more details and options.
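The emptiness checks described above can be scripted. The following is a minimal sketch, not part of the pipeline itself. It assumes each sample's quantification is stored in a file named quant.sf (salmon's standard output file name) somewhere below the quantification directory. The function names check_quant_files and dir_has_files are hypothetical helpers introduced here for illustration.

```shell
# Count quant.sf files below a directory and report any that are empty.
# Returns success only if at least one file was found and none are empty.
# Assumption: salmon output files are named quant.sf.
check_quant_files() {
    quant_dir="$1"
    total=$(find "$quant_dir" -type f -name 'quant.sf' | wc -l | tr -d ' ')
    empty=$(find "$quant_dir" -type f -name 'quant.sf' -size 0 | wc -l | tr -d ' ')
    # List each empty file so it can be re-run or investigated.
    find "$quant_dir" -type f -name 'quant.sf' -size 0 | sed 's/^/EMPTY: /'
    echo "checked $total quant.sf files, $empty empty"
    [ "$total" -gt 0 ] && [ "$empty" -eq 0 ]
}

# Succeed only if the directory exists and contains at least one regular file.
dir_has_files() {
    [ -n "$(find "$1" -type f 2>/dev/null | head -n 1)" ]
}
```

For example, check_quant_files $PATH/$TO/quantification/directory/ followed by dir_has_files total_populations would cover both checks, assuming total_populations sits in the current working directory. Adjust the paths to wherever your run actually wrote its output.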

Salmon_parser.R example

mkdir outputdir
Rscript Salmon_parser.R -q $PATH/$TO/quantification/directory/ -a $PATH/$TO/annotations.gencode.gtf -o outputdir
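To run the parser on a subset without deleting anything from the original quantification directory, one option is to copy only the wanted samples into a separate directory and point -q at that copy. This is a sketch of that alternative, not the documented workflow. It assumes each sample has its own subdirectory named after the sample, and that keep_file is a plain text file with one sample directory name per line. Both the layout and the helper name make_quant_subset are assumptions made for illustration.

```shell
# Copy the sample directories listed in keep_file (one name per line)
# from quant_dir into subset_dir, leaving quant_dir untouched.
# Assumption: each sample lives in quant_dir/<sample name>/.
make_quant_subset() {
    quant_dir="$1"
    keep_file="$2"
    subset_dir="$3"
    mkdir -p "$subset_dir"
    # The "|| [ -n ... ]" clause handles a final line without a trailing newline.
    while IFS= read -r sample || [ -n "$sample" ]; do
        [ -z "$sample" ] && continue   # skip blank lines
        if [ -d "$quant_dir/$sample" ]; then
            cp -R "$quant_dir/$sample" "$subset_dir/"
        else
            echo "WARNING: no directory found for sample $sample" >&2
        fi
    done < "$keep_file"
}
```

After running make_quant_subset with your quantification directory, a list such as keep_samples.txt, and a new directory such as subset_quant (both names are placeholders), create a fresh output directory and run Salmon_parser.R as in the example above with -q subset_quant/.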

Quantification from STAR or other alignment

If you already have alignment files (BAM/SAM), salmon can still quantify them, and it does so relatively quickly. The arguments are similar to those used when running salmon for pseudoalignment. The key difference is that instead of supplying the directory containing fastq files, you supply the directory containing the BAM files with the -b option. You also do not need to supply an index or run indexing for this step. See the salmon_loop page for the full list of options.

salmon_loop example

./salmon_loop -t $PATH/$TO/transcriptome.fa -b $PATH/$TO/BAMdirectory/ -a $PATH/$TO/annotations.gencode.gtf -s $PATH/$TO/sample_list.txt