Saturation Mutagenesis-Reinforced Functional Assays (SMuRF) is an accessible workflow for deep mutational scanning (DMS) studies. The manuscript is currently available on BioRxiv. This workflow involves a 2-way extension PCR-based saturation mutagenesis and high-throughput functional assays that can be evaluated with short-read Next-generation sequencing.
We have organized scripts used for SMuRF across three repositories:
- Balthazar (this repository) is used to create the oligo library needed for the saturation mutagenesis as well as perform QC for the resulting construct.
- Gagarmel repository is used to process the raw data from NGS to quantify the enrichment of the variants.
- Azrael repository is used to generate and analyze the functional scores, and plot the results.
- Python (tested on v2.7.16 or v3.6)
- BioPython (tested on v1.78)
- R (tested on v4.1.2)
- BWA (tested on v0.7.15)
- SAMtools
- FASTX-Toolkit (tested on v0.0.14)
- picard 2.9.0 (tested on v0.7.17)
ggplot2
- Fasta file containing coding sequence of gene of interest
- Clinvar data file subsetted to gene of interest
Files associated with this example are in all_possible_SNVs/fkrp
# Generate a file containing all possible SNVs
python mutation_classfication.py fkrp_1to1485.fasta all_possible_SNVs
# Add ClinVar annotation to this file
python add_ClinVar_to_allpossibleSNVs_FKRP.py all_possible_SNVs clinvar_result_FKRP_04202023.txt fkrp_1to1485.fasta
all_possible_SNVs_with_clinvar_FKRP_04202023
# Create a barplot in R
Rscript --vanilla all_possible_SNVs_plot.r all_possible_SNVs
Files associated with this example are in all_possible_SNVs/large1
# Generate a file containing all possible SNVs
python mutation_classfication.py large1_1to2268.fasta all_possible_SNVs
# Add ClinVar annotation to this file
python add_ClinVar_to_allpossibleSNVs_LARGE1.py all_possible_SNVs clinvar_result_04202023_LARGE1.txt large1_1to2268.fasta all_possible_SNVs_with_clinvar_LARGE1_04202023
# Create a barplot in R
Rscript --vanilla all_possible_SNVs_plot.r all_possible_SNVs
Please find the scripts and the readme.md in the cloning alert directory
# all_possible_SNVs_test.txt were generated with SNV_mutation_classification.py.
# all_possible_NNKmut_test.txt were generated with NNK_mutation_classification.py.
python SNV_mutation_classification.py test_input.fa all_possible_SNVs_test.txt 7311 8796
python NNK_mutation_classification.py test_input.fa all_possible_NNKmut_test.txt 7311 8796
This package identifies two groups of variants that might be underrepresented in the pool due to the limitation of the cloning method.
group1: will introduce a cut before the variants which will remove the variant from the top strand
group2: will introduce a cut right after the variants which might affect the elongation of the top strand
readme.md in the cloning alert directory
readme.md in the oligo generator directory
The jobs were run on Yale HPC Ruddle (RIP)
# FKRP example
Sbatch QC_FKRP.slurm
# LARGE1 example
Sbatch QC_LARGE1.slurm