amyguo1997/dynamic_protein_design

Deep learning guided design of dynamic proteins

Code accompanying "Deep learning guided design of dynamic proteins" by Amy B. Guo, Deniz Akpinaroglu, Mark J.S. Kelly, and Tanja Kortemme.

Data, scripts, and molecular dynamics trajectories can be downloaded here: https://doi.org/10.5061/dryad.m37pvmdbm.

Design approach

Generating non-native alternative states (single-state design)

Given a starting functional backbone (state 1), we first generate alternative conformations (state 2) using loop-helix-loop unit combinatorial sampling. Detailed documentation on this Python-based method can be found at: https://github.com/Kortemme-Lab/loop_helix_loop_reshaping. Job scripts, analysis scripts, and inputs specific to this study can be found in state_2_generation/loop_helix_loop_reshaping.

To evaluate the designability of each backbone, we then perform single-state design on each generated backbone. Scripts for (1) mutating the residue positions in the alternative state that are essential for forming the functional motif (Ca2+ binding site), (2) creating a design information file specifying designable/repackable residues, and (3) running Rosetta LayerDesign can be found in state_2_generation/single_state_design. The lowest-energy design is then evaluated using Rosetta ab initio structure prediction. Scripts for fragment generation and biased forward folding can be found in state_2_generation/rosetta_abinitio.
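As a rough illustration of what "specifying designable/repackable residues" can look like, here is a minimal sketch that writes the two residue sets in Rosetta resfile syntax. This is an assumption made for illustration only: the format of the design information file used by the scripts in state_2_generation/single_state_design is not described here and may differ. The function name `build_resfile` and the default chain `A` are hypothetical.

```python
def build_resfile(designable, repackable, chain="A"):
    """Return resfile text for the given 1-indexed residue positions.

    Hypothetical helper, not part of this repository.
    NATRO (header default): all unlisted residues keep their native rotamer.
    ALLAA: the position may be redesigned to any amino acid.
    NATAA: the position keeps its native amino acid but may be repacked.
    """
    designable = set(designable)
    lines = ["NATRO", "start"]
    for pos in sorted(designable):
        lines.append(f"{pos} {chain} ALLAA")
    # A position listed in both sets is treated as designable only.
    for pos in sorted(set(repackable) - designable):
        lines.append(f"{pos} {chain} NATAA")
    return "\n".join(lines) + "\n"
```

Calling `build_resfile([12, 15], [11, 13])` would return text that can be written to a file and passed to a Rosetta protocol that accepts resfiles.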

Deep-learning guided multi-state design

A high-sequence-identity design predicted to fold into the state 2 backbone was identified by using ColabFold to evaluate in silico mutations that increase sequence identity to state 1. A helper script for generating fasta files (as input for ColabFold) can be found in multi_state_design/ColabFold. Designable residues sampled during multi-state design were restricted to positions whose amino acids differ between the two states, plus the neighbors of those positions. The job submission script used in this study is included in multi_state_design/ProteinMPNN_scripts. The resulting designs were evaluated by ColabFold. Scripts for evaluation can be found in multi_state_design/ColabFold.
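The restriction of designable residues described above can be sketched as follows. This is a minimal sketch, not the script used in this study: the function names, the use of C-alpha coordinates, and the 8 Å cutoff are all assumptions, since the neighbor definition is not stated here.

```python
import math


def differing_positions(seq_a, seq_b):
    """Return 1-indexed positions where two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def with_neighbors(positions, ca_coords, cutoff=8.0):
    """Add residues whose C-alpha lies within `cutoff` angstroms of any seed.

    `ca_coords` is a list of (x, y, z) tuples, one per residue, in sequence
    order. The cutoff value is an assumed placeholder.
    """
    seeds = set(positions)
    selected = set(seeds)
    for i, xyz in enumerate(ca_coords, start=1):
        if any(math.dist(xyz, ca_coords[s - 1]) <= cutoff for s in seeds):
            selected.add(i)
    return sorted(selected)
```

Because the two states have different backbones, one option is to run `with_neighbors` once per state and take the union of the results. Positions are 1-indexed from the first residue, matching the Rosetta/ProteinMPNN convention noted at the end of this README.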

Analysis of dynamic designs

Dynamic designs were characterized using nuclear magnetic resonance (NMR) and molecular dynamics simulations. Scripts for analyzing NOESY-derived distance restraints and approximating the change in local chemical environment at each residue position can be found in data_analysis/NMR. Detailed documentation on mutual information analysis can be found at https://github.com/stefdoerr/mutinf and https://simtk.org/projects/mutinf.

Note: The PDB files available in our Dryad repository have been renumbered to be consistent with our experimentally solved structures deposited in the PDB (i.e., indexed starting from 1, including the 4-residue N-terminal thrombin cleavage site scar; if the scar is not modeled explicitly, the numbering begins from 5). However, the single-state and multi-state design scripts using Rosetta/ProteinMPNN index the first residue of the PDB file as position one, regardless of the residue number assigned in the PDB file itself, as is standard for Rosetta/ProteinMPNN software.
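To convert between the two numbering schemes, a small lookup table is enough. The sketch below is illustrative and not part of this repository; the function names are hypothetical. It reads residue numbers from the fixed columns of ATOM records and ignores insertion codes.

```python
def residue_numbers_from_pdb_lines(lines, chain=None):
    """Collect residue numbers from ATOM records, in file order.

    Uses the fixed-width PDB columns: chain ID in column 22 and residue
    number in columns 23-26. Insertion codes (column 27) are ignored.
    """
    seen = []
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        if chain is not None and line[21] != chain:
            continue
        resnum = int(line[22:26])
        # Consecutive atoms of one residue share a number; record it once.
        if not seen or seen[-1] != resnum:
            seen.append(resnum)
    return seen


def pdb_to_positional(pdb_resnums):
    """Map each PDB residue number to its 1-indexed position in the file."""
    return {resnum: idx for idx, resnum in enumerate(pdb_resnums, start=1)}
```

For a file in which the scar is not modeled, the first residue is numbered 5 in the PDB file, and `pdb_to_positional` maps it to position 1 for use with the design scripts.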
