- Clone the repository:
git clone https://github.com/Simran-Sodhi/drug-design-dynamics.git
PLAS20K/
: Contains datasets related to the PLAS20K project
Rshiny/
: Scripts and resources for the R Shiny application
SPICE/
: Data and parsing scripts pertaining to the SPICE project (we are not using this data anymore)
machineLearningMethods/
: Machine learning models and training scripts
metropolisMethod/
: Implementation of the Metropolis algorithm
- Use PDBnames.go to extract the pdb_id of every protein-ligand complex used in PLAS20K (stored in PLAS20K_pdb_ids.txt).
- Use batch_download.sh to download all the PDB files from RCSB.
- Use splitPDB.go to separate proteins and ligands (it outputs two PDB files, one for the protein and one for the ligand).
- Use convert_pdb_to_mol2.sh (which calls Open Babel) to convert all PDB files to mol2 files.
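As a rough illustration of the splitting step above, the sketch below separates protein and ligand records in Python by PDB record type. It is not the logic of splitPDB.go: the function name `split_pdb`, the rule "ATOM records are protein, non-water HETATM records are ligand", and the sample lines are all assumptions made for this example.

```python
def split_pdb(pdb_text):
    """Split PDB-format text into protein and ligand lines.

    Assumption for this sketch: ATOM records belong to the protein and
    HETATM records (except water, residue name HOH) belong to the ligand.
    """
    protein, ligand = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # record name occupies columns 1-6
        if record == "ATOM":
            protein.append(line)
        elif record == "HETATM":
            # Residue name occupies columns 18-20 (0-based slice 17:20).
            if line[17:20].strip() != "HOH":
                ligand.append(line)
    return protein, ligand


# Hypothetical three-record input: one protein atom, one ligand atom, one water.
SAMPLE = "\n".join([
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "HETATM  500  C1  LIG A 201      10.000  11.000  12.000  1.00 20.00           C",
    "HETATM  600  O   HOH A 301       5.000   6.000   7.000  1.00 30.00           O",
])

protein_lines, ligand_lines = split_pdb(SAMPLE)
```

Real structures need more care than this rule provides: modified residues are often stored as HETATM records, and ions or cofactors may need to be kept with the protein rather than the ligand.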
- Use metropolisMethod/main.go to run the Metropolis simulation.
- Provide the input data in metropolisMethod/Data; some sample data is already present there.
- main.go offers three options: RunMultipleLigands() simulates multiple ligands, TestMethodRMSD() computes RMSD values, and RShinyAppMain(args []string) is the entry point used by the R Shiny app.
- All outputs are written to the metropolisMethod/Output folder.
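For readers unfamiliar with the Metropolis criterion, here is a minimal Python sketch of the general technique on a toy one-dimensional energy. It does not reproduce the Go code in metropolisMethod/: the function names, the toy energy x², the proposal step size, and the temperature are all assumptions chosen for illustration.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng=random):
    """Metropolis criterion: always accept a move that lowers the energy;
    accept an uphill move with probability exp(-delta_e / temperature)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def metropolis_minimize(energy, propose, state, steps, temperature, seed=0):
    """Run a Metropolis walk and return the lowest-energy state visited."""
    rng = random.Random(seed)
    current_e = energy(state)
    best_state, best_e = state, current_e
    for _ in range(steps):
        candidate = propose(state, rng)
        candidate_e = energy(candidate)
        if metropolis_accept(candidate_e - current_e, temperature, rng):
            state, current_e = candidate, candidate_e
            if current_e < best_e:
                best_state, best_e = state, current_e
    return best_state, best_e


# Toy problem (hypothetical): minimise x^2 starting from x = 5.
def toy_energy(x):
    return x * x


def toy_propose(x, rng):
    return x + rng.uniform(-0.5, 0.5)


best_x, best_e = metropolis_minimize(
    toy_energy, toy_propose, state=5.0, steps=2000, temperature=0.1
)
```

In a docking setting the state would be a ligand pose and the energy a binding-energy estimate, but the acceptance rule keeps the same form.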
Interactive web app for predicting protein-ligand interactions by evaluating their binding energies using Metropolis and machine learning simulation methods.
- Set your Python path at line 241 of app.R so that the ML Python script can be executed.
- Place all external data under either ./MCdata or ./MLdata so that the app can access it (as with the example data).
- Metropolis:
The app takes one protein file (.pdb/.mol2) and multiple ligand files (.mol2 files uploaded together in a directory) as input.
Once the data has uploaded successfully, click the "Run Simulation" button to get:
(1) a plot of the binding energies of all protein-ligand pairs
(2) the structure of the protein-ligand pair with the minimum binding energy
(to generate the video, preprocess the output protein and ligand files (.mol2) using Chimera or PyMOL and save the result as ./output/results.mp4)
- Machine Learning:
The app takes a protein-ligand interaction dataset (.csv file) as input.
You can then select the features you are interested in from ["electrostatic", "polar_solvation", "non_polar_solvation", "vdW"]
and the models you would like to use from ["RandomForest", "DecisionTree", "XGBoost", "LightGBM", "SVM"].
Once the data has uploaded successfully and the parameters are selected, click the "Run Simulation" button to get:
(1) the evaluation matrix for all models (evaluation metrics: MSE, RMSE, MAE, R2, MAPE)
(2) feature importance tables generated by the models you selected
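For reference, the five evaluation metrics listed above can be computed as in the following standard-library sketch. The function name `regression_metrics` and the sample values are assumptions for this example; the app's own Python script may compute these differently (for instance, through a library).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, R2 and MAPE for two equal-length sequences.

    MAPE is returned as a fraction (not a percentage) and is undefined
    if any true value is zero; R2 is undefined if all true values are equal.
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": mae,
        "R2": 1.0 - ss_res / ss_tot,
        "MAPE": mape,
    }


# Hypothetical binding-energy-like values, for illustration only.
metrics = regression_metrics([2.0, 4.0, 6.0, 8.0], [2.5, 3.5, 6.5, 7.5])
```

Lower is better for MSE, RMSE, MAE and MAPE, while R2 is better the closer it is to 1.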
https://drive.google.com/drive/folders/1g_GTiWV2_l0lUbO9OTQ9euYVeeyWyaBW?usp=drive_link