Skip to content

Implementation of DiffPack: A Torsional Diffusion Model for Autoregressive Protein Side-Chain Packing

License

Notifications You must be signed in to change notification settings

DeepGraphLearning/DiffPack

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

DiffPack: A Torsional Diffusion Model for Autoregressive Protein Side-Chain Packing

DiffPack is a novel torsional diffusion model designed for predicting the conformation of protein side-chains based on their backbones, as introduced in arxiv link. By learning the joint distribution of side-chain torsional angles through a process of diffusing and denoising on the torsional space, DiffPack significantly improves angle accuracy across various benchmarks for protein side-chain packing.

Installation

You can install DiffPack with the following commands, which will install all the dependencies.

conda create -n diffpack python=3.8
conda activate diffpack
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
conda install pyg -c pyg
conda install torchdrug -c milagraph -c conda-forge -c pytorch -c pyg
pip install biopython==1.77
pip install pyyaml
pip install easydict

framwork

Model Checkpoints

We provide several versions of DiffPack, each with its own configuration and checkpoint:

Model Config Checkpoint
DiffPack (Vanila) Config Google Drive Link
DiffPack (with Confidence Prediction) Config Google Drive Link

The Vanilla version of DiffPack is the base model, while the version with Confidence Prediction includes an additional feature that estimates the confidence score of the predicted side-chain conformation.

Most of the configuration is specified in the configuration file. We list some important configuration hyperparameters here:

  • mode: Backward mode in diffusion process. We use ode or sde for DiffPack.
  • annealed_temp: Annealing temperature in diffusion process. We use 3 for DiffPack. Ideally, higher value corresponds to lower temperature.
  • num_sample: Number of samples in diffusion process. Confidence model will decide which sample to use.

Running DiffPack

To use DiffPack for new proteins on your local machine, we provide the necessary configuration files in the config folder. For instance, if you have two pdb files 1a3a.pdb and 1a3b.pdb, you can run the following command to infer new proteins and save the results in your chosen output folder:

python script/inference.py -c config/inference_confidence.yaml \
                           --seed 2023 \
                           --output_dir path/to/output \
                           --pdb_files 1a3a.pdb 1a3b.pdb ...

This command will generate and save the predicted side-chain conformations for the given proteins.

Retraining DiffPack

For those interested in training DiffPack on their own datasets, we will soon release the code and instructions for this process. Stay tuned for updates!

Visualization of Results

Visualization

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you find DiffPack useful in your research or project, please cite our paper:

@article{zhang2023diffpack,
  title={DiffPack: A Torsional Diffusion Model for Autoregressive Protein Side-Chain Packing},
  author={Zhang, Yangtian and Zhang, Zuobai and Zhong, Bozitao and Misra, Sanchit and Tang, Jian},
  journal={arXiv preprint arXiv:2306.01794},
  year={2023}
}