Skip to content

Official implementation for the paper "EBMDOCK: NEURAL PROBABILISTIC PROTEIN-PROTEIN DOCKING VIA A DIFFERENTIABLE ENERGY-BASED MODEL" (ICLR 2024).

Notifications You must be signed in to change notification settings

wuhuaijin/EBMDock

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Apr 24, 2024
6726358 · Apr 24, 2024

History

3 Commits
Apr 24, 2024
Apr 24, 2024
Apr 24, 2024
Apr 24, 2024
Apr 24, 2024
Apr 24, 2024
Apr 24, 2024
Apr 24, 2024
Apr 24, 2024
Apr 24, 2024

Repository files navigation

EBMDock: Neural Probabilistic Protein-Protein Docking via a Differentiable Energy Model

Official implementation for the paper "EBMDOCK: NEURAL PROBABILISTIC PROTEIN-PROTEIN DOCKING VIA A DIFFERENTIABLE ENERGY-BASED MODEL" (ICLR 2024). https://openreview.net/pdf?id=qg2boc2AwU

1713966338028

Dependencies

This work was developed and tested under pytorch 1.12.1 with CUDA 11.6. Please install main dependencies as follows:

conda create -n dock python=3.9
conda activate dock

conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge
conda install pyg -c pyg


pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv torch_geometric -f https://data.pyg.org/whl/torch-1.12.1+cu116.html

pip install pandas biopandas dill biopandas e3nn biopython==1.80 horovod[pytorch]==0.27.0 
pip install networkx==2.6.3 dgl==1.1.3 dglgo==0.0.2 dgllife==0.3.2

Data

DB5.5

We have provided raw data in data/DB5/structures.

Antibody-antigen

We follow ''trimodock: Rigid protein docking via cross-modal representation learning and spectral algorithm'' and select 68 antibody-antigen samples from the Protein Data Bank which were released after October 2022.

We have provided the raw data in data/antibody-antigen.

DIPS-Het

We use DIPS-Het to train our model. Input data to this dataset generation pipeline can be downloaded from the RCSB database, further steps please refer to data/DIPS-Het

Reproducibility

  1. You can train an energy model by running:
python train.py -ebm_train -n_gaussians 6 -dataset dipshet -data_path ./data/DIPS-Het

Use -ebm_train is a little bit time consuming, you can also use -ll_train instead (which directly maximize the log_likelihood of distances).

  1. You can train an interface prediction model by running:
python train.py -cp -dataset dipshet -data_path ./data/DIPS-Het
  1. You can also train 1,2 at the same time by running:
python train.py -cp -dataset dipshet -data_path ./data/DIPS-Het -ebm_train -n_gaussians 6
  1. We have provided the pretrained model in checkpoints/, you can evaluate a trained model by running:
python evaluate.py -dataset db5 -data_path ./data/DB5 -intersection_loss -known_interface

Here -known_interface means when we have a rough knowledge of the interaction interface. You can also use the trained model to predict the interaction interface (just remove -knwon_interface).

  1. You can predict the docking pose of any two proteins by running:
python inference.py -receptor_file_path [your receptor file path] -ligand_file_path [your ligand file path] -output_ligand_path  [the output ligand file path] 

This will generate the predicted ligand pose and the result pdb will be saved in -output_ligand_path.

Citation

If you use our code or method in your work, please consider citing the following:

@inproceedings{wu2023neural,
  title={EBMDock: Neural Probabilistic Protein-Protein Docking via a Differentiable Energy Model},
  author={Wu, Huaijin and Liu, Wei and Bian, Yatao and Wu, Jiaxiang and Yang, Nianzu and Yan, Junchi},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024}
}

Please direct any questions to Huaijin Wu (whj1201@sjtu.edu.cn)

Acknowledgement

We appreciate the following works for their valuable code and data:

https://github.com/octavian-ganea/equidock_public

https://github.com/bytedance/HMR

https://github.com/atomicarchitects/equiformer

About

Official implementation for the paper "EBMDOCK: NEURAL PROBABILISTIC PROTEIN-PROTEIN DOCKING VIA A DIFFERENTIABLE ENERGY-BASED MODEL" (ICLR 2024).

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published