Perceiver CPI (Version 1.0)

A PyTorch implementation of the paper:

PerceiverCPI: A nested cross-attention network for compound-protein interaction prediction

Ngoc-Quang Nguyen, Gwanghoon Jang, Hajung Kim and Jaewoo Kang

Our repository uses https://github.com/chemprop/chemprop as the backbone for compound information extraction. We highly recommend that researchers read the D-MPNN paper to better understand how it is used.

Motivation: Compound-protein interaction (CPI) plays an essential role in drug discovery and is performed via expensive molecular docking simulations. Many artificial intelligence-based approaches have been proposed in this regard. Recently, two types of models have accomplished promising results in exploiting molecular information: graph convolutional neural networks that construct a learned molecular representation from a graph structure (atoms and bonds), and neural networks that can be applied to compute on descriptors or fingerprints of molecules. However, the superiority of one method over the other is yet to be determined. Modern studies have endeavored to aggregate information that is extracted from compounds and proteins to form the CPI task. Nonetheless, these approaches have used a simple concatenation to combine them, which cannot fully capture the interaction between such information.

Results: We propose the Perceiver CPI network, which adopts a cross-attention mechanism to improve the learning ability of the representation of drug and target interactions and exploits the rich information obtained from extended-connectivity fingerprints to improve the performance. We evaluated Perceiver CPI on three main datasets, Davis, KIBA, and Metz, to compare the performance of our proposed model with that of state-of-the-art methods. The proposed method achieved satisfactory performance and exhibited significant improvements over previous approaches in all experiments.

0.Overview of Perceiver CPI

Set up the environment:

In our experiments, we use Python 3.9 with PyTorch 1.7.1 + CUDA 10.1.

git clone https://github.com/dmis-lab/PerceiverCPI.git
conda env create -f environment.yml

1.Dataset and supplementary experiments

The data should be a CSV file with the columns 'smiles', 'sequences', and 'label'.
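
Before training, it can help to confirm that a file really has these three columns. The following is a minimal sketch, not part of the repository: the function name `check_cpi_csv` is hypothetical, columns are matched by name (whether their order matters to train.py is not stated here), and it assumes a regression task so that every label must parse as a number.

```python
import csv
import io

# Column names required by the README.
REQUIRED_COLUMNS = ("smiles", "sequences", "label")


def check_cpi_csv(handle):
    """Return the number of data rows; raise ValueError if the file is malformed.

    Hypothetical helper for illustration only; it is not shipped with the repository.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing column(s): {missing}")

    n_rows = 0
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if not row["smiles"] or not row["sequences"]:
            raise ValueError(f"line {line_no}: empty smiles or sequences")
        try:
            float(row["label"])  # assumption: regression labels are numeric
        except (TypeError, ValueError):
            raise ValueError(f"line {line_no}: label is not a number") from None
        n_rows += 1
    return n_rows


# Tiny in-memory example; the molecule, sequence, and label are made up.
example = io.StringIO("smiles,sequences,label\nCCO,MKTAYIAKQR,5.0\n")
print(check_cpi_csv(example))  # prints 1
```

To check a file on disk, pass an open handle instead, for example `check_cpi_csv(open("my_train.csv", newline=""))`, where the file name is a placeholder.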

2.To train the model:

python train.py --data_path "datasetpath" --separate_val_path "validationpath" --separate_test_path "testpath" --metric mse --dataset_type regression --save_dir "checkpointpath" --target_columns label

Usage Example:

python train.py --data_path ./toy_dataset/novel_pair_0_train.csv --separate_val_path ./toy_dataset/novel_pair_0_val.csv --separate_test_path ./toy_dataset/novel_pair_0_test.csv --metric mse --dataset_type regression --save_dir regression_150_newprot_pre --target_columns label --epochs 150 --ensemble_size 3 --num_folds 1 --batch_size 50 --aggregation mean --dropout 0.1 --save_preds

3.To run inference:

python predict.py --test_path "testdatapath" --checkpoint_dir "checkpointpath" --preds_path "predictionpath.csv"

Usage Example:

python predict.py --test_path ./toy_dataset/novel_pair_0_test.csv --checkpoint_dir regression_150_newprot_pre --preds_path newnew_fold0.csv
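
Once predictions are written, they can be compared with the true labels using the same mse metric that is passed to train.py. The sketch below rests on assumptions that should be checked against the actual output: it assumes the predictions CSV keeps the rows in the same order as the test CSV and stores the predicted value in a column named `label`. The helper names `read_column` and `mse` are hypothetical.

```python
import csv
import io


def read_column(handle, column):
    """Read one numeric column from a CSV file handle."""
    return [float(row[column]) for row in csv.DictReader(handle)]


def mse(y_true, y_pred):
    """Mean squared error between two equally long, non-empty sequences."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# In-memory stand-ins for the test file and the predictions file.
# All values are made up; the 'label' column in the predictions is an assumption.
test_csv = io.StringIO("smiles,sequences,label\nCCO,MKT,5.0\nCCN,MKV,7.0\n")
pred_csv = io.StringIO("smiles,sequences,label\nCCO,MKT,5.5\nCCN,MKV,6.0\n")

y_true = read_column(test_csv, "label")
y_pred = read_column(pred_csv, "label")
print(mse(y_true, y_pred))  # prints 0.625
```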

4.To train YOUR model:

Your data should be in CSV format, with the column names 'smiles', 'sequences', and 'label'.

You are free to tune the hyperparameters for the best performance on your data; we highly recommend using a Bayesian optimization package for this.
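
As a rough illustration of what a tuning loop looks like, the sketch below generates train.py command lines with randomly sampled hyperparameters. Note that this is plain random search, a simpler stand-in for Bayesian optimization, and it only builds the commands without running them. The flags are the ones used in the usage example above; the candidate values, the `trial_N` directory names, and the placeholder paths are assumptions chosen for illustration.

```python
import random
import shlex

# Candidate values are illustrative assumptions, not recommended settings.
SEARCH_SPACE = {
    "--dropout": [0.0, 0.1, 0.2, 0.3],
    "--batch_size": [32, 50, 64],
    "--epochs": [50, 100, 150],
}

# Fixed part of the command; the quoted paths are placeholders as in the README.
BASE_COMMAND = [
    "python", "train.py",
    "--data_path", "datasetpath",
    "--separate_val_path", "validationpath",
    "--separate_test_path", "testpath",
    "--metric", "mse",
    "--dataset_type", "regression",
    "--target_columns", "label",
]


def sample_commands(n_trials, seed=0):
    """Build one train.py command per trial with randomly sampled settings."""
    rng = random.Random(seed)  # seeded so the trials are reproducible
    commands = []
    for trial in range(n_trials):
        # Hypothetical naming scheme: one checkpoint directory per trial.
        cmd = list(BASE_COMMAND) + ["--save_dir", f"trial_{trial}"]
        for flag, choices in SEARCH_SPACE.items():
            cmd += [flag, str(rng.choice(choices))]
        commands.append(cmd)
    return commands


for cmd in sample_commands(3):
    print(shlex.join(cmd))
```

Each printed line can be run as is, and the validation score of each trial compared afterwards. A Bayesian optimization package would replace the random sampling step by proposing the next setting based on the scores observed so far.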

5.Citations

If you find the models useful in your research, please consider citing the paper:

@article{10.1093/bioinformatics/btac731,
    author = {Nguyen, Ngoc-Quang and Jang, Gwanghoon and Kim, Hajung and Kang, Jaewoo},
    title = "{Perceiver CPI: A nested cross-attention network for compound-protein interaction prediction}",
    journal = {Bioinformatics},
    year = {2022},
    month = {11},
    abstract = "{Compound-protein interaction (CPI) plays an essential role in drug discovery and is performed via expensive molecular docking simulations. Many artificial intelligence-based approaches have been proposed in this regard. Recently, two types of models have accomplished promising results in exploiting molecular information: graph convolutional neural networks that construct a learned molecular representation from a graph structure (atoms and bonds), and neural networks that can be applied to compute on descriptors or fingerprints of molecules. However, the superiority of one method over the other is yet to be determined. Modern studies have endeavored to aggregate information that is extracted from compounds and proteins to form the CPI task. Nonetheless, these approaches have used a simple concatenation to combine them, which cannot fully capture the interaction between such information. We propose the Perceiver CPI network, which adopts a cross-attention mechanism to improve the learning ability of the representation of drug and target interactions and exploits the rich information obtained from extended-connectivity fingerprints to improve the performance. We evaluated Perceiver CPI on three main datasets, Davis, KIBA, and Metz, to compare the performance of our proposed model with that of state-of-the-art methods. The proposed method achieved satisfactory performance and exhibited significant improvements over previous approaches in all experiments. Perceiver CPI is available at https://github.com/dmis-lab/PerceiverCPI. Supplementary data are available at Bioinformatics online.}",
    issn = {1367-4803},
    doi = {10.1093/bioinformatics/btac731},
    url = {https://doi.org/10.1093/bioinformatics/btac731},
    note = {btac731},
    eprint = {https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btac731/47214739/btac731.pdf},
}
