Binding Affinity Prediction

Datasets

  • PDBBind: The PDBBind dataset in MoleculeNet [1], processed from the PDBBind database. The PDBBind database consists of experimentally measured binding affinities for biomolecular complexes [2], [3]. It provides detailed 3D Cartesian coordinates of both ligands and their target proteins, derived from experimental measurements (e.g., X-ray crystallography). The availability of coordinates for the protein-ligand complexes permits structure-based featurization that is aware of the protein-ligand binding geometry. The authors of [1] use the "refined" and "core" subsets of the database [4], which are more carefully processed to remove data artifacts, as additional benchmarking targets.

Models

  • Atomic Convolutional Networks (ACNN) [5]: Constructs nearest neighbor graphs separately for the ligand, protein and complex based on the 3D coordinates of the atoms and predicts the binding free energy.
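The graph construction step described above can be illustrated with a minimal sketch. It builds a k-nearest-neighbor list within a distance cutoff, separately for the ligand, the protein, and their combined complex. The function name `neighbor_graph`, the values of `k` and `cutoff`, and the toy coordinates are all assumptions made for illustration; they are not taken from this repository or from [5], and the network layers that consume these graphs are omitted.

```python
import math


def neighbor_graph(coords, k=2, cutoff=3.5):
    """For each atom, list up to k nearest neighbors within `cutoff`.

    Returns one list per atom holding (neighbor_index, distance) pairs,
    sorted by increasing distance. k and cutoff are illustrative values.
    """
    graph = []
    for i, ci in enumerate(coords):
        candidates = []
        for j, cj in enumerate(coords):
            if i == j:
                continue  # an atom is not its own neighbor
            d = math.dist(ci, cj)
            if d <= cutoff:
                candidates.append((d, j))
        candidates.sort()  # closest neighbors first
        graph.append([(j, d) for d, j in candidates[:k]])
    return graph


# Toy coordinates (hypothetical, not real structures).
ligand = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
protein = [(0.0, 3.0, 0.0), (5.0, 5.0, 5.0)]
complex_atoms = ligand + protein  # ligand atoms first, then protein atoms

# Three separate graphs: ligand only, protein only, and the full complex.
ligand_graph = neighbor_graph(ligand)
protein_graph = neighbor_graph(protein)
complex_graph = neighbor_graph(complex_atoms)
```

Only the complex graph contains ligand-protein edges, which is why it is built in addition to the two separate graphs.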

Usage

Run main.py with the following arguments:

-m {ACNN}: model to use
-d {PDBBind_core_pocket_random, PDBBind_core_pocket_scaffold, PDBBind_core_pocket_stratified, PDBBind_core_pocket_temporal, PDBBind_refined_pocket_random, PDBBind_refined_pocket_scaffold, PDBBind_refined_pocket_stratified, PDBBind_refined_pocket_temporal}: dataset and splitting method to use
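As a rough sketch of how these two arguments fit together, the snippet below reconstructs the documented choices with `argparse`. It is not the repository's actual main.py: the long option names `--model` and `--dataset` and the helper `split_dataset_name` are hypothetical, and only the `-m` and `-d` flags and their listed values come from the usage above.

```python
import argparse

MODELS = ["ACNN"]
# The eight dataset names follow the pattern PDBBind_<subset>_pocket_<split>.
DATASETS = [
    f"PDBBind_{subset}_pocket_{split}"
    for subset in ("core", "refined")
    for split in ("random", "scaffold", "stratified", "temporal")
]


def build_parser():
    """Build a parser mirroring the documented -m and -d arguments."""
    parser = argparse.ArgumentParser(description="Binding affinity prediction")
    # The long option names are assumptions; only -m and -d are documented.
    parser.add_argument("-m", "--model", choices=MODELS, required=True,
                        help="Model to use")
    parser.add_argument("-d", "--dataset", choices=DATASETS, required=True,
                        help="Dataset and splitting method to use")
    return parser


def split_dataset_name(name):
    """Split e.g. 'PDBBind_core_pocket_random' into ('core', 'random')."""
    _, subset, _, split = name.split("_")
    return subset, split


args = build_parser().parse_args(["-m", "ACNN", "-d", "PDBBind_core_pocket_random"])
subset, split = split_dataset_name(args.dataset)
```

The parsed example corresponds to an invocation such as `python main.py -m ACNN -d PDBBind_core_pocket_random`.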

Performance

PDBBind

ACNN

Subset     Splitting Method    Test MAE    Test R2
Core       Random              1.7688      0.1511
Core       Scaffold            2.5420      0.1471
Core       Stratified          1.7419      0.1520
Core       Temporal            1.9543      0.1640
Refined    Random              1.1948      0.4373
Refined    Scaffold            1.4021      0.2086
Refined    Stratified          1.6376      0.3050
Refined    Temporal            1.2457      0.3438
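For readers unfamiliar with the two metrics, here is a minimal sketch of how a mean absolute error and an R2 score can be computed. It assumes R2 denotes the squared Pearson correlation between predictions and labels; the coefficient of determination is another common definition, and this document does not say which one was used. The function names and toy values are illustrative and are not taken from the benchmark.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between labels and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r2(y_true, y_pred):
    """Squared Pearson correlation (one common meaning of 'R2')."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov ** 2 / (var_t * var_p)


# Toy affinities (hypothetical values, not benchmark data).
y_true = [4.0, 6.0, 8.0]
y_pred = [5.0, 6.0, 7.0]

mae = mean_absolute_error(y_true, y_pred)
r2 = pearson_r2(y_true, y_pred)
```

In this toy case the predictions are perfectly correlated with the labels but compressed toward the mean, so the squared correlation is 1 while the MAE is nonzero. Under this definition the two columns measure different things and need not move together.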

Speed

ACNN

Compared to DeepChem's implementation, we achieve a speedup of roughly 3.3x in training time per epoch (from 1.40s to 0.42s). If we do not care about the randomness introduced by some kernel optimizations, the speedup rises to roughly 4.4x (from 1.40s to 0.32s).
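The speedup factors follow directly from the per-epoch times quoted above. The short check below reproduces the arithmetic; the variable names are illustrative.

```python
def speedup(baseline_seconds, new_seconds):
    """Ratio of baseline time to new time; values above 1 mean faster."""
    return baseline_seconds / new_seconds


deepchem_epoch = 1.40  # seconds per epoch, DeepChem implementation
default_epoch = 0.42   # seconds per epoch, this implementation
relaxed_epoch = 0.32   # seconds per epoch, allowing nondeterministic kernels

default_speedup = speedup(deepchem_epoch, default_epoch)
relaxed_speedup = speedup(deepchem_epoch, relaxed_epoch)

print(f"default: {default_speedup:.1f}x, relaxed: {relaxed_speedup:.1f}x")
```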

References

[1] Wu et al. (2017) MoleculeNet: a benchmark for molecular machine learning. Chemical Science 9, 513-530.

[2] Wang et al. (2004) The PDBbind database: collection of binding affinities for protein-ligand complexes with known three-dimensional structures. Journal of Medicinal Chemistry 47(12), 2977-2980.

[3] Wang et al. (2005) The PDBbind database: methodologies and updates. Journal of Medicinal Chemistry 48(12), 4111-4119.

[4] Liu et al. (2015) PDB-wide collection of binding data: current status of the PDBbind database. Bioinformatics 31(3), 405-412.

[5] Gomes et al. (2017) Atomic Convolutional Networks for Predicting Protein-Ligand Binding Affinity. arXiv preprint arXiv:1703.10603.