thewildwilli/CatalysisPrediction

Catalysis Prediction

Catalysis is one of the main components of RAF (Reflexively Autocatalytic and Food-generated) sets¹, which are used to model pre-biotic self-sustaining chemical systems thought to be precursors of life. Yet predicting whether a molecule will catalyse a reaction remains an unsolved problem.

This project aims to predict one particular case of catalysis: double docking. If molecule A can bind to both molecule B and molecule C, then A may bring B and C into a favourable configuration and catalyse a reaction between them.
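The double-docking idea can be illustrated with a toy sketch. It is not the project's algorithm: `binds` is a hypothetical predicate standing in for whatever docking result decides that two molecules bind, and the molecule names and binding pairs are made up for illustration.

```python
from itertools import combinations


def double_docking_candidates(molecules, binds):
    """Return (a, b, c) triples in which a binds both b and c.

    `binds(x, y)` is a hypothetical predicate; in practice it would be
    derived from a docking run, which this sketch does not perform.
    """
    triples = []
    for a in molecules:
        # Every other molecule that a can bind to.
        partners = [m for m in molecules if m != a and binds(a, m)]
        # Any two distinct partners form a candidate reaction pair for a.
        for b, c in combinations(partners, 2):
            triples.append((a, b, c))
    return triples


# Toy data: made-up binding pairs, not real chemistry.
toy_pairs = {("A", "B"), ("A", "C"), ("B", "D")}


def toy_binds(x, y):
    # Binding is treated as symmetric in this sketch.
    return (x, y) in toy_pairs or (y, x) in toy_pairs


candidates = double_docking_candidates(["A", "B", "C", "D"], toy_binds)
```

Each triple only flags A as a possible catalyst for B and C; whether the resulting configuration is actually favourable is a separate question that this sketch does not address.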

To this end, a fast molecular docking algorithm was developed for small molecules. Molecular docking is a well-known problem, but only for cases in which at least one of the molecules is very large, e.g. a protein. This novel algorithm focuses on the particular and largely unexplored case of docking one small molecule to another.

This project was initially developed as part of my Master's dissertation, available here.

Fast Molecular Docking

At present, this project provides a fast molecular docking algorithm. The following videos show the docking program at work:

DNA Hexamer example

DNA hexamer example video

Protein-ligand example

Although this project does not specifically target large molecules, experiments on them were carried out for comparison.

Protein-ligand example video

Running the program

You can [download a binary release](releases/). You will need Java 1.8 and Scala 2.11.7 installed. Extract the zip and run:

java -jar CatalysisPrediction.jar [args]

You can see the list of program arguments here.

The binaries ship with some test data (testdata directory), so you can get started right away. For example, try:

java -jar CatalysisPrediction.jar -dir testdata -a 3HTB/3HTB_protein.pdb -b 3HTB/3HTB_ligand.pdb -out 3HTB/3htb_docked.mol2 -docker forcevector --ignoreAhydrogens -threshold 1.0e-5 -surface 1.4 -permeability 0.90 -balance 1,0,1,0

You may also want to edit your viewinit.txt file to adjust how molecules are visualised.

Note: the RMSD reported by the program is in Angstroms.
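For reference, the standard root-mean-square deviation between two equally sized sets of atom coordinates can be sketched as below. This is the textbook definition without any superposition step; which atoms the program compares, and whether it aligns the structures first, is not stated here, so treat the function and the sample coordinates as illustrative assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D coordinates.

    Inputs are sequences of (x, y, z) tuples in Angstroms, paired by
    position; the result is in Angstroms as well.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up coordinates: every atom is displaced by 2 Angstroms along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
value = rmsd(reference, docked)  # 2.0
```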

The benchmarking tool performs multiple runs and reports the average RMSD and running time for each command. It invokes the main program for each line of the file benchmarkcmds.txt. Example:

cd testdata
java -cp ../CatalysisPrediction.jar prog.Benchmarker [-r 50]

where -r specifies the number of runs for each line of benchmarkcmds.txt. A sample line from this file:

-a 3htb/3HTB_protein.pdb -b 3htb/3HTB_ligand.pdb -out 3htb/3htb_docked.mol2 -docker forcevector --nogui --randominit -workers 8
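The per-line averaging described above can be pictured with a small sketch. It does not reproduce prog.Benchmarker, whose internals are not described here: `parse_benchmark_lines` and `summarise` are hypothetical helper names, and the (RMSD, seconds) pairs are made-up numbers rather than measured results.

```python
import shlex
from statistics import mean


def parse_benchmark_lines(text):
    """Split each non-empty line of a benchmarkcmds.txt-style file into arguments."""
    return [shlex.split(line) for line in text.splitlines() if line.strip()]


def summarise(runs):
    """Average the (rmsd, seconds) pairs from repeated runs of one command."""
    rmsds, times = zip(*runs)
    return mean(rmsds), mean(times)


sample = (
    "-a 3htb/3HTB_protein.pdb -b 3htb/3HTB_ligand.pdb "
    "-out 3htb/3htb_docked.mol2 -docker forcevector "
    "--nogui --randominit -workers 8\n"
)
args = parse_benchmark_lines(sample)[0]

# Made-up results for two runs of the same command line.
avg_rmsd, avg_time = summarise([(1.0, 2.0), (3.0, 4.0)])
```

Here `args` holds the twelve tokens of the sample line, ready to be appended to a `java -jar CatalysisPrediction.jar` invocation, and `summarise` returns one average RMSD and one average time per command line.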

Compiling

IntelliJ IDEA with the Scala plug-in was used for development and is the recommended choice. Clone the repository or download the sources. You may need to check that the Scala library is correctly linked to the project.

Libraries

  • Jmol is used for molecule 3D visualisation.
  • Breeze is used for linear algebra operations.
  • ThreadCSO³ is used for concurrent programming.

Additionally, SimRNA, RNA Composer, Make-NA, and OpenBabel were used to generate some of the test data, but are not dependencies of the code.


1 M. Steel and W. Hordijk, “Detecting autocatalytic, self-sustaining sets in chemical reaction systems,” Journal of Theoretical Biology, vol. 277, no. 4, pp. 451-461, 2004.

2 Ernesto Ocampo, "A Fast Double Docking Algorithm for Catalysis Prediction", 2016. Dissertation submitted for the degree of Master of Science in Computer Science, University of Oxford, under the supervision of Jotun Hein and Peter Jeavons.

3 B. Sufrin, “Communicating Scala Objects,” in CPA, 2008.
