LigDream: Shape-Based Compound Generation

THIS PROJECT IS NO LONGER ACTIVE. IT IS MADE AVAILABLE WITHOUT ANY SUPPORT.

Citing

If you use content from this repository, please consider citing the following work:

@article{skalic2019shape,
  title={Shape-Based Generative Modeling for de-novo Drug Design},
  author={Skalic, Miha and Jim{\'e}nez Luna, Jos{\'e} and Sabbadin, Davide and De Fabritiis, Gianni},
  journal={Journal of Chemical Information and Modeling},
  doi={10.1021/acs.jcim.8b00706},
  publisher={ACS Publications}
}

Requirements

Model training is written in pytorch==0.3.1 and uses keras==2.2.2 for the data loaders. RDKit==2017.09.2.0 and HTMD==1.13.9 are needed for molecule manipulation.

Add the repository to your PYTHONPATH:

  export PYTHONPATH=/path/to/ligdream/repo/:$PYTHONPATH

Before starting

Training requires a .smi file. We used the drug-like subset of the ZINC15 dataset. The same cleaned dataset can be retrieved with the getDataset.sh script, which downloads the .smi file required for training (see the Training section).

  bash getDataset.sh

The traindataset folder will then contain the zinc15_druglike_clean_canonical_max60.smi file required for the training step.
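
Before preparing the training data, it can help to confirm that the download is readable. The following is a minimal sketch, not part of the repository: it assumes the file holds one SMILES string per line as the first whitespace-separated column (the usual .smi layout), and the helper names read_smiles and summarize are hypothetical. If the file starts with a header row, that row would be counted as a molecule.

```python
from pathlib import Path


def read_smiles(path):
    """Return the first column of every non-blank line in a .smi file."""
    smiles = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if fields:
                smiles.append(fields[0])  # assumed: first column is the SMILES string
    return smiles


def summarize(smiles):
    """Return (number of entries, length of the longest string)."""
    longest = max((len(s) for s in smiles), default=0)
    return len(smiles), longest


if __name__ == "__main__":
    # Assumed location relative to the repository root; adjust if needed.
    path = Path("traindataset") / "zinc15_druglike_clean_canonical_max60.smi"
    if path.exists():
        count, longest = summarize(read_smiles(path))
        print(f"{count} entries, longest string has {longest} characters")
    else:
        print(f"{path} not found; run getDataset.sh first")
```

The sketch uses only the standard library, so it works without RDKit, but it does not check that the strings are chemically valid.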

The generation stage requires the model files. You can use the ones produced during the training step, or download the ones we have already generated with the following script:

  bash getWeights.sh

The modelweights folder will then contain the three model files:

decoder-210000.pkl
encoder-210000.pkl
vae-210000.pkl
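
A quick way to confirm that all three files are in place before opening the notebook is sketched below. This is an illustrative helper, not part of the repository: the function name missing_weights is hypothetical, and the folder name is taken from the text above and assumed to be relative to the current directory.

```python
from pathlib import Path

# File names as listed above.
EXPECTED_WEIGHTS = ("decoder-210000.pkl", "encoder-210000.pkl", "vae-210000.pkl")


def missing_weights(folder, names=EXPECTED_WEIGHTS):
    """Return the expected file names that are not present in folder."""
    folder = Path(folder)
    return [name for name in names if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_weights("modelweights")
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All model files found.")
```

If you train your own models, the step number in the file names may differ, so pass your own list through the names argument.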

Training

Note that training runs on a GPU and takes several days to complete.

First construct a set of training molecules:

$ python prepare_data.py -i "./path/to/my/smiles.smi" -o "./path/to/my/smiles.npy"

Then train the model:

$ python train.py -i "./path/to/my/smiles.npy" -o "./path/to/models"
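
The two steps can also be chained from a single Python script, so that training only starts if data preparation succeeds. The following is a sketch under stated assumptions: it only uses the -i and -o flags shown above, assumes it is run from the repository root, and the function names build_commands and run_pipeline are hypothetical.

```python
import subprocess
import sys


def build_commands(smi_path, npy_path, model_dir, python=sys.executable):
    """Build the two command lines shown above as argument lists."""
    return [
        [python, "prepare_data.py", "-i", smi_path, "-o", npy_path],
        [python, "train.py", "-i", npy_path, "-o", model_dir],
    ]


def run_pipeline(commands):
    """Run each command in order, raising CalledProcessError on the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    commands = build_commands(
        "./path/to/my/smiles.smi", "./path/to/my/smiles.npy", "./path/to/models"
    )
    # Only print by default; call run_pipeline(commands) to actually execute.
    for command in commands:
        print(" ".join(command))
```

Passing argument lists to subprocess.run avoids shell quoting problems with paths that contain spaces.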

Generation

Web-based compound generation is available at https://playmolecule.org/LigDream/.

For an example of generating novel compounds locally, see the notebook generate.ipynb.

License

The code is released under the GNU Affero General Public License.
