- Paper list on Molecular Generation or Drug Discovery
- [Elsevier 2022] Deep learning approaches for de novo drug design: An overview [Paper]
- ChEMBL Datasets
- PubChem
- PDBbind
- Cortellis Drug Discovery Intelligence
- ZINC15 database
- DrugBank
- GDB-13
- ANI-1
Enumeration of 166 Billion Organic Small Molecules in the Chemical Universe Database GDB-17
Lars Ruddigkeit, Ruud van Deursen, Lorenz C. Blum, Jean-Louis Reymond
Journal of Chemical Information and Modeling, November 2012, https://doi.org/f4d9mt
DOI: 10.1021/ci300415d · PMID: 23088335
The PDBbind Database: Methodologies and Updates
Renxiao Wang, Xueliang Fang, Yipin Lu, Chao-Yie Yang, Shaomeng Wang
Journal of Medicinal Chemistry, May 2005, https://doi.org/djbvfc
DOI: 10.1021/jm048957q · PMID: 15943484
970 Million Druglike Small Molecules for Virtual Screening in the Chemical Universe Database GDB-13
Lorenz C. Blum, Jean-Louis Reymond
Journal of the American Chemical Society, June 2009, https://doi.org/dwxj84
DOI: 10.1021/ja902302h · PMID: 19505099
ZINC 15 – Ligand Discovery for Everyone
Teague Sterling, John J. Irwin
Journal of Chemical Information and Modeling, November 2015, https://doi.org/gf4zg2
DOI: 10.1021/acs.jcim.5b00559 · PMID: 26479676 · PMCID: PMC4658288
ANI-1: A data set of 20M off-equilibrium DFT calculations for organic molecules
Justin S. Smith, Olexandr Isayev, Adrian E. Roitberg
arXiv, January 2018, https://arxiv.org/abs/1708.04987
DOI: 10.1038/sdata.2017.193 || code
“The ANI-1 potential was shown to be chemically accurate for systems of 50 atoms and more, demonstrating extensibility and transferability to much larger molecules than those in the training set.” “This phenomenon, whereby an ML model is trained on small systems (which could be thought of as fragments of large systems), then demonstrated to be extensible to large systems has also been confirmed in other recent studies.”
SidechainNet: An All-Atom Protein Structure Dataset for Machine Learning
Jonathan E. King, David Ryan Koes
arxiv || github::sidechainnet
TDC maintains a resource list that currently contains 22 tasks (with their datasets) related to small molecules and macromolecules, including PPI, DDI, and others. MoleculeNet published a small-molecule benchmark four years ago.
In terms of datasets and benchmarks, protein design is far less mature than drug discovery (see the Papers with Code drug discovery benchmarks). An evaluation of deep learning methods for protein design, especially deep generative models, may be worth adding.
Difficulties and opportunities always coexist. We are happy to see, and grateful for, the work of Christian Dallago, Jody Mou, Kadina E. Johnston, Bruce J. Wittmann, Nicholas Bhattacharya, Samuel Goldman, Ali Madani, Kevin K. Yang and Zhangyang Gao, Cheng Tan, Stan Z. Li.
0.3.1 PyMOL
If you are new to PyMOL, I recommend visiting the PymolGallery website, which offers excellent instructions for beginners.
0.3.2 ChimeraX
0.3.3 VMD
0.3.4 Blender
0.3.5 Protein Imager
The Protein Imager: a full-featured online molecular viewer interface with server-side HQ-rendering capabilities
Gianluca Tomasello, Ilaria Armenia, Gianluca Molla
Bioinformatics, January 2020, https://doi.org/gqhbf2
DOI: 10.1093/bioinformatics/btaa009 · PMID: 31930403
Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models
Daniil Polykovskiy, Alexander Zhebrak, Benjamin Sanchez-Lengeling, Sergey Golovanov, Oktai Tatanov, Stanislav Belyaev, Rauf Kurbanov, Aleksey Artamonov, Vladimir Aladinskiy, Mark Veselov, … Alex Zhavoronkov
arXiv, October 2020, https://arxiv.org/abs/1811.12823 || pdf || code
GuacaMol: Benchmarking Models for de Novo Molecular Design
Nathan Brown, Marco Fiscato, Marwin H. S. Segler, Alain C. Vaucher
Journal of Chemical Information and Modeling, March 2019, https://doi.org/ggpn3x
DOI: 10.1021/acs.jcim.8b00839 · PMID: 30887799 || pdf || code
Inspired by YanZhe Zhang's papers_for_protein_design_using_DL, I am organizing recent deep learning papers on drug discovery, especially molecular generation, and this repo will keep being updated. We build this list with Manubot. If you know of relevant literature, you are very welcome to post its doi/url/arxiv/PMID in an issue. Alternatively, you can contribute by creating or editing a file in the content directory, for example:
## Manubot example documentation and introduction links
url:https://greenelab.github.io/meta-review/
doi:10.1098/rsif.2017.0387
url:https://github.com/manubot/manubot/
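As a rough illustration of the identifier styles listed above, the following Python sketch turns a raw DOI, arXiv link, PMID, or URL into a `prefix:id` string. The helper name `to_citekey` is hypothetical and is not part of Manubot. The `doi:` and `url:` prefixes come from the examples above, while using `arxiv:` and `pmid:` in the same pattern, and treating any bare integer as a PMID, are assumptions of this sketch.

```python
import re


def to_citekey(raw: str) -> str:
    """Turn a raw DOI, arXiv link, PMID, or URL into a 'prefix:id' string.

    Hypothetical helper for illustration only; not a Manubot function.
    """
    text = raw.strip()
    # arXiv abstract links such as https://arxiv.org/abs/2105.07246
    match = re.search(r"arxiv\.org/abs/(\d{4}\.\d{4,5})", text, re.IGNORECASE)
    if match:
        return "arxiv:" + match.group(1)
    # Full DOIs, bare or behind a resolver; they start with "10."
    match = re.search(r"(10\.\d{4,9}/\S+)", text)
    if match:
        return "doi:" + match.group(1).lower()
    # Assumption: a bare integer is a PubMed ID
    if text.isdigit():
        return "pmid:" + text
    # Anything else that looks like a web address is kept as a URL
    if text.lower().startswith(("http://", "https://")):
        return "url:" + text
    raise ValueError(f"unrecognized reference: {raw!r}")
```

For example, `to_citekey("https://arxiv.org/abs/2105.07246")` gives `arxiv:2105.07246`. Shortened DOI links such as https://doi.org/f4d9mt do not contain a full DOI, so this sketch keeps them as `url:` entries.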
In this repository, README.md is created via continuous integration and should not be edited directly. Edit README-BASE.md to update this text. Update the reference lists in the content directory to add new sections or references.
This is only a proof of concept that is not robust against errors in the scripts or merge conflicts.
The deploy.sh and environment.yml files were derived from https://github.com/manubot/rootstock (CC0 1.0 license).
If you are still confused about the markdown format for a reference, please see #7 for an example :) In addition, this workflow only runs on issues with the label reference.
sudo apt install pandoc-citeproc pandoc build-essential
pip install --upgrade git+https://github.com/manubot/manubot@$COMMIT
pip install panflute==1.12.5
GitHub - admislf/MINN-DTI: Effective drug-target interaction prediction with mutual interaction neural network
GitHub
https://github.com/admislf/MINN-DTI
An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming
Minkai Xu, Wujie Wang, Shitong Luo, Chence Shi, Yoshua Bengio, Rafael Gomez-Bombarelli, Jian Tang
arXiv, June 2021, https://arxiv.org/abs/2105.07246
Learning Gradient Fields for Molecular Conformation Generation
Chence Shi, Shitong Luo, Minkai Xu, Jian Tang
arXiv, June 2021, https://arxiv.org/abs/2105.03902
GeoDiff: a Geometric Diffusion Model for Molecular Conformation Generation
Minkai Xu, Lantao Yu, Yang Song, Chence Shi, Stefano Ermon, Jian Tang
arXiv, March 2022, https://arxiv.org/abs/2203.02923
Deep Evolutionary Learning for Molecular Design
Karl Grantham, Muhetaer Mukaidaisi, Hsu Kiang Ooi, Mohammad Sajjad Ghaemi, Alain Tchagang, Yifeng Li
IEEE Computational Intelligence Magazine, May 2022, https://doi.org/gqdbrc
DOI: 10.1109/mci.2022.3155308
MGCVAE: Multi-Objective Inverse Design via Molecular Graph Conditional Variational Autoencoder
Myeonghun Lee, Kyoungmin Min
Journal of Chemical Information and Modeling, June 2022, https://doi.org/gqhf8q
DOI: 10.1021/acs.jcim.2c00487 · PMID: 35666276
3D Infomax improves GNNs for Molecular Property Prediction
Hannes Stärk, Dominique Beaini, Gabriele Corso, Prudencio Tossou, Christian Dallago, Stephan Günnemann, Pietro Liò
arXiv, June 2022, https://arxiv.org/abs/2110.04126
Pre-training Molecular Graph Representation with 3D Geometry
Shengchao Liu, Hanchen Wang, Weiyang Liu, Joan Lasenby, Hongyu Guo, Jian Tang
arXiv, May 2022, https://arxiv.org/abs/2110.07728 || pdf
EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction
Hannes Stärk, Octavian-Eugen Ganea, Lagnajit Pattanaik, Regina Barzilay, Tommi Jaakkola
arXiv, June 2022, https://arxiv.org/abs/2202.05146 || pdf
Spherical Message Passing for 3D Graph Networks
Yi Liu, Limei Wang, Meng Liu, Xuan Zhang, Bora Oztekin, Shuiwang Ji
arXiv, May 2022, https://arxiv.org/abs/2102.05013 || pdf
A Deep Generative Model for Molecule Optimization via One Fragment Modification
Ziqi Chen, Martin Renqiang Min, Srinivasan Parthasarathy, Xia Ning
arXiv, January 2022, https://arxiv.org/abs/2012.04231
DOI: 10.1038/s42256-021-00410-2
GF-VAE
Changsheng Ma, Xiangliang Zhang
Proceedings of the 30th ACM International Conference on Information & Knowledge Management, October 2021, https://doi.org/gp2883
DOI: 10.1145/3459637.3482260 || code
Geometry-Based Molecular Generation With Deep Constrained Variational Autoencoder
Chunyan Li, Junfeng Yao, Wei Wei, Zhangming Niu, Xiangxiang Zeng, Jin Li, Jianmin Wang
IEEE Transactions on Neural Networks and Learning Systems, 2022, https://doi.org/gpjb8f
DOI: 10.1109/tnnls.2022.3147790 · PMID: 35171779 || code
Molecular visual representation based on 3D spatial structure: Referring to the extensive application of CNN in computer vision, we proposed a representation method of encoding molecular spatial structure into pictures, that is, converting molecular spatial coordinates into RGB attributes of pictures and using CNN for feature extraction. Then enter VAE model.
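To make the idea of turning spatial coordinates into RGB attributes concrete, here is a minimal Python sketch. It is not the authors' encoding, whose details are not given here. It assumes one pixel per atom, with the x, y, and z coordinates rescaled per axis to 0..255 and used as the red, green, and blue channels. The function name `coords_to_rgb_image` and the `width` parameter are hypothetical.

```python
def coords_to_rgb_image(coords, width=8):
    """Map (x, y, z) atom coordinates to rows of RGB pixels.

    Illustrative assumption: one pixel per atom, x -> R, y -> G, z -> B.
    """
    if not coords:
        raise ValueError("need at least one atom")
    # Per-axis minimum and maximum, used to rescale each axis to 0..255
    lows = [min(c[i] for c in coords) for i in range(3)]
    highs = [max(c[i] for c in coords) for i in range(3)]

    def scale(value, low, high):
        if high == low:  # flat axis: avoid division by zero
            return 0
        return round(255 * (value - low) / (high - low))

    pixels = [
        tuple(scale(c[i], lows[i], highs[i]) for i in range(3))
        for c in coords
    ]
    # Pad with black pixels so the atoms fill complete rows
    while len(pixels) % width:
        pixels.append((0, 0, 0))
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]
```

The resulting grid of RGB tuples could be passed to any image library or used as a three-channel CNN input. A real pipeline would also need to encode atom types and bonds, which this sketch ignores.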
DeePKS+ABACUS as a Bridge between Expensive Quantum Mechanical Models and Machine Learning Potentials
Wenfei Li, Qi Ou, Yixiao Chen, Yu Cao, Renxi Liu, Chunyi Zhang, Daye Zheng, Chun Cai, Xifan Wu, Han Wang, … Linfeng Zhang
arXiv, June 2022, https://arxiv.org/abs/2206.10093
The Pre-main Sequence: Challenges and Prospects for Asteroseismology
Konstanze Zwintz, Thomas Steindl
arXiv, June 2022, https://arxiv.org/abs/2206.09171
DOI: 10.3389/fspas.2022.914738
LIMO: Latent Inceptionism for Targeted Molecule Generation
Peter Eckmann, Kunyang Sun, Bo Zhao, Mudong Feng, Michael K. Gilson, Rose Yu
arXiv, June 2022, https://arxiv.org/abs/2206.09010 || code || pdf
Attention-wise masked graph contrastive learning for predicting molecular property
Hui Liu, Yibiao Huang, Xuejun Liu, Lei Deng
arXiv, June 2022, https://arxiv.org/abs/2206.08262
Exploring Chemical Space with Score-based Out-of-distribution Generation
Seul Lee, Jaehyeong Jo, Sung Ju Hwang
arXiv, June 2022, https://arxiv.org/abs/2206.07632
A 3D Molecule Generative Model for Structure-Based Drug Design
Shitong Luo, Jiaqi Guan, Jianzhu Ma, Jian Peng
arXiv, March 2022, https://arxiv.org/abs/2203.10446 || pdf
Molecular Optimization by Capturing Chemist’s Intuition Using Deep Neural Networks
November 2020, https://doi.org/gqgzp7
DOI: 10.21203/rs.3.rs-101137/v1
GraphNVP: An Invertible Flow Model for Generating Molecular Graphs
Kaushalya Madhawa, Katsuhiko Ishiguro, Kosuke Nakago, Motoki Abe
arXiv, May 2019, https://arxiv.org/abs/1905.11600
Graph Residual Flow for Molecular Graph Generation
Shion Honda, Hirotaka Akita, Katsuhiko Ishiguro, Toshiki Nakanishi, Kenta Oono
arXiv, October 2019, https://arxiv.org/abs/1909.13521
Junction Tree Variational Autoencoder for Molecular Graph Generation
Wengong Jin, Regina Barzilay, Tommi Jaakkola
arXiv, April 2019, https://arxiv.org/abs/1802.04364
Grammar Variational Autoencoder
Matt J. Kusner, Brooks Paige, José Miguel Hernández-Lobato
arXiv, March 2017, https://arxiv.org/abs/1703.01925
Syntax-Directed Variational Autoencoder for Structured Data
Hanjun Dai, Yingtao Tian, Bo Dai, Steven Skiena, Le Song
arXiv, February 2018, https://arxiv.org/abs/1802.08786
GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders
Martin Simonovsky, Nikos Komodakis
arXiv, February 2018, https://arxiv.org/abs/1802.03480
Scaffold-constrained molecular generation
Maxime Langevin, Herve Minoux, Maximilien Levesque, Marc Bianciotto
arXiv, January 2021, https://arxiv.org/abs/2009.07778
DOI: 10.1021/acs.jcim.0c01015
MolGPT: Molecular Generation Using a Transformer-Decoder Model
Viraj Bagal, Rishal Aggarwal, P. K. Vinod, U. Deva Priyakumar
Journal of Chemical Information and Modeling, October 2021, https://doi.org/gnw9m7
DOI: 10.1021/acs.jcim.1c00600 · PMID: 34694798
Hierarchical Generation of Molecular Graphs using Structural Motifs
Wengong Jin, Regina Barzilay, Tommi Jaakkola
arXiv, April 2020, https://arxiv.org/abs/2002.03230
An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming
Minkai Xu, Wujie Wang, Shitong Luo, Chence Shi, Yoshua Bengio, Rafael Gomez-Bombarelli, Jian Tang
arXiv, June 2021, https://arxiv.org/abs/2105.07246
Geometry-Based Molecular Generation With Deep Constrained Variational Autoencoder
Chunyan Li, Junfeng Yao, Wei Wei, Zhangming Niu, Xiangxiang Zeng, Jin Li, Jianmin Wang
IEEE Transactions on Neural Networks and Learning Systems, 2022, https://doi.org/gpjb8f
DOI: 10.1109/tnnls.2022.3147790 · PMID: 35171779 || code
MGCVAE: Multi-Objective Inverse Design via Molecular Graph Conditional Variational Autoencoder
Myeonghun Lee, Kyoungmin Min
Journal of Chemical Information and Modeling, June 2022, https://doi.org/gqhf8q
DOI: 10.1021/acs.jcim.2c00487 · PMID: 35666276 || code || pdf
GF-VAE
Changsheng Ma, Xiangliang Zhang
Proceedings of the 30th ACM International Conference on Information & Knowledge Management, October 2021, https://doi.org/gp2883
DOI: 10.1145/3459637.3482260 || code
LIMO: Latent Inceptionism for Targeted Molecule Generation
Peter Eckmann, Kunyang Sun, Bo Zhao, Mudong Feng, Michael K. Gilson, Rose Yu
arXiv, June 2022, https://arxiv.org/abs/2206.09010 || code || pdf
Junction Tree Variational Autoencoder for Molecular Graph Generation
Wengong Jin, Regina Barzilay, Tommi Jaakkola
arXiv, April 2019, https://arxiv.org/abs/1802.04364
Grammar Variational Autoencoder
Matt J. Kusner, Brooks Paige, José Miguel Hernández-Lobato
arXiv, March 2017, https://arxiv.org/abs/1703.01925
Syntax-Directed Variational Autoencoder for Structured Data
Hanjun Dai, Yingtao Tian, Bo Dai, Steven Skiena, Le Song
arXiv, February 2018, https://arxiv.org/abs/1802.08786
GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders
Martin Simonovsky, Nikos Komodakis
arXiv, February 2018, https://arxiv.org/abs/1802.03480
Hierarchical Generation of Molecular Graphs using Structural Motifs
Wengong Jin, Regina Barzilay, Tommi Jaakkola
arXiv, April 2020, https://arxiv.org/abs/2002.03230
A 3D Molecule Generative Model for Structure-Based Drug Design
Shitong Luo, Jiaqi Guan, Jianzhu Ma, Jian Peng
arXiv, March 2022, https://arxiv.org/abs/2203.10446 || pdf || code
MolGPT: Molecular Generation Using a Transformer-Decoder Model
Viraj Bagal, Rishal Aggarwal, P. K. Vinod, U. Deva Priyakumar
Journal of Chemical Information and Modeling, October 2021, https://doi.org/gnw9m7
DOI: 10.1021/acs.jcim.1c00600 · PMID: 34694798 || code
Molecular Optimization by Capturing Chemist’s Intuition Using Deep Neural Networks
November 2020, https://doi.org/gqgzp7
DOI: 10.21203/rs.3.rs-101137/v1 || code
GraphNVP: An Invertible Flow Model for Generating Molecular Graphs
Kaushalya Madhawa, Katsuhiko Ishiguro, Kosuke Nakago, Motoki Abe
arXiv, May 2019, https://arxiv.org/abs/1905.11600
Graph Residual Flow for Molecular Graph Generation
Shion Honda, Hirotaka Akita, Katsuhiko Ishiguro, Toshiki Nakanishi, Kenta Oono
arXiv, October 2019, https://arxiv.org/abs/1909.13521
Shape-Based Generative Modeling for de Novo Drug Design
Miha Skalic, José Jiménez, Davide Sabbadin, Gianni De Fabritiis
Journal of Chemical Information and Modeling, February 2019, https://doi.org/gfv7f3
DOI: 10.1021/acs.jcim.8b00706 · PMID: 30762364 || code
Deep Generative Models for 3D Linker Design
Fergus Imrie, Anthony R. Bradley, Mihaela van der Schaar, Charlotte M. Deane
Journal of Chemical Information and Modeling, March 2020, https://doi.org/gnfhsq
DOI: 10.1021/acs.jcim.9b01120 · PMID: 32195587 · PMCID: PMC7189367 || code
Geometry-Based Molecular Generation With Deep Constrained Variational Autoencoder
Chunyan Li, Junfeng Yao, Wei Wei, Zhangming Niu, Xiangxiang Zeng, Jin Li, Jianmin Wang
IEEE Transactions on Neural Networks and Learning Systems, 2022, https://doi.org/gpjb8f
DOI: 10.1109/tnnls.2022.3147790 · PMID: 35171779 || code
GeoDiff: a Geometric Diffusion Model for Molecular Conformation Generation
Minkai Xu, Lantao Yu, Yang Song, Chence Shi, Stefano Ermon, Jian Tang
arXiv, March 2022, https://arxiv.org/abs/2203.02923
Scaffold-constrained molecular generation
Maxime Langevin, Herve Minoux, Maximilien Levesque, Marc Bianciotto
arXiv, January 2021, https://arxiv.org/abs/2009.07778
DOI: 10.1021/acs.jcim.0c01015
If you want to learn more about protein design papers, I recommend visiting papers_for_protein_design_using_DL.
Machine-learning-guided directed evolution for protein engineering
Kevin K. Yang, Zachary Wu, Frances H. Arnold
Nature Methods, July 2019, https://doi.org/gf43h4
DOI: 10.1038/s41592-019-0496-6 · PMID: 31308553
Batched Stochastic Bayesian Optimization via Combinatorial Constraints Design
Kevin K. Yang, Yuxin Chen, Alycia Lee, Yisong Yue
arXiv, April 2019, https://arxiv.org/abs/1904.08102
Unified rational protein engineering with sequence-only deep representation learning
Ethan C. Alley, Grigory Khimulya, Surojit Biswas, Mohammed AlQuraishi, George M. Church
Cold Spring Harbor Laboratory, March 2019, https://doi.org/gf48g2
DOI: 10.1101/589333
Navigating the protein fitness landscape with Gaussian processes
Philip A. Romero, Andreas Krause, Frances H. Arnold
Proceedings of the National Academy of Sciences, December 2012, https://doi.org/f4k8bz
DOI: 10.1073/pnas.1215251110 · PMID: 23277561 · PMCID: PMC3549130
A comparison of single-cell trajectory inference methods
Wouter Saelens, Robrecht Cannoodt, Helena Todorov, Yvan Saeys
Nature Biotechnology, April 2019, https://doi.org/gfxsgd
DOI: 10.1038/s41587-019-0071-9 · PMID: 30936559
GitHub - agitter/single-cell-pseudotime: An overview of algorithms for estimating pseudotime in single-cell RNA-seq data
GitHub
https://github.com/agitter/single-cell-pseudotime
Network Inference with Granger Causality Ensembles on Single-Cell Transcriptomic Data
Atul Deshpande, Li-Fang Chu, Ron Stewart, Anthony Gitter
Cold Spring Harbor Laboratory, January 2019, https://doi.org/gft4bb
DOI: 10.1101/534834