Add initial documentation (#45)
This PR adds initial static documentation built with Material for
MkDocs. The docs are built automatically by GitHub Actions and
deployed to microsoft.github.io/syntheseus/.

This first version combines content previously in `README.md`
(which is reduced here to a minimal overview that links to the docs
for further information) with the pre-existing
`docs/cli/eval_single_step.md`. Future PRs can extend the docs to
cover the search CLI and add an API reference and tutorials.
kmaziarz authored Dec 12, 2023
1 parent 4954c7f commit 5ed4907
Showing 8 changed files with 174 additions and 66 deletions.
31 changes: 31 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,31 @@
name: Docs

on:
  push:
    branches: [ main ]
  workflow_dispatch:

permissions:
  contents: write

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure Git Credentials
        run: |
          git config user.name github-actions[bot]
          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
      - uses: actions/setup-python@v4
        with:
          python-version: 3.x
      - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
      - uses: actions/cache@v3
        with:
          key: mkdocs-material-${{ env.cache_id }}
          path: .cache
          restore-keys: |
            mkdocs-material-
      - run: pip install mkdocs-material
      - run: mkdocs gh-deploy --force
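
The `cache_id` step in the workflow above uses `date --utc '+%V'`, which prints the ISO 8601 week number, so the cache key changes about once a week. A minimal Python sketch of the same key derivation follows; the helper name `cache_key` is hypothetical and is not part of the workflow or of `syntheseus`.

```python
from datetime import datetime, timezone


def cache_key(now: datetime, prefix: str = "mkdocs-material-") -> str:
    """Mirror `date --utc '+%V'`: zero-padded ISO 8601 week number in UTC."""
    week = now.astimezone(timezone.utc).isocalendar()[1]
    return f"{prefix}{week:02d}"


# Key that a workflow run would use right now.
print(cache_key(datetime.now(timezone.utc)))
```

Because `restore-keys` falls back to the bare `mkdocs-material-` prefix, a run in a new week can still restore the most recent older cache before saving under the new key.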
71 changes: 5 additions & 66 deletions README.md
@@ -1,5 +1,5 @@
<div align="center">
<img src="images/logo.png" height="50px">
<img src="docs/images/logo.png" height="50px">
<h3><i>Navigating the labyrinth of synthesis planning</i></h3>
</div>

@@ -17,23 +17,11 @@ Syntheseus is a package for end-to-end retrosynthetic planning.
- ⚙️ Exposes a simple API to plug in custom models and algorithms
- 📈 Can be used to benchmark components of a retrosynthesis pipeline

## Setup
To learn about `syntheseus`'s features and API visit [microsoft.github.io/syntheseus](https://microsoft.github.io/syntheseus).

We support two installation modes:
- *core installation* allows you to build and benchmark your own models or search algorithms
- *full installation* additionally allows you to perform end-to-end search using some of the supported models
## Quick Start

For core installation (minimal dependencies, no ML libraries) run

```bash
# Create and activate a new conda environment (or use your own).
conda env create -f environment.yml
conda activate syntheseus

pip install -e .
```

For full installation (including all supported models and dependencies for visualization/development) run
To install `syntheseus` with all the extras, run

```bash
conda env create -f environment_full.yml
@@ -42,56 +30,7 @@ conda activate syntheseus-full
pip install -e ".[all]"
```

Both sets of instructions above assume you already cloned the repository via

```bash
git clone https://github.com/microsoft/syntheseus.git
cd syntheseus
```

Note that `environment_full.yml` pins the CUDA version (to 11.3) for reproducibility.
If you want to use a different one, make sure to edit the environment file accordingly.

Additionally, we also support GLN, but that requires a specialized environment and is thus not installed via `pip`. See [here](syntheseus/reaction_prediction/environment_gln/) for a Docker environment necessary for running GLN.

### Reducing the number of dependencies

To keep the environment smaller, you can replace the `all` option with a comma-separated subset of `{chemformer,local-retro,megan,mhn-react,retro-knn,root-aligned,viz,dev}` (`viz` and `dev` correspond to visualization and development dependencies, respectively).
For example, `pip install -e ".[local-retro,root-aligned]"` installs only LocalRetro and RootAligned.
If installing a subset of models, you can also delete the lines in `environment_full.yml` marked with names of models you do not wish to use.

Syntheseus contains two subpackages: `reaction_prediction`, which deals with benchmarking single-step reaction models, and `search`, which can use any single-step model to perform multi-step search.
Each is designed to have minimal dependencies, allowing it to run in a wide range of environments.
While specific components (single-step models, policies, or value functions) can make use of Deep Learning libraries, the core of `syntheseus` does not depend on any.

If you only want to use either of the two subpackages, you can limit the dependencies further by installing the dependencies separately and then running

```bash
pip install -e . --no-dependencies
```

See `pyproject.toml` for a list of dependencies tied to each subpackage.

### Model checkpoints

See table below for links to model checkpoints trained on USPTO-50K alongside with information on how these checkpoints were obtained.
Note that all checkpoints were produced in a way that involved external model repositories, hence may be affected by the exact license each model was released with.
For more details about a particular model see the top of the corresponding model wrapper file in `reaction_prediction/inference/`.


| Model checkpoint link | Source |
|----------------------------------------------------------------|--------|
| [Chemformer](https://figshare.com/ndownloader/files/42009888) | finetuned by us starting from checkpoint released by authors |
| [GLN](https://figshare.com/ndownloader/files/42012720) | released by authors |
| [LocalRetro](https://figshare.com/ndownloader/files/42012729) | trained by us |
| [MEGAN](https://figshare.com/ndownloader/files/42012732) | trained by us |
| [MHNreact](https://figshare.com/ndownloader/files/42012777) | trained by us |
| [RetroKNN](https://figshare.com/ndownloader/files/42012786) | trained by us |
| [RootAligned](https://figshare.com/ndownloader/files/42012792) | released by authors |

In `reaction_prediction/cli/eval.py` a forward model can be used for computing back-translation (round-trip) accuracy.
See [here](https://figshare.com/ndownloader/files/42012708) for a Chemformer checkpoint finetuned for forward prediction on USPTO-50K. As for the backward direction, pretrained weights released by original authors were used as a starting point.

See [documentation](https://microsoft.github.io/syntheseus/installation) if you prefer a more lightweight installation that only includes the parts you actually need.

## Development

File renamed without changes
File renamed without changes
19 changes: 19 additions & 0 deletions docs/index.md
@@ -0,0 +1,19 @@
<figure markdown>
![Image title](images/logo.png){width="450"}
<figcaption><h3>Navigating the labyrinth of synthesis planning</h3></figcaption>
</figure>

---

[![CI](https://github.com/microsoft/syntheseus/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/microsoft/syntheseus/actions/workflows/ci.yml)
[![Python Version](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/)
[![code style](https://img.shields.io/badge/code%20style-black-202020.svg)](https://github.com/ambv/black)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](https://github.com/microsoft/syntheseus/blob/main/LICENSE)

Syntheseus is a package for end-to-end retrosynthetic planning.

- ⚒️ Combines search algorithms and reaction models in a standardized way
- 🧭 Includes implementations of common search algorithms
- 🧪 Includes wrappers for state-of-the-art reaction models
- ⚙️ Exposes a simple API to plug in custom models and algorithms
- 📈 Can be used to benchmark components of a retrosynthesis pipeline
54 changes: 54 additions & 0 deletions docs/installation.md
@@ -0,0 +1,54 @@
We support two installation modes:

- *core installation* allows you to build and benchmark your own models or search algorithms
- *full installation* also allows you to perform end-to-end search using the supported models

=== "Core installation"

    ```bash
    conda env create -f environment.yml
    conda activate syntheseus

    pip install -e .
    ```

=== "Full installation"

    ```bash
    conda env create -f environment_full.yml
    conda activate syntheseus-full

    pip install -e ".[all]"
    ```

Core installation includes only minimal dependencies (no ML libraries), while full installation includes all supported models and also dependencies for visualization/development.

Instructions above assume you already cloned the repository via

```bash
git clone https://github.com/microsoft/syntheseus.git
cd syntheseus
```

Note that `environment_full.yml` pins the CUDA version (to 11.3) for reproducibility.
If you want to use a different one, make sure to edit the environment file accordingly.

??? info "Setting up GLN"

    We also support GLN, but it requires a specialized environment and is thus not installed via `pip`.
    See [here](https://github.com/microsoft/syntheseus/blob/main/syntheseus/reaction_prediction/environment_gln/Dockerfile) for a Docker environment necessary for running GLN.

## Reducing the number of dependencies

To keep the environment smaller, you can replace the `all` option with a comma-separated subset of `{chemformer,local-retro,megan,mhn-react,retro-knn,root-aligned,viz,dev}` (`viz` and `dev` correspond to visualization and development dependencies, respectively).
For example, `pip install -e ".[local-retro,root-aligned]"` installs only LocalRetro and RootAligned.
If installing a subset of models, you can also delete the lines in `environment_full.yml` marked with names of models you do not wish to use.
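
As a rough illustration of the rule above, the following Python sketch builds the `pip` command for a chosen subset of extras and rejects names that are not in the list given in the text. The helper `pip_install_command` is hypothetical and is not part of `syntheseus`.

```python
# Extras named in the paragraph above.
KNOWN_EXTRAS = {
    "chemformer", "local-retro", "megan", "mhn-react",
    "retro-knn", "root-aligned", "viz", "dev",
}


def pip_install_command(extras):
    """Build an editable-install command for a subset of the known extras."""
    unknown = sorted(set(extras) - KNOWN_EXTRAS)
    if unknown:
        raise ValueError(f"Unknown extras: {unknown}")
    if not extras:
        # No extras requested: this is the core installation.
        return "pip install -e ."
    selection = ",".join(extras)
    return f'pip install -e ".[{selection}]"'


print(pip_install_command(["local-retro", "root-aligned"]))
```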

If you only want to use a very specific part of `syntheseus`, you could also install it without dependencies:

```bash
pip install -e . --no-dependencies
```

You would then need to manually install the subset of dependencies required for the functionality you want to use.
See `pyproject.toml` for a list of dependencies tied to the `search` and `reaction_prediction` subpackages.
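
After a `--no-dependencies` install, one way to see what still needs installing is to check which modules are importable. The sketch below uses only the standard library; the helper `missing_modules` and the module names in `needed` are illustrative assumptions, so consult `pyproject.toml` for the actual requirements.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be imported here."""
    return [name for name in module_names if find_spec(name) is None]


# Hypothetical example list, not taken from the real pyproject.toml.
needed = ["numpy", "rdkit"]
print("Missing:", missing_modules(needed))
```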
22 changes: 22 additions & 0 deletions docs/single_step.md
@@ -0,0 +1,22 @@
Syntheseus currently supports 7 established single-step models. For convenience, for each model we also include a checkpoint trained on USPTO-50K.

| Model checkpoint link | Source |
|----------------------------------------------------------------|--------|
| [Chemformer](https://figshare.com/ndownloader/files/42009888) | finetuned by us starting from checkpoint released by authors |
| [GLN](https://figshare.com/ndownloader/files/42012720) | released by authors |
| [LocalRetro](https://figshare.com/ndownloader/files/42012729) | trained by us |
| [MEGAN](https://figshare.com/ndownloader/files/42012732) | trained by us |
| [MHNreact](https://figshare.com/ndownloader/files/42012777) | trained by us |
| [RetroKNN](https://figshare.com/ndownloader/files/42012786) | trained by us |
| [RootAligned](https://figshare.com/ndownloader/files/42012792) | released by authors |

??? note "More advanced datasets"

    The USPTO-50K dataset is well-established but relatively small. Advanced users may prefer to retrain their models of interest on a larger dataset, such as USPTO-FULL or Pistachio. To do that, please follow the instructions in the original model repositories.

In `reaction_prediction/cli/eval.py` a forward model can be used for computing back-translation (round-trip) accuracy.
See [here](https://figshare.com/ndownloader/files/42012708) for a Chemformer checkpoint finetuned for forward prediction on USPTO-50K. As for the backward direction, pretrained weights released by original authors were used as a starting point.
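
To make the round-trip idea concrete, here is a minimal sketch, assuming the backward model returns a list of reactant strings per product and the forward model returns one predicted product per reactant string. It is not the implementation in `reaction_prediction/cli/eval.py`, which may differ in details such as top-k handling or SMILES canonicalization; the function name and the toy lookup tables are hypothetical.

```python
from typing import Callable, Iterable, List


def round_trip_accuracy(
    products: Iterable[str],
    backward: Callable[[str], List[str]],
    forward: Callable[[str], str],
) -> float:
    """Fraction of backward predictions that the forward model maps back to the product."""
    total = 0
    hits = 0
    for product in products:
        for reactants in backward(product):
            total += 1
            if forward(reactants) == product:
                hits += 1
    return hits / total if total else 0.0


# Toy stand-ins for real models (assumptions for illustration only).
backward_table = {"CCO": ["CC.O", "C.CO"]}
forward_table = {"CC.O": "CCO", "C.CO": "CO"}

accuracy = round_trip_accuracy(
    ["CCO"],
    backward=lambda product: backward_table[product],
    forward=lambda reactants: forward_table[reactants],
)
print(accuracy)
```

The comparison here is plain string equality; with real SMILES both sides would normally be canonicalized first.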

??? info "Licenses"
    All checkpoints were produced in a way that involved external model repositories, hence may be affected by the exact license each model was released with.
    For more details about a particular model see the top of the corresponding model wrapper file in `reaction_prediction/inference/`.
43 changes: 43 additions & 0 deletions mkdocs.yml
@@ -0,0 +1,43 @@
site_name: Syntheseus

repo_name: microsoft/syntheseus
repo_url: https://github.com/microsoft/syntheseus
edit_uri: edit/main/docs/

theme:
  name: material
  palette:
    - media: "(prefers-color-scheme: light)"
      scheme: default
      primary: black
      accent: red
      toggle:
        icon: material/toggle-switch
        name: "Switch to dark mode"
    - media: "(prefers-color-scheme: dark)"
      scheme: slate
      primary: black
      accent: red
      toggle:
        icon: material/toggle-switch-off-outline
        name: "Switch to light mode"
  features:
    - content.code.copy
    - navigation.tabs

nav:
  - Get Started:
    - Overview: index.md
    - Installation: installation.md
    - Single-Step Models: single_step.md
  - CLI:
    - Single-Step Evaluation: cli/eval_single_step.md

markdown_extensions:
  - admonition
  - attr_list
  - md_in_html
  - pymdownx.details
  - pymdownx.superfences
  - pymdownx.tabbed:
      alternate_style: true
