feat(docs): Add initial documentation
kmaziarz committed Dec 11, 2023
1 parent 3635671 commit 565a999
Showing 6 changed files with 173 additions and 65 deletions.
31 changes: 31 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,31 @@
name: Docs

on:
  push:
    branches: [ kmaziarz/docs ]
  workflow_dispatch:

permissions:
  contents: write

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure Git Credentials
        run: |
          git config user.name github-actions[bot]
          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
      - uses: actions/setup-python@v4
        with:
          python-version: 3.x
      - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
      - uses: actions/cache@v3
        with:
          key: mkdocs-material-${{ env.cache_id }}
          path: .cache
          restore-keys: |
            mkdocs-material-
      - run: pip install mkdocs-material
      - run: mkdocs gh-deploy --force
69 changes: 4 additions & 65 deletions README.md
@@ -17,23 +17,11 @@ Syntheseus is a package for end-to-end retrosynthetic planning.
 - ⚙️ Exposes a simple API to plug in custom models and algorithms
 - 📈 Can be used to benchmark components of a retrosynthesis pipeline

-## Setup
+To learn about `syntheseus`'s features and API visit [microsoft.github.io/syntheseus](https://microsoft.github.io/syntheseus).

-We support two installation modes:
-- *core installation* allows you to build and benchmark your own models or search algorithms
-- *full installation* additionally allows you to perform end-to-end search using some of the supported models
+## Quick Start

-For core installation (minimal dependencies, no ML libraries) run
-
-```bash
-# Create and activate a new conda environment (or use your own).
-conda env create -f environment.yml
-conda activate syntheseus
-
-pip install -e .
-```
-
-For full installation (including all supported models and dependencies for visualization/development) run
+To install `syntheseus` with all the extras, run

 ```bash
 conda env create -f environment_full.yml
@@ -42,56 +30,7 @@ conda activate syntheseus-full
 pip install -e ".[all]"
 ```

-Both sets of instructions above assume you already cloned the repository via
-
-```bash
-git clone https://github.com/microsoft/syntheseus.git
-cd syntheseus
-```
-
-Note that `environment_full.yml` pins the CUDA version (to 11.3) for reproducibility.
-If you want to use a different one, make sure to edit the environment file accordingly.
-
-Additionally, we also support GLN, but that requires a specialized environment and is thus not installed via `pip`. See [here](syntheseus/reaction_prediction/environment_gln/) for a Docker environment necessary for running GLN.
-
-### Reducing the number of dependencies
-
-To keep the environment smaller, you can replace the `all` option with a comma-separated subset of `{chemformer,local-retro,megan,mhn-react,retro-knn,root-aligned,viz,dev}` (`viz` and `dev` correspond to visualization and development dependencies, respectively).
-For example, `pip install -e ".[local-retro,root-aligned]"` installs only LocalRetro and RootAligned.
-If installing a subset of models, you can also delete the lines in `environment_full.yml` marked with names of models you do not wish to use.
-
-Syntheseus contains two subpackages: `reaction_prediction`, which deals with benchmarking single-step reaction models, and `search`, which can use any single-step model to perform multi-step search.
-Each is designed to have minimal dependencies, allowing it to run in a wide range of environments.
-While specific components (single-step models, policies, or value functions) can make use of Deep Learning libraries, the core of `syntheseus` does not depend on any.
-
-If you only want to use either of the two subpackages, you can limit the dependencies further by installing the dependencies separately and then running
-
-```bash
-pip install -e . --no-dependencies
-```
-
-See `pyproject.toml` for a list of dependencies tied to each subpackage.
-
-### Model checkpoints
-
-See table below for links to model checkpoints trained on USPTO-50K alongside with information on how these checkpoints were obtained.
-Note that all checkpoints were produced in a way that involved external model repositories, hence may be affected by the exact license each model was released with.
-For more details about a particular model see the top of the corresponding model wrapper file in `reaction_prediction/inference/`.
-
-
-| Model checkpoint link | Source |
-|----------------------------------------------------------------|--------|
-| [Chemformer](https://figshare.com/ndownloader/files/42009888) | finetuned by us starting from checkpoint released by authors |
-| [GLN](https://figshare.com/ndownloader/files/42012720) | released by authors |
-| [LocalRetro](https://figshare.com/ndownloader/files/42012729) | trained by us |
-| [MEGAN](https://figshare.com/ndownloader/files/42012732) | trained by us |
-| [MHNreact](https://figshare.com/ndownloader/files/42012777) | trained by us |
-| [RetroKNN](https://figshare.com/ndownloader/files/42012786) | trained by us |
-| [RootAligned](https://figshare.com/ndownloader/files/42012792) | released by authors |
-
-In `reaction_prediction/cli/eval.py` a forward model can be used for computing back-translation (round-trip) accuracy.
-See [here](https://figshare.com/ndownloader/files/42012708) for a Chemformer checkpoint finetuned for forward prediction on USPTO-50K. As for the backward direction, pretrained weights released by original authors were used as a starting point.
-
+See [documentation](https://microsoft.github.io/syntheseus/installation) if you prefer a more lightweight installation that only includes the parts you actually need.

 ## Development

19 changes: 19 additions & 0 deletions docs/index.md
@@ -0,0 +1,19 @@
<figure markdown>
![Image title](images/logo.png){width="450"}
<figcaption><h3>Navigating the labyrinth of synthesis planning</h3></figcaption>
</figure>

---

[![CI](https://github.com/microsoft/syntheseus/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/microsoft/syntheseus/actions/workflows/ci.yml)
[![Python Version](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/)
[![code style](https://img.shields.io/badge/code%20style-black-202020.svg)](https://github.com/ambv/black)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](https://github.com/microsoft/syntheseus/blob/main/LICENSE)

Syntheseus is a package for end-to-end retrosynthetic planning.

- ⚒️ Combines search algorithms and reaction models in a standardized way
- 🧭 Includes implementations of common search algorithms
- 🧪 Includes wrappers for state-of-the-art reaction models
- ⚙️ Exposes a simple API to plug in custom models and algorithms
- 📈 Can be used to benchmark components of a retrosynthesis pipeline
54 changes: 54 additions & 0 deletions docs/installation.md
@@ -0,0 +1,54 @@
We support two installation modes:

- *core installation* allows you to build and benchmark your own models or search algorithms
- *full installation* also allows you to perform end-to-end search using the supported models

=== "Core installation"

    ```bash
    conda env create -f environment.yml
    conda activate syntheseus

    pip install -e .
    ```

=== "Full installation"

    ```bash
    conda env create -f environment_full.yml
    conda activate syntheseus-full

    pip install -e ".[all]"
    ```

Core installation includes only minimal dependencies (no ML libraries), while full installation includes all supported models and also dependencies for visualization/development.

Both sets of instructions above assume you already cloned the repository via

```bash
git clone https://github.com/microsoft/syntheseus.git
cd syntheseus
```

Note that `environment_full.yml` pins the CUDA version (to 11.3) for reproducibility.
If you want to use a different one, make sure to edit the environment file accordingly.
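
Before editing the file, it can help to locate every line that mentions CUDA. The following is a minimal sketch, assuming the pin appears on lines containing the string `cuda` (the exact package names used in `environment_full.yml` are not shown here); `find_cuda_pins` is a hypothetical helper name, not part of `syntheseus`.

```bash
# Hypothetical helper: print every line (with its line number) of an
# environment file that mentions CUDA, matching case-insensitively.
find_cuda_pins() {
    grep -n -i 'cuda' "$1"
}

# Usage (run from the repository root):
#   find_cuda_pins environment_full.yml
```

Each reported line can then be edited by hand to the desired version before creating the environment.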

??? info "Setting up GLN"

    We also support GLN, but it requires a specialized environment and is thus not installed via `pip`.
    See [here](https://github.com/microsoft/syntheseus/blob/main/syntheseus/reaction_prediction/environment_gln/Dockerfile) for a Docker environment necessary for running GLN.

## Reducing the number of dependencies

To keep the environment smaller, you can replace the `all` option with a comma-separated subset of `{chemformer,local-retro,megan,mhn-react,retro-knn,root-aligned,viz,dev}` (`viz` and `dev` correspond to visualization and development dependencies, respectively).
For example, `pip install -e ".[local-retro,root-aligned]"` installs only LocalRetro and RootAligned.
If installing a subset of models, you can also delete the lines in `environment_full.yml` marked with names of models you do not wish to use.
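
In a setup script, the comma-separated extras string can also be assembled from a list of names. The following is a minimal sketch; `build_extras_spec` is a hypothetical helper (not part of `syntheseus`), and its list of valid names is copied from the set above.

```bash
# Hypothetical helper: join the given extras with commas, rejecting any
# name that is not in the set of extras listed above.
build_extras_spec() {
    spec=""
    for extra in "$@"; do
        case "$extra" in
            chemformer|local-retro|megan|mhn-react|retro-knn|root-aligned|viz|dev)
                # Prepend a comma only when spec is already non-empty.
                spec="${spec:+$spec,}$extra" ;;
            *)
                echo "unknown extra: $extra" >&2
                return 1 ;;
        esac
    done
    printf '%s\n' "$spec"
}

# Usage (run from the repository root):
#   pip install -e ".[$(build_extras_spec local-retro root-aligned)]"
```

Validating the names up front turns a misspelled extra into an explicit error message instead of leaving it to `pip` to report.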

If you only want to use a very specific part of `syntheseus`, you could also install it without dependencies:

```bash
pip install -e . --no-dependencies
```

You would then need to manually install the subset of dependencies required for the functionality you want to use.
See `pyproject.toml` for a list of dependencies tied to the `search` and `reaction_prediction` subpackages.
22 changes: 22 additions & 0 deletions docs/single_step.md
@@ -0,0 +1,22 @@
Syntheseus currently supports 7 established single-step models. For convenience, for each model we also include a checkpoint trained on USPTO-50K.

| Model checkpoint link | Source |
|----------------------------------------------------------------|--------|
| [Chemformer](https://figshare.com/ndownloader/files/42009888) | finetuned by us starting from checkpoint released by authors |
| [GLN](https://figshare.com/ndownloader/files/42012720) | released by authors |
| [LocalRetro](https://figshare.com/ndownloader/files/42012729) | trained by us |
| [MEGAN](https://figshare.com/ndownloader/files/42012732) | trained by us |
| [MHNreact](https://figshare.com/ndownloader/files/42012777) | trained by us |
| [RetroKNN](https://figshare.com/ndownloader/files/42012786) | trained by us |
| [RootAligned](https://figshare.com/ndownloader/files/42012792) | released by authors |
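
For scripted downloads, the links in the table can be wrapped in a small lookup. The following is a minimal sketch; `checkpoint_url` is a hypothetical helper (not part of `syntheseus`), and the file IDs are copied from the table above.

```bash
# Hypothetical helper: map a model name from the table above to its
# checkpoint download URL.
checkpoint_url() {
    case "$1" in
        Chemformer)  id=42009888 ;;
        GLN)         id=42012720 ;;
        LocalRetro)  id=42012729 ;;
        MEGAN)       id=42012732 ;;
        MHNreact)    id=42012777 ;;
        RetroKNN)    id=42012786 ;;
        RootAligned) id=42012792 ;;
        *) echo "unknown model: $1" >&2; return 1 ;;
    esac
    printf 'https://figshare.com/ndownloader/files/%s\n' "$id"
}

# Usage (the output file name is an arbitrary choice; -L follows any redirects):
#   curl -L -o megan_checkpoint "$(checkpoint_url MEGAN)"
```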

??? note "More advanced datasets"

    The USPTO-50K dataset is well-established but relatively small. Advanced users may prefer to retrain their models of interest on a larger dataset, such as USPTO-FULL or Pistachio. To do that, please follow the instructions in the original model repositories.

In `reaction_prediction/cli/eval.py` a forward model can be used for computing back-translation (round-trip) accuracy.
See [here](https://figshare.com/ndownloader/files/42012708) for a Chemformer checkpoint finetuned for forward prediction on USPTO-50K. As with the backward direction, pretrained weights released by the original authors were used as a starting point.

??? info "Licenses"

    All checkpoints were produced in a way that involved external model repositories, hence may be affected by the exact license each model was released with.
    For more details about a particular model see the top of the corresponding model wrapper file in `reaction_prediction/inference/`.
43 changes: 43 additions & 0 deletions mkdocs.yml
@@ -0,0 +1,43 @@
site_name: Syntheseus

repo_name: microsoft/syntheseus
repo_url: https://github.com/microsoft/syntheseus
edit_uri: edit/main/docs/

theme:
  name: material
  palette:
    - media: "(prefers-color-scheme: light)"
      scheme: default
      primary: black
      accent: red
      toggle:
        icon: material/toggle-switch
        name: "Switch to dark mode"
    - media: "(prefers-color-scheme: dark)"
      scheme: slate
      primary: black
      accent: red
      toggle:
        icon: material/toggle-switch-off-outline
        name: "Switch to light mode"
  features:
    - content.code.copy
    - navigation.tabs

nav:
  - Get Started:
    - Overview: index.md
    - Installation: installation.md
    - Single-Step Models: single_step.md
  - CLI:
    - Single-Step Evaluation: cli/eval_single_step.md

markdown_extensions:
  - admonition
  - attr_list
  - md_in_html
  - pymdownx.details
  - pymdownx.superfences
  - pymdownx.tabbed:
      alternate_style: true
