To install the full environment, which includes many unnecessary packages (it is the output of conda env export --no-build --name retrogfn), run:
conda env create -f environment_full.yaml
To install the minimal set of packages extracted with pipreqs, run:
conda create -n retrogfn python=3.11.4 pip
conda activate retrogfn
pip install -r requirements.txt
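After the minimal install, it can be useful to confirm that every pinned package actually landed in the active environment. The sketch below is not part of the repository; the helper name missing_requirements is hypothetical, and it assumes requirements.txt contains simple name==version lines as pipreqs typically writes them. It relies only on the standard library.

```python
import re
from importlib import metadata
from pathlib import Path


def missing_requirements(path):
    """Return the names listed in a requirements file that are not installed.

    Hypothetical helper: it only understands simple 'name==version' style
    lines and skips comments, blank lines, and pip options such as '-r'.
    """
    missing = []
    for raw in Path(path).read_text().splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue
        # Keep only the distribution name, before any version specifier.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        if not name:
            continue
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Example usage, run from the repository root with retrogfn activated:
#     print(missing_requirements("requirements.txt"))
```

An empty list means every listed distribution was found. A name in the list means pip did not install it into the environment the script runs in.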
Checkpoints can be downloaded from here and should be placed in the checkpoints directory.
See notebooks/example.ipynb.
To prepare the training datasets, run the following notebooks under notebooks/created_dataset:
create_positive.ipynb. It removes the atom mapping from the raw USPTO dataset. We call this dataset "positive".
extract_forward_templates.ipynb. It extracts the forward templates from the USPTO dataset.
create_negative_forward.ipynb. It creates negative reactions by applying the forward templates to reactants from the positive dataset.
create_negative_shuffle.ipynb. It creates negative reactions by shuffling the reactants from the positive dataset: each product from the positive dataset is assigned the reactants of a similar reaction (in terms of Tanimoto distance).
merge_files.ipynb. It merges the positive and negative datasets into a single one.
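To make the positive and shuffled-negative steps more concrete, here is a minimal, dependency-free sketch. It is not the code from the notebooks. All function names are hypothetical, atom mapping is stripped with a regular expression rather than a cheminformatics toolkit, and character bigrams of the product SMILES stand in for real molecular fingerprints. Picking the single most similar other reaction, and comparing products rather than whole reactions, are also assumptions made for illustration.

```python
import re


def strip_atom_mapping(smiles):
    """Remove ':<n>' atom-map labels from bracket atoms, e.g. '[CH3:1]' -> '[CH3]'."""
    return re.sub(r":\d+\]", "]", smiles)


def tanimoto(a, b):
    """Tanimoto similarity of two feature sets: |a & b| / |a | b|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def char_bigrams(smiles):
    """Toy stand-in for a molecular fingerprint (assumption, not the repo's method)."""
    return {smiles[i:i + 2] for i in range(len(smiles) - 1)}


def shuffle_negatives(reactions):
    """Pair each product with the reactants of its most similar other reaction.

    reactions: list of (reactants, product) SMILES pairs from the positive set.
    Returns a list of (reactants, product) pairs treated as negatives.
    """
    fingerprints = [char_bigrams(product) for _, product in reactions]
    negatives = []
    for i, (reactants_i, product_i) in enumerate(reactions):
        best_j, best_sim = None, -1.0
        for j, (reactants_j, _) in enumerate(reactions):
            # Skip the reaction itself and any reaction with identical reactants.
            if j == i or reactants_j == reactants_i:
                continue
            sim = tanimoto(fingerprints[i], fingerprints[j])
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_j is not None:
            negatives.append((reactions[best_j][0], product_i))
    return negatives


positive = [
    ("CCBr.O", "CCO"),
    ("CCCBr.O", "CCCO"),
    ("c1ccccc1Br.N", "c1ccccc1N"),
]
negatives = shuffle_negatives(positive)
```

Borrowing reactants from a similar reaction, rather than a random one, yields negatives that look plausible but are still wrong, which is presumably why the shuffle is guided by similarity.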