Releases · brsynth/molecule-signature-paper

31 Jan 15:02

github-actions

3.0.0

13ff7ed

Release 3.0.0 (Latest)

3.0.0 (2025-01-31)

Feat

predict: predict call from CLI now always outputs something
2.enumeration_results: notebook to perform the molecule enumeration
1.enumeration_create_alphabets: notebook to create alphabets
notebooks: add notebook
configure: add interface for simple configuration
predict: add interface for predicting (without evaluation)
model: fully batch-vectorized version for beam search
predict: test for equality of canonical SMILES
predict: decode beam in parallel

Fix

predict: convert python objects to strings
predict: dataclass attribute poorly tested
predict: remove pickle writing
evaluate: add output file handling to write results
evaluate: correctly deal with the max number of rows
evaluate: include chirality in ECFP
predict: stop crashing when beam size > vocab size
predict: allow selection of the accelerator device
predict: column indexes
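
The beam-size fix above can be illustrated with a minimal, library-free sketch. It assumes the crash came from requesting more top-scoring candidates than the vocabulary holds, which tensor top-k routines typically reject; the release notes do not confirm the cause. The function name `top_k_tokens` and its arguments are hypothetical and are not taken from the repository.

```python
import heapq


def top_k_tokens(log_probs, beam_size):
    """Return up to `beam_size` (token_id, log_prob) pairs, best first.

    Hypothetical helper: the number of candidates is clamped to the
    vocabulary size so that an oversized beam cannot request more
    tokens than exist.
    """
    k = min(beam_size, len(log_probs))  # the guard this sketch illustrates
    return heapq.nlargest(k, enumerate(log_probs), key=lambda pair: pair[1])


# A 3-token vocabulary queried with a beam of 5 yields 3 candidates.
candidates = top_k_tokens([-0.1, -2.0, -1.0], beam_size=5)
```

With the clamp in place, an oversized beam degrades to "keep every token" instead of failing.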

Refactor

predict: update default arg values
predict: update result refinements
configure: add default output path to None
config: remove unused method
imports: refine
add empty data folder
notebooks: merge cells
notebooks: rename nb
notebooks: rename notebook for fig 2
utils: move utilities functions
evaluate: remove file
predict: make col names more explicit
evaluate: sweep code
utils: sweep code
predict: refine outputs
predict: remove unused args
utils: additional shared functions
predict: delegate results refining to subsequent code
predict: allow calls from other script
predict: better print
predict: improve imports
predict: print result to stdout on request
predict: print default values

07 Jan 12:30

github-actions

2.1.0

2acbbcf

Release 2.1.0

2.1.0 (2025-01-07)

Feat

prepare: outputs additional column signature_morgans
learning: add transformer code
dataset: add code to compute model tokens
dataset: add code for download and prepare datasets
transformer/train: additional arg for setting source / target max length
transformer/train: implement gradient accumulation
transformer/train: define num of data loader workers from args
transformer/train: make model compilation by Torch optional
transformer/train: generalize mixed precision scaler usage
transformer/model: refine state_dict Module's method
transformer/train: check for NaNs in loss
transformer/train: model dir output as arg
transformer/train: experimentation with mixed precision floats
transformer/train: use pin_memory=True in dataloaders, which is expected to improve GPU performance
transformer/train: first working version
transformer: in dev code
new code to download and make use of the signature code (#10)

Fix

prepare: remove deprecated import
get_smiles: remove superfluous Hs
prepare: sanitize molecule after stereo-isomer enumeration
prepare: add missing header
update changelog on version bump
attempt to trigger GA
main instead of master branch name
dataset: remove unused code
transformer/train: load_checkpoint
transformer/train: effective batch indexes
transformer/train: duplicated loss normalization
transformer/train: wrong arg name
transformer/train: take into account remaining batches for the scheduler counts
transformer/train: propagate gradient for last batches of epoch
transformer/train: remove multiple calls to unscale_
transformer/train: use save_checkpoint
transformer/train: refine save and load methods
transformer/train: correct seq length arg
transformer/train: stop sending to preset device
dataset/utils.py: forward pass logger in recursive calls
tokenizer: allow additional depictions
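
Several of the transformer/train fixes above concern gradient accumulation: stepping on the last, possibly incomplete, group of batches and counting scheduler steps accordingly. The following is a minimal sketch of that bookkeeping, not the repository's training loop; `accumulation_plan` and `scheduler_steps` are hypothetical names, and the framework calls (backward pass, optimizer step) are left out on purpose.

```python
import math


def accumulation_plan(num_batches, accum_steps):
    """Return (step_after, group_sizes) for one epoch.

    step_after  : batch indexes after which the optimizer would step
    group_sizes : number of batches accumulated before each step
    The final group may be smaller than `accum_steps`; it still gets a
    step so that its gradients are not dropped.
    """
    if num_batches < 1 or accum_steps < 1:
        raise ValueError("num_batches and accum_steps must be >= 1")
    step_after, group_sizes = [], []
    pending = 0
    for batch_idx in range(num_batches):
        pending += 1
        is_last_batch = batch_idx == num_batches - 1
        if pending == accum_steps or is_last_batch:
            step_after.append(batch_idx)
            group_sizes.append(pending)
            pending = 0
    return step_after, group_sizes


def scheduler_steps(num_batches, accum_steps, epochs):
    """Total optimizer/scheduler steps, counting the partial last group."""
    return math.ceil(num_batches / accum_steps) * epochs


steps, sizes = accumulation_plan(num_batches=10, accum_steps=4)
```

Under these assumptions, each group's loss would be divided once by its own size from `group_sizes`, which avoids normalizing twice or over-normalizing the short final group.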

Refactor

remove old code
.env: ignore local env file
erase old code
transformer: sweep code
dataset: clean deprecated code
transformer: remove deprecated code
transformer/train: refine gradient accumulation
transformer/config: reduce learning rate to prevent NaN / Inf values
transformer/train: make GPU pinned memory an option
transformer/train: add a few debug messages
transformer/config: update
transformer/config: update
transformer/train: get the number of epochs from config
transformer/train: better log NaN / Inf value issues
transformer/config: increase learning rate
transformer/config: increase learning rate
transformer/config: reduce learning rate
transformer/train: update default log level
transformer/train: better handle device arg
transformer/config.yaml: update training values
model: remove unnecessary code
dataset/utils.py: don't sort config keys
download: update paths

Perf

transformer/train: AdamW optimizer instead of Adam, OneCycleLR scheduler
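
To show the general shape of a one-cycle learning-rate schedule, here is a simplified, framework-free sketch: a cosine warm-up to a peak followed by a cosine decay. It is not PyTorch's `OneCycleLR` implementation and is not taken from the repository. The function name `one_cycle_lr` is hypothetical, and the default values for `pct_start`, `div_factor` and `final_div_factor` are illustrative assumptions.

```python
import math


def _cosine(start, end, progress):
    """Cosine interpolation from `start` (progress=0) to `end` (progress=1)."""
    return end + (start - end) / 2.0 * (1.0 + math.cos(math.pi * progress))


def one_cycle_lr(step, total_steps, max_lr,
                 pct_start=0.3, div_factor=25.0, final_div_factor=1e4):
    """Learning rate at 0-based `step` of a simplified one-cycle schedule."""
    if total_steps < 2 or not 0 <= step < total_steps:
        raise ValueError("need total_steps >= 2 and 0 <= step < total_steps")
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    # Number of warm-up steps, kept within [1, total_steps - 1].
    warmup_steps = min(max(1, round(pct_start * total_steps)), total_steps - 1)
    if step < warmup_steps:
        return _cosine(initial_lr, max_lr, step / warmup_steps)
    decay_steps = total_steps - warmup_steps
    progress = (step - warmup_steps) / max(1, decay_steps - 1)
    return _cosine(max_lr, min_lr, progress)


schedule = [one_cycle_lr(s, total_steps=100, max_lr=1e-3) for s in range(100)]
```

The rate starts at `max_lr / div_factor`, peaks at `max_lr` once warm-up ends, and finishes at the much smaller `min_lr`.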

30 Oct 15:07

github-actions

1.1.0

8a75508

Release 1.1.0

1.1.0 (2023-10-30)

Features

download_metanetx: generate sig alphabet with nbit and neighbors (8b749d6)
library: update to RevSig1.5 (8de9a0d)
paper: construct alphabet for sig-nbit (866437d)
paper: download, add emolecules (093fcfe)
paper: download, add FP count and extract test_small (9987ee8)
paper: download, enable formalCharge in sanitize (d4c66e3)
paper: enable sig-nbit (4f2c125)
paper: img, add (9469384)
paper: img, add degenerescence (d2e6730)
paper: tokenizer, use ECFP4_COUNT (9e31b56)
tokenize: write SIG-NEIGH-NBIT datasets (f84dc93)
tokenizer: increase script verbosity (0bfe1ab)
tokenizer: new arguments to select tokenizer model, depic to treat and pairs to build (19d03e3)
tokenizer: produce SIG-NEIGH-NBIT datasets (daf0454)
tokenizer: refactor and enable unigram model type (45400fe)
tokenizer: use all tokens available and support unigram model (de449e2)

Bug Fixes

download_metanetx: fix paths (0c44e9b)
download_metanetx: fix paths (1febe59)
paper: dataset, ecfp4 duplicate index number according to the count (d7abd1e)
paper: tokenizer, use the right function (73c9f21)
signature: use ECFP instead of FCFP (97972fb)
tokenizer: fix regular expression (7e9bcdc)
tokenizer: spelling in AROMATIC bond regex (624d678)
tokenizer: stop omitting bounds in regex (0398217)
tokenizer: stop splitting SIG bond tokens (0f004cd)
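
One reading of the ECFP4 count fix above (d7abd1e) is that each fingerprint bit index is repeated as many times as its count when the fingerprint is written out as a token sequence. The sketch below illustrates that interpretation only. The function name `expand_counts`, the dict input format, and the ascending index order are assumptions and are not taken from the repository.

```python
def expand_counts(counts):
    """Expand a count fingerprint {bit_index: count} into a flat index list.

    Hypothetical helper: each index appears `count` times, in ascending
    index order (the ordering is an assumption of this sketch).
    """
    tokens = []
    for index in sorted(counts):
        tokens.extend([index] * counts[index])
    return tokens


# Bit 12 seen twice and bit 7 once -> [7, 12, 12]
flat = expand_counts({12: 2, 7: 1})
```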

Code Refactoring

download_metanetx: print settings (3921c6c)
download_metanetx: progress bar and more logs (14fe1fe)
download_metanetx: store file paths in args.dict (a898d9c)

Build Systems

.gitignore: update (c20539b)
.gitignore: update (082bf63)

Documentation

readme: update (b4b76ab)
README: update (155ecc5)
README: update (d9908bc)
README: update (1991a29)

Styles

tokenizer: black file (c33f3b4)
indent comments (782673d)
download_metanetx: add comments (095692a)
download_metanetx: more explicit argparse help (15aa802)
tokenizer: fix flake8 warnings (9c1c105)

09 Aug 09:55

github-actions

1.0.0

6d8c299

Release 1.0.0

1.0.0 (2023-08-09)

⚠ BREAKING CHANGES

tokenizer:

Features

download: introduce default output dir (09765ec)
library: update with "RevSig1.2" (1608cd7)
paper: add tokenizer signature (545cfa6)
retrosig: add utils/cmd.py file (8f398db)
tokenizer: add sentencepiece tokenizer (3ee9f5c)
tokenizer: build vocabularies and dataset pairs (f4ae35d)
tokenizer: only output on-bits in ECFP4 (62f9dca)

Bug Fixes

download: create output dir if it does not exist (6679734)
download: fix argparse crash due to percent sign in help (#6) (2db597e)
download: prevent removing raw mnx file (f3c2d85)
download: put back right path for rdkit method (#7) (1402a45)
download: shuffle data only once (f51742f)
tokenizer: fingerprints name in upper case to match expectation (256021a)

Build Systems

add tox file (e12beeb)

Code Refactoring

download: change default value of test and valid datasets (6ceb79a)
download: disable shuffling before sanitizing (0934919)
download: pointing out unexpected filtered smiles (6c99f96)
download: update output name for the signature alphabet file (a2ef08c)
sweep imports (0a49399)
download: simplify args usage (90bf3f5)
tokenizer: change file pairs extension (576139c)

Styles

download: rename variables (bb88d89)
download: sweep imports (de34f44)
download: sweep imports (c6313dd)
download: update helps of arguments (82a5afb)
blacked files (856f0fc)

Documentation

download: make fingerprint size explicit (024281f)
README: update (4a31529)
README: update (fe73fc0)
README: update install instructions (389bfc4)
