Releases
2.1.0 (2025-01-07)
Feat
prepare : outputs additional column signature_morgans
learning : add transformer code
dataset : add code to compute model tokens
dataset : add code to download and prepare datasets
transformer/train : additional arg for setting source / target max length
transformer/train : implement gradient accumulation
transformer/train : define num of data loader workers from args
transformer/train : make model compilation by Torch optional
transformer/train : generalize mixed precision scaler usage
transformer/model : refine the Module state_dict method
transformer/train : check for NaNs in loss
transformer/train : take the model output dir as an arg
transformer/train : experimentation with mixed precision floats
transformer/train : make use of pin_memory=True in dataloaders, which is expected to increase GPU performance (see the sketch after this list)
transformer/train : first working version
transformer : in-development code
add code to download and make use of the signature code (#10)
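Several of the transformer/train features above (gradient accumulation, the generalized mixed precision scaler, the NaN check on the loss, pinned-memory dataloaders) meet in the training loop. The following is a minimal sketch of how they can fit together, not the project's actual code: the model call, argument names (accum_steps, use_amp, etc.) and default values are assumptions.

```python
import math
import torch
from torch.utils.data import DataLoader

def train_one_epoch(model, dataset, optimizer, device,
                    batch_size=32, accum_steps=4, use_amp=True,
                    num_workers=4, pin_memory=True):
    # pin_memory=True speeds up host-to-GPU transfers when training on CUDA.
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True,
                        num_workers=num_workers, pin_memory=pin_memory)
    scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
    optimizer.zero_grad()

    for step, (src, tgt) in enumerate(loader):
        src = src.to(device, non_blocking=True)
        tgt = tgt.to(device, non_blocking=True)
        with torch.cuda.amp.autocast(enabled=use_amp):
            loss = model(src, tgt)     # assumes the model returns the loss
            loss = loss / accum_steps  # normalize once, only here

        # Check for NaN / Inf before backpropagating.
        if not math.isfinite(loss.item()):
            raise RuntimeError(f"non-finite loss at step {step}: {loss.item()}")

        scaler.scale(loss).backward()

        # Step every accum_steps batches, and also on the last (possibly
        # shorter) group so the gradients of the final batches are not lost.
        if (step + 1) % accum_steps == 0 or (step + 1) == len(loader):
            scaler.step(optimizer)
            scaler.update()
            optimizer.zero_grad()
```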
Fix
prepare : remove deprecated import
get_smiles : remove superfluous Hs
prepare : sanitize molecule after stereo-isomer enumeration
prepare : add missing header
update changelog on version bump
attempt to trigger GA
use main instead of master as the branch name
dataset : remove unused code
transformer/train : load_checkpoint
transformer/train : effective batch indexes
transformer/train : duplicated loss normalization
transformer/train : wrong arg name
transformer/train : take into account remaining batches for the scheduler counts
transformer/train : propagate gradient for last batches of epoch
transformer/train : remove multiple calls to unscale_
transformer/train : use save_checkpoint
transformer/train : refine save and load methods (see the sketch after this list)
transformer/train : correct seq length arg
transformer/train : stop sending to preset device
dataset/utils.py : pass the logger forward in recursive calls
tokenizer : allow additional depictions
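Several fixes above revolve around checkpointing (load_checkpoint, save_checkpoint, refined save and load methods). A minimal sketch of what such helpers can look like, assuming a hypothetical checkpoint layout and key names rather than the project's actual format:

```python
import torch

def save_checkpoint(path, model, optimizer, scheduler, scaler, epoch):
    # Hypothetical keys; the real checkpoint layout may differ.
    torch.save({
        "epoch": epoch,
        "model": model.state_dict(),
        "optimizer": optimizer.state_dict(),
        "scheduler": scheduler.state_dict(),
        "scaler": scaler.state_dict(),
    }, path)

def load_checkpoint(path, model, optimizer, scheduler, scaler, device):
    # map_location avoids sending tensors to a preset device at load time.
    ckpt = torch.load(path, map_location=device)
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    scheduler.load_state_dict(ckpt["scheduler"])
    scaler.load_state_dict(ckpt["scaler"])
    return ckpt["epoch"]
```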
Refactor
remove old code
.env : ignore local env file
erase old code
transformer : sweep code
dataset : clean deprecated code
transformer : remove deprecated code
transformer/train : refine gradient accumulation
transformer/config : reduce learning rate to prevent NaN / Inf values
transformer/train : make GPU pinned memory an option
transformer/train : add a few debug messages
transformer/config : update
transformer/config : update
transformer/train : get the number of epochs from config
transformer/train : better log NaN / Inf value issues
transformer/config : increase learning rate
transformer/config : increase learning rate
transformer/config : reduce learning rate
transformer/train : update default log level
transformer/train : better handle device arg
transformer/config.yaml : update training values
model : remove unnecessary code
dataset/utils.py : don't sort config keys (see the sketch after this list)
download : update paths
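Many of the refactor entries above tweak values in transformer/config.yaml (learning rate, number of epochs) and keep config keys in their original order when dumping. A small sketch of config round-tripping with PyYAML; the file path and key names are assumptions:

```python
import yaml

# Hypothetical path and keys, for illustration only.
with open("transformer/config.yaml") as fh:
    config = yaml.safe_load(fh)

num_epochs = config["training"]["epochs"]            # read from config, not hard-coded
learning_rate = config["training"]["learning_rate"]

# yaml.dump sorts keys alphabetically by default; sort_keys=False keeps
# the original key order when writing the config back.
with open("transformer/config.yaml", "w") as fh:
    yaml.dump(config, fh, sort_keys=False)
```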
Perf
transformer/train : AdamW optimizer instead of Adam, OneCycleLR scheduler (see the sketch below)
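A sketch of the optimizer/scheduler change: AdamW with weight decay plus a OneCycleLR schedule stepped once per optimizer update. The model and hyperparameter values below are placeholders, not the project's settings:

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 16)   # placeholder model
steps_per_epoch = 100       # optimizer updates per epoch, after gradient accumulation
epochs = 10

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-3, steps_per_epoch=steps_per_epoch, epochs=epochs,
)

# OneCycleLR is stepped after every optimizer update, so steps_per_epoch must
# count effective (accumulated) batches, including the last partial group of
# each epoch; otherwise the scheduler runs out of steps or never completes
# its cycle.
```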