Pr/dessygil/40 #47

Merged
merged 14 commits into from
May 3, 2023

maclandrol
Member

Checklist:

  • Was this PR discussed in an issue? It is recommended to first discuss a new feature and let the community know whether you are planning or have started working on it before opening a PR.
  • Add tests to cover the fixed bug(s) or the newly introduced feature(s) (if appropriate).
  • Update the API documentation if a new function is added or an existing one is deleted.
  • Added a news entry.
    • Copy news/TEMPLATE.rst to news/my-feature-or-branch.rst and edit it.

xref #40 pinging @dessygil

@maclandrol maclandrol mentioned this pull request May 3, 2023
@maclandrol
Member Author

Fix #43

from molfeat.trans.pretrained.hf_transformers import PretrainedHFTransformer
import datamol as dm

smiles = dm.freesolv()["smiles"]
transformer = PretrainedHFTransformer("ChemBERTa-77M-MLM", notation="smiles", precompute_cache=True)
output = PretrainedHFTransformer.batch_transform(transformer, smiles, batch_size=128, concatenate=False)
len(transformer.precompute_cache) # should be len(smiles). 

Pretrained models should now work better with batch_transform: it allows efficient parallelization while retaining all cached features. PrecomputedMolTransformer should now be preferred when you already have an existing cache or are using static featurizers.
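
To illustrate the intent, here is a toy sketch of batched featurization that keeps a shared cache populated across parallel workers. This is not molfeat's actual implementation: `toy_featurizer`, `featurize_batch`, and `batch_transform_with_cache` are hypothetical names, and threads stand in for whatever parallel backend is really used.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Sequence, Tuple

Features = List[float]


def toy_featurizer(smiles: str) -> Features:
    # Hypothetical stand-in for an expensive pretrained model.
    return [float(len(smiles)), float(smiles.count("C"))]


def featurize_batch(batch: Sequence[str]) -> Tuple[List[Features], Dict[str, Features]]:
    # Each worker fills its own local cache and returns it with the features,
    # so nothing computed in the worker is lost.
    local_cache: Dict[str, Features] = {}
    feats: List[Features] = []
    for smi in batch:
        if smi not in local_cache:
            local_cache[smi] = toy_featurizer(smi)
        feats.append(local_cache[smi])
    return feats, local_cache


def batch_transform_with_cache(
    inputs: Sequence[str], cache: Dict[str, Features], batch_size: int = 2
) -> List[Features]:
    # Split the inputs into fixed-size batches.
    batches = [inputs[i:i + batch_size] for i in range(0, len(inputs), batch_size)]
    outputs: List[Features] = []
    with ThreadPoolExecutor() as pool:
        # pool.map preserves batch order, so outputs line up with inputs.
        for feats, local_cache in pool.map(featurize_batch, batches):
            outputs.extend(feats)
            cache.update(local_cache)  # merge the worker cache into the shared one
    return outputs


smiles = ["CCO", "c1ccccc1", "CC(=O)O"]
cache: Dict[str, Features] = {}
features = batch_transform_with_cache(smiles, cache, batch_size=2)
```

For unique inputs, `len(cache)` equals `len(smiles)` after the call, which mirrors the check on `transformer.precompute_cache` in the snippet above.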

@maclandrol
Member Author

I have also documented the previously missing featurizers:

from molfeat.store import ModelStore
store = ModelStore()
_, m = store.load("mordred")
print(m.usage())

Output:

import datamol as dm
from molfeat.trans import MoleculeTransformer
smiles = dm.freesolv().iloc[:50].smiles
# sanitize and standardize your molecules if needed
transformer = MoleculeTransformer(featurizer='mordred', dtype=float)
features = transformer(smiles)

ping @cwognum

@maclandrol maclandrol requested a review from cwognum May 3, 2023 14:52
Contributor

@cwognum cwognum left a comment

LGTM!

Contributor

Not blocking, but any idea on how we could present this in a more aesthetically pleasing way? Maybe have them on the Molfeat website?

Member Author

Yes, I will ping @mercuryseries for that.

@maclandrol maclandrol merged commit b27d512 into main May 3, 2023
@maclandrol maclandrol deleted the pr/dessygil/40 branch May 14, 2023 15:53