Add uspto data from drfp #95
data/USPTO_500k/meta.yaml
Outdated
- https://bioportal.bioontology.org/ontologies/AFO?p=classes&conceptid=http%3A%2F%2Fpurl.allotrope.org%2Fontologies%2Fquality%23AFQ_0000227
- https://en.wikipedia.org/wiki/Yield_(chemistry)
I would use the "id" from the ontology table; I can show you an example when we discuss.
@kjappelbaum I'm not sure what you mean here?
Co-authored-by: Kevin M Jablonka <32935233+kjappelbaum@users.noreply.github.com>
for more information, see https://pre-commit.ci
Add benchmark field
link: https://tdcommons.ai/
split_column: split
identifiers:
- id: reaction_SMILES
there is a new entry for that
Yeah, great, I see it. I will edit the file and open the PR again.
Similar comments as for the other PRs :)
Co-authored-by: Kevin M Jablonka <32935233+kjappelbaum@users.noreply.github.com>
I will add the benchmark field for the TDC version of USPTO.
I split up the reaction, i.e.,
@MicPie, yes, I'd add the reaction SMILES, as this is our best hope for removing duplicates.
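To make the deduplication idea concrete, here is a minimal sketch in Python. It assumes reaction SMILES of the form `reactants>agents>products` and treats two entries as duplicates when they contain the same components per role, regardless of order. The function names are hypothetical, and no chemical canonicalization is done: two different SMILES spellings of the same molecule would still count as distinct, so a cheminformatics toolkit would be needed for a stricter check.

```python
def normalize_reaction_smiles(rxn):
    """Return an order-independent key for a reaction SMILES.

    Assumption: the string has exactly three '>'-separated roles
    (reactants, agents, products), each a '.'-separated list.
    This only sorts the component strings; it does NOT canonicalize them.
    """
    roles = rxn.strip().split(">")
    if len(roles) != 3:
        raise ValueError(f"expected 'reactants>agents>products', got: {rxn!r}")
    # Sort the components inside each role so 'A.B' and 'B.A' give the same key.
    return ">".join(
        ".".join(sorted(part for part in role.split(".") if part))
        for role in roles
    )


def drop_duplicate_reactions(reactions):
    """Keep the first occurrence of each reaction, compared by normalized key."""
    seen = set()
    unique = []
    for rxn in reactions:
        key = normalize_reaction_smiles(rxn)
        if key not in seen:
            seen.add(key)
            unique.append(rxn)
    return unique


reactions = [
    "CCO.CC(=O)O>>CC(=O)OCC",
    "CC(=O)O.CCO>>CC(=O)OCC",  # same reaction, reactants listed in another order
    "CCO>>CC=O",
]
unique_reactions = drop_duplicate_reactions(reactions)
```

With this input, the second entry is dropped because it differs from the first only in reactant order.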
As I'm coming from the bio side, wouldn't we need more info for a reaction SMILES, or is it always:
I'm not sure what the TDC yields data is. They are not very specific in their documentation: https://tdcommons.ai/single_pred_tasks/yields/#uspto On the other hand, I know the data from The
Currently, things seem to be a bit mixed up in this pull request.
You are right. There can be plenty of reactants, reagents, solvents, and catalysts leading to one or more products in a reaction SMILES.
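As an illustration of that structure, a reaction SMILES is commonly written as `reactants>agents>products`, with the components inside each role separated by `.`; reagents, solvents, and catalysts all go in the middle field. The sketch below splits such a string into its roles. The function name and the example reaction are assumptions for illustration, not part of this PR, and the plain `.` split ignores extensions that group fragments into a single component.

```python
def split_reaction_smiles(rxn):
    """Split 'reactants>agents>products' into lists of component SMILES.

    Assumption: plain reaction SMILES with exactly two '>' separators.
    The middle field holds reagents, solvents, and catalysts and may be empty.
    """
    parts = rxn.strip().split(">")
    if len(parts) != 3:
        raise ValueError(f"expected 'reactants>agents>products', got: {rxn!r}")
    # Each role is a '.'-separated list; drop empty strings from empty roles.
    reactants, agents, products = (
        [smiles for smiles in part.split(".") if smiles] for part in parts
    )
    return {"reactants": reactants, "agents": agents, "products": products}


# Hypothetical example: acid-catalysed esterification of acetic acid with ethanol.
example = split_reaction_smiles("CC(=O)O.OCC>[H+]>CC(=O)OCC.O")
# Agent-free form, written with '>>'.
no_agents = split_reaction_smiles("CCO>>CC=O")
```

Here `example` has two reactants, one agent, and two products, while `no_agents` has an empty agent list.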
Add uspto raw from drfp until I finish uspto from tdc