Issues with BioThings BindingDB object fields #717
Comments
object.pubchem_cid
Example: https://biothings.ncats.io/bindingdb/query?q=object.pubchem_cid:4585
Other examples with basically the same behavior (notes here):
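For anyone who wants to reproduce these lookups outside the browser, here is a minimal Python sketch that builds the same kind of fielded query URL as the example above. The helper names (`build_query_url`, `fetch`) are hypothetical, and the shape of the JSON response is not assumed here, so `fetch` simply returns whatever the endpoint sends back.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the example URL in this comment.
BASE_URL = "https://biothings.ncats.io/bindingdb/query"


def build_query_url(field: str, value: str) -> str:
    """Build a fielded query URL, e.g. ...?q=object.pubchem_cid:4585."""
    # Keep ":" unescaped so the URL matches the form used in this issue.
    return f"{BASE_URL}?{urlencode({'q': f'{field}:{value}'}, safe=':')}"


def fetch(field: str, value: str, timeout: float = 30.0) -> dict:
    """Fetch and parse the JSON response (hypothetical helper, needs network)."""
    with urlopen(build_query_url(field, value), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only prints the URL; call fetch(...) explicitly to hit the API.
    print(build_query_url("object.pubchem_cid", "4585"))
```

The same helper should work for the other object fields discussed in this issue by swapping the field name.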
object.inchikey
example 1
example 2
object.chembl
Sometimes the values seem to be concatenated strings of multiple IDs:
Examples:
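One possible way to flag such values, as a rough sketch: it assumes the IDs follow the usual `CHEMBL` prefix plus digits pattern and are joined with no separator or with arbitrary separators. The input string in the usage lines is made up for illustration and is not taken from the BindingDB data, and the function names are hypothetical.

```python
import re

# Assumed ID shape: the literal prefix "CHEMBL" followed by digits.
CHEMBL_ID = re.compile(r"CHEMBL\d+")


def split_chembl_ids(value: str) -> list[str]:
    """Return every CHEMBL-style ID found in the value, in order."""
    return CHEMBL_ID.findall(value)


def looks_concatenated(value: str) -> bool:
    """True when a single field value contains more than one ID."""
    return len(split_chembl_ids(value)) > 1


if __name__ == "__main__":
    # Invented example value, for illustration only.
    sample = "CHEMBL25CHEMBL112"
    print(split_chembl_ids(sample), looks_concatenated(sample))
```

A check like this could be run over `object.chembl` values to estimate how many documents are affected before deciding how the parser should handle them.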
object.name
For examples, click on any of the BioThings BindingDB links above, and look at the object.name value.
And this is what I did to address Andy Crouse's original problem of "Nodes with no names from BioThings BindingDB": (pasted from biothings/pending.api#99 (comment))
@everaldorodrigo @newgene @andrewsu I think this would be a useful issue for @everaldorodrigo to dig into and address as much as possible.
@colleenXu, considering the last released data from May 2024 to the CI environment, For those cases missing the The partial data below is extracted from the data source. There are two lines: the header and an example of an item missing the field
Do you think we should use another field for the operations instead of
Sorry for the late response. I think it's okay that some rows don't have a PubChem SID. I've been using the InChIKey instead (see the earlier post, where I explain that I think it covers most of the resource and is somewhat reliable but still has a problem).
(CC @newgene @erikyao for pending BioThings, @rjawesome as the original person who worked on the parser biothings/pending.api#70)
Andy Crouse from Translator's UI team pointed out that BTE was returning result chemical Nodes that didn't have names, with edges from BioThings BindingDB (Translator Slack link).
When investigating this, I discovered cases where the object fields (the chemical), specifically the ID and name fields, seem incorrect or problematic (see the comments). Some notes:
object.pubchem_sid
value is a "correct" ID (when the document has this field, which is >98% of documents or 1413131 / 1438909)
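As a quick sanity check of the coverage figure above, the arithmetic can be reproduced like this. The two counts are the ones quoted in the note; the variable names are just for illustration.

```python
# Counts quoted in the note above.
docs_with_sid = 1413131
total_docs = 1438909

# Documents that lack object.pubchem_sid, and the share that have it.
docs_missing_sid = total_docs - docs_with_sid
coverage_pct = 100 * docs_with_sid / total_docs

print(f"missing: {docs_missing_sid}, coverage: {coverage_pct:.2f}%")
```

So roughly 25.8 thousand documents have no `object.pubchem_sid` value, which is the group the missing-field discussion in the comments is about.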