something strange with the UMLS Identifier dimethyl sulfoxide UMLS:DC0012403 #836

Open
sstemann opened this issue Jul 5, 2024 · 3 comments

Comments

@sstemann

sstemann commented Jul 5, 2024

I'm not sure if this is a BTE issue or not, but when I run the MVP1 query "What may treat Bethlem Myopathy", I get dimethyl sulfoxide twice in the UI

https://ui.test.transltr.io/main/results?l=Bethlem%20Myopathy&i=MONDO:0008029&t=0&r=0&q=a04bff0e-1b72-494c-ae67-6f1fbc7aa66e

image

Both say BTE

In the ARAX GUI > BTE > Result 40 shows dimethyl sulfoxide with CURIE UMLS:DC0012403, which I cannot find in the UMLS Metathesaurus

image

If I drop the D and search for UMLS:C0012403, I get dimethyl sulfoxide
image

Result 94 is CHEBI:28262 which seems normal
image

I'm sending this to BTE first. I'm not sure it's really an NN issue, given that the CURIE looks non-existent
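The check described above (spot the extra "D", drop it, and look up what remains) can be sketched in a few lines of Python. This is only an illustration, not how BTE or the Node Normalizer handle identifiers: the function names `classify_umls_curie` and `candidate_cui` are made up for this example, and the `UMLS:DC#######` pattern is assumed from the identifiers seen in this issue. The only outside fact relied on is that standard UMLS concept identifiers are "C" followed by seven digits.

```python
import re

# Standard UMLS concept identifiers (CUIs): "C" followed by seven digits.
CUI_CURIE = re.compile(r"^UMLS:C\d{7}$")
# Assumed shape of the odd identifiers in this issue: an extra "D" before the CUI.
DC_CURIE = re.compile(r"^UMLS:D(C\d{7})$")


def classify_umls_curie(curie: str) -> str:
    """Return 'cui', 'dc-prefixed', or 'other' for a CURIE string."""
    if CUI_CURIE.match(curie):
        return "cui"
    if DC_CURIE.match(curie):
        return "dc-prefixed"
    return "other"


def candidate_cui(curie: str):
    """Drop the extra 'D' to get a CUI worth searching for, or None.

    The result is only a lookup candidate; it is not guaranteed to be
    the same concept as the original label.
    """
    match = DC_CURIE.match(curie)
    return f"UMLS:{match.group(1)}" if match else None


if __name__ == "__main__":
    for curie in ("UMLS:DC0012403", "UMLS:C0012403", "CHEBI:28262"):
        print(curie, classify_umls_curie(curie), candidate_cui(curie))
```

For `UMLS:DC0012403` the candidate is `UMLS:C0012403`, which is the identifier that resolved to dimethyl sulfoxide in the Metathesaurus search above.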

@andrewsu

andrewsu commented Jul 5, 2024

This partly has to do with us ingesting SuppKG, a resource that links dietary supplements to diseases. More details can be found in biothings/pending.api#55 and biothings/biothings_explorer#706, but the TL;DR is that the resource invented these UMLS-like identifiers in cases where it didn't find a suitable existing UMLS identifier. We (internally) discussed the issue when we brought SuppKG on board and decided to just move forward, since these are a relatively small proportion of SuppKG as a whole, but we are certainly open to revisiting...

There is also something strange with the lack of EPC here, related to #831. We'll be looking into that one as well...

@sstemann

sstemann commented Jul 8, 2024

In this Bethlem Myopathy query, it looks like there are ~44 results with identifiers of the form "UMLS:DC#######". On the plus side, these aren't returned on the first page, but they are getting a sugeno score of 0.46 (on a 0-1 scale). On the other hand, some of these concepts do have real CURIEs (Hawthorn Plant in UMLS is C1527346, Earthworms are C0086194). It seems odd to return results that do not have valid CURIEs.

image image
| subjectNode_name | subjectNode_id |
| --- | --- |
| hawthorn | UMLS:DC1621401 |
| earthworm | UMLS:DC1621389 |
| 4-hydroxyphenyl | UMLS:DC0912024 |
| 2-phenyl-benzopyrans | UMLS:DC0596577 |
| ch'ih shen | UMLS:DC0377336 |
| bac ngu vi tu | UMLS:DC0141729 |
| bingpian | UMLS:DC0106916 |
| apple polyphenol extract | UMLS:DC0071649 |
| novasoy phytoestrogen extract | UMLS:DC0071011 |
| 2-hydroxy-4-methoxyacetophenone | UMLS:DC0069939 |
| heparinoid | UMLS:DC0066923 |
| acetate de d-alpha tocopheryle | UMLS:DC0042874 |
| & vitamin palmitate | UMLS:DC0042839 |
| n-octadecanoic acid | UMLS:DC0038229 |
| atomic number 11 | UMLS:DC0037473 |
| 5-alpha-furost-20-en-12-one-3 beta, 26-diol | UMLS:DC0036189 |
| beta-d-ribofuranose | UMLS:DC0035549 |
| b-2 | UMLS:DC0035527 |
| root | UMLS:DC0035509 |
| epoprostanol | UMLS:DC0033567 |
| fibersol-2 | UMLS:DC0032594 |
| 24-beta-ethyl-delta-5-cholesten-3beta-ol | UMLS:DC0031866 |
| activator | UMLS:DC0031610 |
| flaxseed | UMLS:DC0023753 |
| hesperidin methyl chalcone | UMLS:DC0019392 |
| root | UMLS:DC0017987 |
| bilberry fruit | UMLS:DC0016767 |
| 5-mthf | UMLS:DC0016410 |
| 1,200 mg | UMLS:DC0016157 |
| additional omega-3 essential fatty acids | UMLS:DC0015689 |
| acides gras cetylated | UMLS:DC0015684 |
| eugenol | UMLS:DC0015153 |
| acide docosahexaenoique | UMLS:DC0012968 |
| dimethyl sulfoxide | UMLS:DC0012403 |
| fiber | UMLS:DC0012173 |
| bill henderson protocol | UMLS:DC0012155 |
| beta-sitosterol-beta-d-glycoside | UMLS:DC0007158 |
| acide butyrique | UMLS:DC0006523 |
| berberina | UMLS:DC0005117 |
| acide l-ascorbique, 6-palmitate | UMLS:DC0003968 |
| anthocyanin | UMLS:DC0003161 |
| alkaloid, nos | UMLS:DC0002062 |
| acide gras essentiel | UMLS:DC0000545 |
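
One caveat when working with a listing like the one above: dropping the "D" does not always give the CUI quoted in this thread for the same label. The sketch below, a rough illustration only, takes a few rows as pasted "name id" text, splits each on its last space (names can contain spaces and commas), and compares the D-stripped identifier with the CUIs mentioned in the comments. The helper names and the small `KNOWN_CUIS` lookup are made up for this example, and mapping the "hawthorn" and "earthworm" labels to "Hawthorn Plant" and "Earthworms" is an assumption.

```python
# A few rows copied from the listing above, as plain "name id" text.
ROWS = """\
hawthorn UMLS:DC1621401
earthworm UMLS:DC1621389
dimethyl sulfoxide UMLS:DC0012403
alkaloid, nos UMLS:DC0002062
"""

# CUIs quoted in this thread; the label-to-concept mapping is assumed.
KNOWN_CUIS = {
    "hawthorn": "UMLS:C1527346",
    "earthworm": "UMLS:C0086194",
    "dimethyl sulfoxide": "UMLS:C0012403",
}


def parse_rows(text):
    """Split each non-empty line into (name, curie) on the last whitespace."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            name, curie = line.rsplit(None, 1)
            pairs.append((name, curie))
    return pairs


def strip_d(curie):
    """Turn 'UMLS:DC...' into 'UMLS:C...'; leave anything else unchanged."""
    if curie.startswith("UMLS:DC"):
        return "UMLS:C" + curie[len("UMLS:DC"):]
    return curie


def compare_with_known(pairs, known):
    """For names with a quoted CUI, report whether stripping 'D' yields it."""
    return {name: strip_d(curie) == known[name]
            for name, curie in pairs if name in known}


if __name__ == "__main__":
    print(compare_with_known(parse_rows(ROWS), KNOWN_CUIS))
```

With these rows, only dimethyl sulfoxide matches after stripping the "D". Hawthorn and earthworm do not, so a simple prefix rewrite would not be a safe general fix.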

@andrewsu

Currently being evaluated by the chem-info group (slack link). Tagging @vdancik to update feasibility/ideas when they have a chance to review...
