Mesh Proteins as chemicals #200

Open
cbizon opened this issue Oct 24, 2023 · 3 comments

cbizon commented Oct 24, 2023

See NCATSTranslator/Feedback#613, NCATSTranslator/Feedback#614, and NCATSTranslator/Feedback#615.

These are all proteins, which under Biolink are biological entities, but we're calling them chemicals. I think this was probably just never cleaned up after protein moved into the biological entity branch.

cbizon commented Nov 1, 2023

I took a look at the first one: https://id.nlm.nih.gov/mesh/D011972.html (Insulin receptor). According to the MESH code, this MESH id should not be included as a Chemical. As that URL shows, the Tree values are D12.776 and D08, both of which are excluded in the chemical.py MESH filter. Not sure at this point whether the MESH id is somehow getting into the chemical id list, or whether we're looking at an old result, or something else.
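
For reference, a minimal sketch of the kind of tree-number filter described here. This is not the actual code in chemical.py: the function name and the include list are assumptions for illustration, and only the two excluded prefixes (D08 and D12.776) come from the comment above.

```python
def is_chemical_mesh(tree_numbers, include_prefixes, exclude_prefixes):
    """Return True only if some tree number is included and none is excluded."""

    def under(tree_number, prefix):
        # Match the prefix itself or any descendant, e.g. "D08" matches
        # "D08.811" but "D12.776" does not match "D12.7760".
        return tree_number == prefix or tree_number.startswith(prefix + ".")

    # Exclusions win: one excluded tree number rejects the whole term.
    if any(under(t, p) for t in tree_numbers for p in exclude_prefixes):
        return False
    return any(under(t, p) for t in tree_numbers for p in include_prefixes)


# Hypothetical include list; the real one lives in chemical.py.
INCLUDE_PREFIXES = ["D02", "D12"]
# The two excluded branches named in this thread.
EXCLUDE_PREFIXES = ["D08", "D12.776"]

# Tree values for MESH:D011972 as quoted in the comment above.
insulin_receptor = ["D12.776", "D08"]
print(is_chemical_mesh(insulin_receptor, INCLUDE_PREFIXES, EXCLUDE_PREFIXES))
```

With a filter of this shape, the Insulin receptor term is rejected, which matches the observation that the MESH side looks correct on its own.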

cbizon commented Nov 1, 2023

OK, what I think is going on is that the MESH terms are correctly being put under Protein, but the UMLS terms are still getting called ChemicalEntities. Then the MESH terms are getting dragged along via a mapping. And I think the reason the UMLS terms are not working correctly is that our list of UMLS tree IDs doesn't use excludes. So Insulin Receptor has three listings in MRSTY:

C0034818|T116|A1.4.1.2.1.7|Amino Acid, Peptide, or Protein|AT17641609|256|
C0034818|T126|A1.4.1.1.3.3|Enzyme|AT17738045|256|
C0034818|T192|A1.4.1.1.3.6|Receptor|AT17615610|256|

So even though we don't let in Receptor, we do let in Enzyme. Instead, we need to say: "if you are a receptor, you don't go here, no matter what your other listings say."
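
A rough sketch of what "excludes win" could look like for the MRSTY rows above. The function names and the exact-match comparison on the tree-number column are assumptions, not the existing implementation; the include and exclude sets just mirror the Enzyme and Receptor rows quoted in this comment.

```python
MRSTY_ROWS = [
    "C0034818|T116|A1.4.1.2.1.7|Amino Acid, Peptide, or Protein|AT17641609|256|",
    "C0034818|T126|A1.4.1.1.3.3|Enzyme|AT17738045|256|",
    "C0034818|T192|A1.4.1.1.3.6|Receptor|AT17615610|256|",
]


def tree_ids_by_cui(rows):
    """Collect every semantic-type tree ID listed for each CUI."""
    result = {}
    for row in rows:
        fields = row.rstrip("\n").split("|")
        if len(fields) < 3:
            continue  # skip malformed rows
        cui, tree_id = fields[0], fields[2]
        result.setdefault(cui, set()).add(tree_id)
    return result


def chemical_cuis(rows, include, exclude):
    """Keep a CUI only if it has an included type and no excluded type."""
    return {
        cui
        for cui, tree_ids in tree_ids_by_cui(rows).items()
        if tree_ids & include and not tree_ids & exclude
    }


INCLUDE = {"A1.4.1.1.3.3"}  # Enzyme: currently let in
EXCLUDE = {"A1.4.1.1.3.6"}  # Receptor: should veto the whole CUI

print(chemical_cuis(MRSTY_ROWS, INCLUDE, set()))    # include-only behaviour
print(chemical_cuis(MRSTY_ROWS, INCLUDE, EXCLUDE))  # exclusion-aware behaviour
```

The point is that the decision is made per CUI over all of its listings, rather than per MRSTY row.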

cbizon commented Nov 1, 2023

It also looks like A1.4.1.2.1.7 is being grabbed by protein. So basically we need to

  1. make sure that this branch of UMLS is correctly divided between chemical.py and protein.py,
  2. correctly handle exclusions at the UMLS id level,
  3. ensure that somehow the UMLS/MESH versions of proteins merge with the UNIPROT/PR/HGNC versions.
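
For item 3, one possible shape of the merge is a union-find pass over equivalence pairs, so that any identifiers connected by a mapping end up in one clique. This is only a sketch under assumptions: `merge_cliques` is a hypothetical helper, and the UNIPROTKB/PR identifiers below are placeholders, since the thread does not say which cross-references actually exist.

```python
def merge_cliques(equivalences):
    """Group identifiers into cliques given pairs asserted to be equivalent."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in equivalences:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    cliques = {}
    for ident in list(parent):
        cliques.setdefault(find(ident), set()).add(ident)
    return list(cliques.values())


pairs = [
    # UMLS and MESH ids for Insulin receptor, as quoted in this thread.
    ("UMLS:C0034818", "MESH:D011972"),
    # Placeholder cross-references, NOT real identifiers.
    ("MESH:D011972", "UNIPROTKB:PLACEHOLDER"),
    ("PR:PLACEHOLDER", "UNIPROTKB:PLACEHOLDER"),
]
for clique in merge_cliques(pairs):
    print(sorted(clique))
```

Whether such cross-references are available between the UMLS/MESH and UNIPROT/PR/HGNC sides is the open question in item 3; the sketch only shows how they would be combined once they are.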
