Semantic Type Matching #6
Comments
Andrew's comment in Dec 20 Meeting:
Yao's TODO: report no. of removed predications. |
Total number of predications:
N.B. Node normalizer lookups have not been performed yet. Now consider predications with replaced CUIs:
|
For posterity (and only if it's easy to generate), can you provide some summary of the 669,924 predications that would be discarded? For example, are most of those because But regardless, I'm comfortable moving forward with the exact semtype matching... |
Top 10 discarded subject semtypes
Top 10 discarded object semtypes
Top 10 discarded predicates
Top 20 discarded predications (as in triples of
|
SUBJECT_SEMTYPE | SUBJECT_SEMTYPE_NAME | PREDICATE | OBJECT_SEMTYPE | OBJECT_SEMTYPE_NAME | count |
---|---|---|---|---|---|
hcro | Health Care Related Organization | LOCATION_OF | resa | Research Activity | 53532 |
hcro | Health Care Related Organization | LOCATION_OF | lbpr | Laboratory Procedure | 31750 |
hcro | Health Care Related Organization | LOCATION_OF | diap | Diagnostic Procedure | 23061 |
fndg | Finding | PROCESS_OF | humn | Human | 20781 |
dsyn | Disease or Syndrome | PROCESS_OF | humn | Human | 11115 |
bpoc | Body Part, Organ, or Organ Component | LOCATION_OF | fndg | Finding | 9506 |
tisu | Tissue | LOCATION_OF | aapp | Amino Acid, Peptide, or Protein | 9332 |
gngm | Gene or Genome | LOCATION_OF | genf | Genetic Function | 9061 |
genf | Genetic Function | PROCESS_OF | gngm | Gene or Genome | 8734 |
mobd | Mental or Behavioral Dysfunction | PROCESS_OF | humn | Human | 7135 |
bpoc | Body Part, Organ, or Organ Component | LOCATION_OF | patf | Pathologic Function | 6732 |
bpoc | Body Part, Organ, or Organ Component | LOCATION_OF | anab | Anatomical Abnormality | 6377 |
lbpr | Laboratory Procedure | USES | lbpr | Laboratory Procedure | 6187 |
aapp | Amino Acid, Peptide, or Protein | AUGMENTS | ortf | Organ or Tissue Function | 6134 |
lbpr | Laboratory Procedure | USES | aapp | Amino Acid, Peptide, or Protein | 5792 |
bpoc | Body Part, Organ, or Organ Component | LOCATION_OF | aapp | Amino Acid, Peptide, or Protein | 5454 |
blor | Body Location or Region | LOCATION_OF | fndg | Finding | 5253 |
topp | Therapeutic or Preventive Procedure | TREATS | dsyn | Disease or Syndrome | 4388 |
dsyn | Disease or Syndrome | PROCESS_OF | mamm | Mammal | 4315 |
bpoc | Body Part, Organ, or Organ Component | LOCATION_OF | dsyn | Disease or Syndrome | 4149 |
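For reference, a minimal sketch of how a ranking like the one above could be produced, assuming the discarded predications are available as `(subject_semtype, predicate, object_semtype)` tuples. The `discarded` sample and the `top_triples` helper are hypothetical and are not taken from the actual pipeline.

```python
from collections import Counter

# Hypothetical sample of discarded predications, each reduced to a
# (subject_semtype, predicate, object_semtype) triple.
discarded = [
    ("hcro", "LOCATION_OF", "resa"),
    ("hcro", "LOCATION_OF", "resa"),
    ("fndg", "PROCESS_OF", "humn"),
]


def top_triples(predications, n=20):
    """Return the n most frequent triples with their counts."""
    return Counter(predications).most_common(n)


top = top_triples(discarded, n=2)
for (subj, pred, obj), count in top:
    print(f"{subj} | {pred} | {obj} | {count}")
```

The same counter applied to only the first or last element of each triple would give the per-semtype rankings, and applied to the middle element the per-predicate ranking.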
All
|
CUI1 | concept_name1 | semantic_type_abbreviation1 | CUI2 | concept_name2 | semantic_type_abbreviation2 |
---|---|---|---|---|---|
C1552516 | Specialty Group | hcro | C0220961 | UMLS Metathesaurus | inpr |
C1516172 | Cancer Center | hcro | C1513817 | NCI-Designated Cancer Center | hcro |
C0237680 | Residential Care Institutions | hcro | C0035186 | Residential Facilities | hcro |
C0237680 | Residential Care Institutions | hcro | C0035186 | Residential Facilities | mnob |
C1609437 | Primary care clinic | hcro | C1552443 | Clinic / Center - Primary Care | mnob |
C1609437 | Primary care clinic | hcro | C1552443 | Clinic / Center - Primary Care | hcro |
C0872261 | repository | hcro | C3847505 | Repository | mnob |
C1306377 | Postoperative anesthesia care unit | hcro | C0034871 | Recovery Room | hcro |
C1306377 | Postoperative anesthesia care unit | hcro | C0034871 | Recovery Room | mnob |
C1552447 | radiology facility | hcro | C1610162 | Radiology Clinic/Center | mnob |
C1552447 | radiology facility | hcro | C1610162 | Radiology Clinic/Center | hcro |
C0013967 | Emergency Service, Hospital | hcro | C0562508 | Accident and Emergency department | hcro |
C0013967 | Emergency Service, Hospital | hcro | C0562508 | Accident and Emergency department | mnob |
C0338036 | Doctor's office | hcro | C0031834 | Physicians' Offices | hcro |
C0338036 | Doctor's office | hcro | C0031834 | Physicians' Offices | mnob |
C1546895 | GlaxoSmithKline | hcro | C1552903 | SmithKline Beecham | hcro |
C1619637 | Hospital Psychiatric Units | hcro | C0870667 | Psychiatric hospital unit | hcro |
C1619637 | Hospital Psychiatric Units | hcro | C0870667 | Psychiatric hospital unit | mnob |
C1546882 | NABI | hcro | C1552896 | NABI | hcro |
C1546858 | Abbott Laboratories | hcro | C1552881 | Abbott Laboratories | hcro |
C1546873 | Merieux | hcro | C1552891 | Merieux | hcro |
C1512798 | Institute for Cancer Prevention | hcro | C1140168 | NCI Thesaurus | inpr |
C1546884 | Novartis Pharmaceutical Corporation | hcro | C1552897 | Novartis Pharmaceutical Corporation | hcro |
C4699045 | Stroke Center | hcro | C1136323 | Logical Observation Identifiers Names and Codes | inpr |
C3845566 | Rehabilitation facility | hcro | C0034993 | Rehabilitation Centers | hcro |
C3845566 | Rehabilitation facility | hcro | C0034993 | Rehabilitation Centers | mnob |
Great, this all looks fine to move forward. Please proceed with the update! |
Thank you for the confirmation! |
A CUI can have multiple semantic types. E.g.
When a retired CUI is mapped to a multi-semantic-typed CUI, it's reasonable to match the semantic types for precise replacement. E.g.
However, Colleen also mentioned that some highly related semantic types should be considered as matched. E.g.
The aapp ➡️ gngm match is also worth consideration.
Colleen and I came to the idea that:
(C0021740, aapp) ➡️ (C3539881, aapp).
(C1705981, aapp) ➡️ (C3539881, gngm)
@newgene @andrewsu do you have any thoughts on the matching conditions? Should we carry out exact matching for all replacements, or explicitly whitelist the related matches? Appreciate your thoughts!
Note that a replacement can apply to either the subject or the object, so the matching conditions may affect the semantic meaning of the predicates involved.
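To make the two options concrete, here is a minimal sketch of exact matching with an optional whitelist of related semantic types. Everything in it is an assumption for illustration: the `RELATED_SEMTYPES` set, the `pick_replacement` function, and the placeholder `CUI_NEW` identifier are hypothetical, and the only whitelisted pair shown is the aapp ➡️ gngm case discussed above.

```python
# Hypothetical whitelist of (old_semtype, new_semtype) pairs that would be
# treated as matched even though they are not identical.
RELATED_SEMTYPES = {("aapp", "gngm")}


def pick_replacement(old_semtype, new_cui, new_semtypes, allow_related=False):
    """Return (new_cui, semtype) for a retired CUI's replacement, or None.

    old_semtype   semantic type recorded on the retired CUI in the predication
    new_cui       the CUI that the retired CUI maps to
    new_semtypes  set of semantic types of new_cui (a CUI can have several)
    """
    # Exact match: keep the same semantic type on the new CUI.
    if old_semtype in new_semtypes:
        return (new_cui, old_semtype)
    # Optional fallback: accept an explicitly whitelisted related type.
    if allow_related:
        for candidate in sorted(new_semtypes):
            if (old_semtype, candidate) in RELATED_SEMTYPES:
                return (new_cui, candidate)
    # No acceptable match: the predication would be discarded.
    return None


exact = pick_replacement("aapp", "CUI_NEW", {"aapp", "gngm"})
strict = pick_replacement("aapp", "CUI_NEW", {"gngm"})
related = pick_replacement("aapp", "CUI_NEW", {"gngm"}, allow_related=True)
```

With `allow_related=False` this reduces to exact matching for all replacements. Because the same function would be called for subjects and objects alike, any pair added to the whitelist changes which predications survive on both sides.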