Coding: Add RDF-ization code to convert mappings to RDF #34
@bill-baumgartner and @nicolevasilevsky, I am hoping to get your feedback on my proposal for logically validating the mappings, which is described, with examples, below. I am also including @LEHunter so he knows my plan. I am adding this last step for two reasons: (1) using reasoners for logical validation was Melissa's idea (also hinted at by an OHDSI reviewer), which I love, and (2) the resulting mappings will be compatible with PheKnowLator.
Background
We have generated mappings for several clinical domains (i.e. conditions, drug ingredients, and measurements). Each domain is mapped to a different set of ontologies and several types of mappings were created. Mappings that included more than a single ontology concept (many ontology concepts to one clinical concept) were constructed using
Clinical Domains and Ontologies
Mapping Categories
Mappings also have evidence, which varies according to the mapping type. An example is shown in the table below:
Analysis Plan
For each ontology, I will create a new ontology class for every mapping that includes 2 or more ontology concepts (single concepts already exist within each ontology).
Reasoner(s):
Output and Record Statistics:
ASK:
Creating Release Version RDF
Once the validation described above is complete, I will create a more complex version of the mappings that spans all ontologies for a given mapping, rather than creating ontology-specific mappings. Creating mappings that span multiple ontologies will require some additional content not currently included in each mapping. I realize that there are many ways one could approach this; this is what I think is the easiest, quickest, and cleanest/most transparent. Each class created from this process will be created under the OMOP2OBO namespace, will include the official OMOP concept label and synonyms, and will have the OMOP concept and source codes assigned as DbXRefs (all of which we get for free from the mappings).
Steps:
Relations to Connect Mappings Spanning Multiple Ontologies
CONDITIONS
DRUG INGREDIENTS
MEASUREMENTS
ASK:
|
@LEHunter - can we please talk through how to align the mapping categories and evidence on Wednesday? Tables for each are re-printed below: Mapping Categories
Evidence Types
|
@callahantiff One minor thing, I think MONDO has phenotype HPO but I don't know if it makes sense to say a phenotype has phenotype a disease. |
Thanks so much for your feedback @nicolevasilevsky! Good point about Mondo and the Very excited to share the results with you! Oh, did you see I created a figure 1 to cover the mapping method? I'd love to know what you think. You can access it here. |
@bill-baumgartner - thanks for your help on Friday! I think we have a great plan! The general assumptions for all clinical domains (i.e. conditions, medications, and measurements) are shown below. Updates/procedures for each specific clinical domain will be presented in 3 separate comments below, just to make it less overwhelming :D
General Steps
1 - Merge all ontologies together (ontologies listed in table above) |
Conditions
Ontologies:
Assumptions:
Intra-Ontology Relations:
Mapping Combinations: Class Construction Heuristics:
Don't Map
ASK:
|
Medications
Ontologies:
Assumptions:
Intra-Ontology Relations:
Mapping Combinations: Class Construction Heuristics:
Don't Map
ASK:
|
Measurements
Ontologies:
Assumptions:
Intra-Ontology Relations:
Mapping Combinations: Class Construction Heuristics:
Don't Map
ASK:
|
Looking good @callahantiff! Just a few questions:
|
Thanks so much @bill-baumgartner!
Great question.
I was thinking of creating the following subclass relations:
Good catch! I updated the figure above. Do you agree with that? Does this cover the human IgE resulting from non-human response? Thanks so much for all of your help! 😄 🙇♀️ |
Yep! Updated figure above. Although this will be somewhat tricky to distinguish from other mappings including cells. Even a red blood cell count is still measured from a blood sample. Noting that here so I make sure that I don't forget that. |
@bill-baumgartner -- for our meeting tomorrow, we are planning to discuss representing:
Mapping Categories
Mapping categories added as class annotation.
Mapping Evidence
Evidence can come in the following forms:
HOW TO REPRESENT THESE
So something like:
DbXRef Example 1: OBO_DbXRef-OMOP_CONCEPT_SOURCE_CODE:ABC_1234567
DbXRef Example 2: OBO_DbXRef-OMOP_ANCESTOR_SOURCE_CODE:ABC_1234567
Label Example: OBO_LABEL-OMOP_CONCEPT_LABEL:xxxxxxx
OBO Synonym Example: OBO_hasSynonymType-OMOP_CONCEPT_LABEL:xxxxxxx
Similarity Example: CONCEPT_SIMILARITY:OBO_URI_1.0
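To make the proposed formats concrete, here is a minimal Python sketch that builds each evidence string. The function names and parameters are hypothetical (they are not part of the omop2obo code); only the string templates come from the examples above, and the similarity format assumes the score is appended to the OBO URI after an underscore.

```python
# Hypothetical helpers; only the string templates mirror the examples above.

def dbxref_evidence(source_code: str, ancestor: bool = False) -> str:
    """DbXRef evidence at the concept level or, if ancestor=True, the ancestor level."""
    level = "ANCESTOR" if ancestor else "CONCEPT"
    return f"OBO_DbXRef-OMOP_{level}_SOURCE_CODE:{source_code}"


def label_evidence(label: str) -> str:
    """Evidence from an exact match between an OBO label and the OMOP concept label."""
    return f"OBO_LABEL-OMOP_CONCEPT_LABEL:{label}"


def synonym_evidence(label: str) -> str:
    """Evidence from a match between an OBO synonym and the OMOP concept label."""
    return f"OBO_hasSynonymType-OMOP_CONCEPT_LABEL:{label}"


def similarity_evidence(obo_uri: str, score: float) -> str:
    """Evidence from a concept similarity score (assumed format: URI_score)."""
    return f"CONCEPT_SIMILARITY:{obo_uri}_{score}"


print(dbxref_evidence("ABC_1234567"))
print(dbxref_evidence("ABC_1234567", ancestor=True))
print(similarity_evidence("OBO_URI", 1.0))
```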
|
@bill-baumgartner - starting this work on branch |
@bill-baumgartner - refined the representation. Hoping to go over this on Monday with you (click to enlarge). I have also created the following Wiki pages for this content: UPDATE: Figure updated to reflect discussion with @linikujp ( |
Small Update: Will be using the |
@callahantiff the Chemical "has component" Vaccine doesn't sound right to me. Thanks, |
Hi @linikujp! So excited to have your feedback on this! Happy to explain our logic on this decision and equally happy to get your thoughts on it 😄
I have been working on how to include examples for the knowledge representation figure and just last week decided that we will include a figure caption that provides examples for each square in the figure. Your comment has helped me realize that I left out an important component, which I think is causing confusion. I have updated the figure and included it below. Please let me know if this is clearer.
OK, to your question. In general, when representing drug ingredients, we make the assumption that all
Example 1: Varicella-Zoster Virus Vaccine Live (Oka-Merck) strain
This vaccine ingredient is represented using the CHEBI immunogen
Example 2: Gelatin, Iron, Catalase, Rho
This example represents a drug exposure ingredient that is not a vaccine (e.g. gelatin, iron, catalase, or rho) but has classes in the VO. The VO includes terms that explicitly represent these ingredients, which we model as being components of particular CHEBI
Does that help clarify the different representations? As mentioned above, I have edited the figure for Drug Exposure Ingredients to be clearer about this distinction. |
Hi @callahantiff, It makes better sense now. But I am afraid that your relation "has component" has a different meaning from how I understand it. Do you use the RO relation here: http://www.ontobee.org/ontology/RO?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FRO_0002180 ? Having both a role and a chemical entity related by "has component" to either a protein or a vaccine is problematic with respect to BFO's structure as well as OBO's. From what you said: "drug exposure ingredients are either some kind of CHEBI chemical entity or they have some kind of CHEBI role (all Drug Exposure ingredients have at least one CHEBI annotation)." Do all terms have to come from CHEBI in this case? How do you represent drugs? Thanks, |
Hi @linikujp - I see your point. Let me check in with my project mentors and get back to you. We had initially come to an agreement after some discussion on this, but there have also been some changes since then. Thanks again for raising these points and helping me to make this better. -Tiffany |
Dear @linikujp - Thank you again for your feedback and your great suggestions! Over the last few days I met with my team and we spent a lot of time thinking about the drug and drug ingredient representations with respect to the points that you brought up. We agreed that we did not have things quite right and have overhauled our initial representation. As best we could, we also incorporated some of your suggestions. This new representation (shown below) explicitly models the different subdivisions of the CHEBI hierarchy, which we feel is important to highlight because each requires slightly different logic patterns. We also explicitly model how drugs relate to ingredients, which should provide a much clearer picture of our overall approach. |
Dear @callahantiff I'd love to follow up and see how your implementation goes with this approach. Thanks, |
Absolutely, I think we are at a great place to start testing the representation. We have some good experiments planned, which I will be focusing on completing over the next few weeks. I will definitely reach out and share the results when those are complete! Thank you again for your feedback and for helping make this work stronger. I am very appreciative! 😄 |
@callahantiff Cool! & +1 |
Needed Scripts: Write a script that converts mappings into RDF
@nicolevasilevsky -- thank you for meeting with me a few weeks ago and confirming our approach looks reasonable. I am just documenting this here as an issue since it's work I still need to do.
Planned Approach
NOT()
Details: Only occurs within the HP and only for Measurement and Drug domains
class_IRI: https://github.com/callahantiff/omop2obo/obo/ext/OMOP_4021360
Class_Name: 'Skin appearance normal'
Class Expression Syntax: not('Abnormality of the skin')
New Triples:
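One possible encoding of the NOT() pattern is sketched below in plain Python (no RDF library), purely as an illustration. It assumes the new OMOP class is declared equivalent to an anonymous class that is the owl:complementOf the negated ontology class; the choice of owl:equivalentClass, the blank node label, and the placeholder IRI for 'Abnormality of the skin' are all assumptions, not the triples the final script will necessarily emit.

```python
OWL = "http://www.w3.org/2002/07/owl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def complement_triples(class_iri, negated_iri, bnode="_:not0"):
    """Return (subject, predicate, object) triples for: class_iri == not(negated_iri)."""
    return [
        (class_iri, RDF_TYPE, OWL + "Class"),
        (class_iri, OWL + "equivalentClass", bnode),   # assumption: equivalence, not subClassOf
        (bnode, RDF_TYPE, OWL + "Class"),              # anonymous class expression
        (bnode, OWL + "complementOf", negated_iri),
    ]


triples = complement_triples(
    "https://github.com/callahantiff/omop2obo/obo/ext/OMOP_4021360",
    # Placeholder IRI standing in for the HP class 'Abnormality of the skin'.
    "http://example.org/placeholder/Abnormality_of_the_skin",
)
for triple in triples:
    print(triple)
```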
OR()
Details: Only occurs within DOID and HP and only for the Condition domain
class_IRI: https://github.com/callahantiff/omop2obo/obo/ext/OMOP_434473
Class_Name: 'Longitudinal deficiency of tibia AND/OR fibula'
Class Expression Syntax:
('Abnormality of fibula morphology' or 'Abnormality of tibia morphology')
New Triples:
AND()
Details: Occurs within all ontologies and domains
class_IRI: https://github.com/callahantiff/omop2obo/obo/ext/OMOP_434165
Class_Name: 'Abnormal cervical smear'
Class Expression Syntax:
('Abnormal cell morphology' and 'Abnormality of the uterine cervix')
New Triples:
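As a rough sketch of how an AND() mapping could be serialized, the plain-Python example below encodes the operands as an RDF collection (rdf:first/rdf:rest/rdf:nil) and attaches it with owl:intersectionOf. Passing `operator="unionOf"` gives the analogous OR() encoding. The function names, blank node labels, use of owl:equivalentClass, and the two placeholder operand IRIs are assumptions made for illustration.

```python
OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def rdf_list(items, prefix="l"):
    """Encode items as an RDF collection; return (head_node, triples)."""
    if not items:
        return RDF + "nil", []
    nodes = [f"_:{prefix}{i}" for i in range(len(items))]
    triples = []
    for i, (node, item) in enumerate(zip(nodes, items)):
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF + "nil"
        triples.append((node, RDF + "first", item))
        triples.append((node, RDF + "rest", rest))
    return nodes[0], triples


def boolean_class_triples(class_iri, operands, operator="intersectionOf"):
    """Triples for class_iri == (op1 and op2 ...) or, with unionOf, (op1 or op2 ...)."""
    if operator not in ("intersectionOf", "unionOf"):
        raise ValueError(f"unsupported operator: {operator}")
    head, list_triples = rdf_list(operands)
    expr = "_:expr0"
    return [
        (class_iri, OWL + "equivalentClass", expr),  # assumption: equivalence
        (expr, RDF + "type", OWL + "Class"),
        (expr, OWL + operator, head),
    ] + list_triples


triples = boolean_class_triples(
    "https://github.com/callahantiff/omop2obo/obo/ext/OMOP_434165",
    [
        "http://example.org/placeholder/Abnormal_cell_morphology",
        "http://example.org/placeholder/Abnormality_of_the_uterine_cervix",
    ],
)
for triple in triples:
    print(triple)
```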
AND()/OR()
Details: Only occurs within DOID and HP and only for the Condition domain
class_IRI: https://github.com/callahantiff/omop2obo/obo/ext/OMOP_77072
Class_Name: 'Joint effusion of ankle AND/OR foot'
Class Expression Syntax:
New Triples:
AND()/NOT()
Details: Only occurs within DOID and HP and only for the Condition domain
class_IRI: https://github.com/callahantiff/omop2obo/obo/ext/OMOP_4120313
Class_Name: 'Non-diabetic disorder of endocrine pancreas'
Class Expression Syntax:
'Abnormality of the pancreas' and not('has phenotype' some 'Diabetes mellitus')
New Triples:
AND()/OR()/NOT()
Details: Only occurs within DOID and HP and only for the Condition domain
class_IRI: https://github.com/callahantiff/omop2obo/obo/ext/OMOP_435352
Class_Name: 'Periostitis without osteomyelitis, of the pelvic region and/or thigh'
Class Expression Syntax:
New Triples:
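Since the six patterns above are all combinations of the same constructors, one way to keep the conversion script manageable is to build each mapping as a small expression tree and render or serialize it recursively. The sketch below shows only the rendering half, reproducing the class expression syntax used in the examples above. The class names (`Named`, `Not`, `Some`, `And`, `Or`) and the `render` function are hypothetical, and the rule of parenthesizing only nested and/or groups is an assumption about the intended syntax.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Named:
    label: str


@dataclass(frozen=True)
class Not:
    operand: "Expr"


@dataclass(frozen=True)
class Some:
    prop: str
    filler: "Expr"


@dataclass(frozen=True)
class And:
    operands: Tuple["Expr", ...]


@dataclass(frozen=True)
class Or:
    operands: Tuple["Expr", ...]


Expr = Union[Named, Not, Some, And, Or]


def render(expr: Expr, nested: bool = False) -> str:
    """Render an expression tree in the class expression syntax shown above."""
    if isinstance(expr, Named):
        return f"'{expr.label}'"
    if isinstance(expr, Not):
        return f"not({render(expr.operand)})"
    if isinstance(expr, Some):
        return f"'{expr.prop}' some {render(expr.filler, nested=True)}"
    if isinstance(expr, (And, Or)):
        keyword = " and " if isinstance(expr, And) else " or "
        body = keyword.join(render(op, nested=True) for op in expr.operands)
        # Parenthesize only when this group sits inside another expression.
        return f"({body})" if nested else body
    raise TypeError(f"unsupported expression: {expr!r}")


# OMOP_4120313: 'Non-diabetic disorder of endocrine pancreas'
expr = And((
    Named("Abnormality of the pancreas"),
    Not(Some("has phenotype", Named("Diabetes mellitus"))),
))
print(render(expr))
```

The same tree could be walked a second time to emit triples, so the human-readable syntax and the RDF would always come from one source.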