DNA extent OWL definition #6
Comments
the has-part-only issue is more apparent on this axiom:
This is a fun one because of the recursion. But the problem should be apparent. If ChEBI were to add a perfectly valid |
The way the 'DNA extent' is currently defined, the following two classes would be inferred as subclasses/instances of it:
('Only' merely says that there should not be any relations that do not conform to the range of the 'only' expression, so if there are none at all, the condition is fulfilled.)
would be a subclass of DNA extent. Are these two implications intended? |
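The vacuous-satisfaction point about 'only' can be illustrated outside OWL. The following is a minimal Python sketch, not the ontology's actual axioms: the function name `satisfies_only` and the sample part labels are hypothetical, and the plain list of parts assumes every part of the individual is known, which OWL's open-world semantics does not assume.

```python
def satisfies_only(parts, allowed):
    """Mimic 'has part only <allowed>' for an individual whose parts are all known."""
    # all() over an empty iterable is True, which is the vacuous case.
    return all(part in allowed for part in parts)


# Hypothetical filler for the 'only' restriction.
ALLOWED = {"deoxyribonucleotide residue"}

# An individual with no parts at all passes: nothing violates the restriction.
no_parts = satisfies_only([], ALLOWED)

# An individual with a single disallowed part fails.
bad_part = satisfies_only(["ribonucleotide residue"], ALLOWED)
```

Under these assumptions, an entity with no 'has part' relations at all meets the restriction just as well as one made entirely of deoxyribonucleotide residues.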
@cmungall I couldn't come up with a way to formally define 'sequence molecular entity extent' (which is a continuous string of biological sequence units, either as a whole molecular entity or as a subsequence), but I wanted to formally define the extent subtypes as extents composed of specific types of sequence units, which is what I think this does. For 'DNA extent', I essentially wanted to say that it's a SMEE (sequence molecular entity extent) whose sequence units are (exclusively) deoxyribonucleotide residues.

I agree that using transitivity and 'only' is usually problematic since parthood propagates all the way down, as you note. I've taken this into account by saying that the only parts of DNA extents are either deoxyribonucleotide residues, or they're chemical entities or biological sequence entities (the two main top-level classes of ChEBI and MSO, respectively) that are not biological sequence units. Thus, this definition still allows for parts of DNA extents that aren't deoxyribonucleotide units (e.g., other extents or regions, chemical groups, atoms, electrons, quarks, etc.). The only restriction is that the parts have to be either chemical entities or biological sequence entities, which doesn't seem unreasonable: ChEBI even already includes atoms and subatomic particles, so I think that, e.g., spaces between atoms would still be within its domain even if they're not explicitly represented now. Additionally, the MSO already has immaterial entities in the form of boundaries of sequence residues, specifically, junctions and termini, for things like chromosomal breakpoints and deletions. If we really had to, we could expand the union to include, e.g., BFO sites or whatever, but I'd say that's currently a nonexistent problem.

As for 'genomic DNA extent', it has a similar format to that of 'DNA extent', except that it uses 'group' instead of ('chemical entity' or 'biological sequence entity') as in 'DNA extent'.
I was previously using 'group' in the object of the 'has part' expression, but later expanded it to ('chemical entity' or 'biological sequence entity'); I just hadn't updated the axioms for 'genomic DNA extent' yet. However, even with 'group', I don't see how genomic DNA extents would be classified as atoms with your presented axioms...

As to the reasons for the relatively complicated axiomatization, I'd first say that it's pretty close to the semantics I was trying to get; e.g., for 'DNA extent', that it's a SMEE composed of deoxyribonucleotide units. (The natural-language definition perhaps needs to be edited to match better.) However, it was also done for practical inferential reasons: With this axiomatization, along with others I've recently added, the ontology now knows how to properly connect the various types of molecular entities, extents, regions, and residues. For example, it knows that extents of DNA molecules have to be DNA extents, that regions of DNA molecules have to be DNA regions, and that residues of DNA molecules have to be deoxyribonucleotide residues (plus, using the inverse of 'has part', the reverse assertions are inferred as well). This reflects what we know, and results in some really useful inference, I think. For example, 'cDNA region' is defined only as a 'sequence molecular entity region' that's part of some cDNA; however, now that the ontology knows that any region of a DNA must be a DNA region, it can classify 'cDNA region' under 'DNA region', which it couldn't do before all of this axiomatization, so I think that's pretty cool. |
@matentzn 'DNA extent' is also a subclass of
so with that I believe your presented classes wouldn't be classified as DNA extents. (Additionally, it currently doesn't, but its parent 'sequence molecular entity extent' should correspondingly be a subclass of 'has part' some 'biological sequence unit'.) I'm not claiming that the definitions under discussion are totally immune from ill inferential effects, but I'd be interested in examining inferential issues you can think of regarding these definitions when combined with other reasonable (no pun intended) assertions. |
@cmungall @matentzn One issue of which I'm aware is that these definitions still lead to the classification of SMEEs that have inappropriate types of chemical entities or biological sequence entities as parts. For example,
(which is obviously nonsensical) would still be classified as a DNA extent. I'm still thinking of how I can further refine these to avoid this... |
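The leftover weakness can be made concrete with a small Python sketch of the 'only' filler as it is described in prose in this thread: a part is acceptable if it is a deoxyribonucleotide residue, or if it is a chemical entity or biological sequence entity that is not a biological sequence unit. This is a closed-world approximation under stated assumptions, not the ontology's axioms: the function `part_allowed`, the set-of-labels encoding, and the sample parts (including `benzene` as a stand-in for an inappropriate chemical entity) are all hypothetical.

```python
def part_allowed(classes):
    """Return True if a part with the given class labels passes the described filter."""
    is_dna_unit = "deoxyribonucleotide residue" in classes
    in_domain = bool(classes & {"chemical entity", "biological sequence entity"})
    is_unit = "biological sequence unit" in classes
    return is_dna_unit or (in_domain and not is_unit)


# Hypothetical parts, each encoded as the set of classes it belongs to.
dna_residue = {"deoxyribonucleotide residue", "biological sequence unit",
               "biological sequence entity"}
rna_residue = {"ribonucleotide residue", "biological sequence unit",
               "biological sequence entity"}
atom = {"atom", "chemical entity"}
benzene = {"benzene", "chemical entity"}  # stand-in for an inappropriate part

results = {
    "dna_residue": part_allowed(dna_residue),  # accepted: the intended unit type
    "rna_residue": part_allowed(rna_residue),  # rejected: a sequence unit of the wrong type
    "atom": part_allowed(atom),                # accepted: a non-unit chemical entity
    "benzene": part_allowed(benzene),          # accepted too, which is the weakness
}
```

In this sketch the filter rejects sequence units of the wrong type but lets through any other chemical entity, so an extent with an inappropriate chemical part would still qualify.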
To see the problem with genomic DNA extent:
This injects an ABox containing a genomic extent with one group to demonstrate the inconsistency. Alternatively, you could load just the TBox and run a DL query: presumably this is not the intent |
I'd recommend not refining further: OWL definitions have to be understood by humans as well as machines. What about a simple EL pattern using has_member? Treat extents as mereological sums of like units. You would get fewer constraints off the bat, so if that's a requirement there may be a way to reintroduce these as disjointness GCIs or hidden GCIs |
But sequence molecular entity extents aren't disjoint with CHEBI groups; in fact, 'sequence molecular entity region', which is a child of 'sequence molecular entity extent', is explicitly asserted to be a subclass of 'group'. Would there still be a problem if the 'genomic DNA extent'/'group' disjointness axiom were removed? |
how about:
|
@cmungall But I noted that I haven't yet expanded the 'group' conjunct to the wider ('chemical entity' or 'biological sequence entity'), as I've done for the other definitions. I think that fixes it, right? That being said, these definitions are problematic at least for the issue I noted above. The only other way I can currently think of to get the inference I'm seeking is to use specialized 'has part'/'part of' subrelations to refer to specific types of parts, e.g., 'has residue part'/'residue part of'. Is this strategy of defining and using specific partonomic relations considered OBO-kosher? It seems that these would be subrelations of 'has component'/'component of', right? (I think using the latter is problematic in that they seem to require human interpretation as to which components they're referring to.) |
@cmungall After toying around some, I think that the aforementioned types of inference might be possible using disjointness axioms instead, which you previously mentioned, e.g.:
What do you think? |
I have some questions about this axiom: