proposed design pattern design patterns / naming conventions #67
Comments
I think this is now supported. But they are a bit underspecified as strings, and the pattern library may not have access to the contextual axioms required to perform a test, for example if the examples are HP examples and the ontology is XPO (importing uPheno patterns). So XPO running a unit test checking an HPO class would not make sense. Better to say: could be used for unit tests.
I like everything you are proposing. One thing that is really hard for me, though, is getting people to write better pattern descriptions. Can we come up with a grammar for human-readable pattern descriptions, analogous to the gd patterns for term definitions, which people can copy, paste, and fill in? Your example "This pattern is for classes representing leiomyosarcomas differentiated by where they are found in the body. leiomyosarcomas are uncommon, aggressive malignant smooth muscle neoplasms" sounds like it's following this pattern:
Seems a bit bare, as one of the key features of the definition should be to differentiate the pattern from similar patterns. So, I would like a sentence like:
For example, for the "increased rate" pattern:
Yes, I think negative examples are always good.
rationale
names will be exposed in documentation for editors, curators, and general end-users who wish to understand the general patterns of the ontology, so it is important to use consistent, clear, non-ontologist language
these will also be exposed as headers in TSVs
many of the conventions that apply to programming languages and data models and schemas apply here
it is also useful to think of patterns as metaclasses, whose instances are owl classes. The pattern instances may also correspond to what scientists think of as 'entities'.
principles
always use meaningful names
do not use cryptic short names such as 'v'. Spell it out.
exception: filename / IRI should use underscores, not spaces. however, for human-readable labels change the underscores to spaces
vars should be named by relationship or role of range, not range itself
e.g. if a disease pattern has a variable to specify the location and the range is anatomical structure, call it 'location', not 'anatomical structure'
rationale: later on you may want a sub-pattern where you have a separate var with the same range
genus vars should be named non-generically
E.g disease by location, with 2 vars, one for the genus, the other for the location. Do not call the genus var 'disease'. call it something like
also consider something like "morphological type" if that describes the genus relation
rationale: just 'disease' is too broad. See previous principle. later we may add a sub-pattern that references another disease
name the pattern by the identity criteria
the set of vars that are used in the equivalence axiom constitutes the compound key. these are the identity criteria
e.g. a pattern for subtyping leiomyosarcomas by location. do not call this 'leiomyosarcoma'. call it 'leiomyosarcoma by location'
rationale: we may later add leiomyosarcomas subtyped by gene. we can't have two called 'leiomyosarcoma'
in general a good pattern is to name the pattern by the sequence of elements in the equivalence axiom, where the elements are the named classes (the things in single quotes) and var names.
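As a rough illustration of naming by identity criteria, here is a minimal sketch. The function `suggest_pattern_name` and the "<genus> by <var> and <var>" template are assumptions made for this example, not part of dosdp or any existing tooling.

```python
def suggest_pattern_name(genus_label: str, key_vars: list[str]) -> str:
    """Build a pattern name from the genus and the vars in the equivalence axiom.

    Hypothetical helper: the '<genus> by <var> and <var>' template is an
    assumption for illustration, not an established convention.
    """
    if not key_vars:
        # No differentiating vars: the genus label is all we have.
        return genus_label
    return f"{genus_label} by {' and '.join(key_vars)}"


# Two patterns with the same genus but different compound keys get distinct names.
print(suggest_pattern_name("leiomyosarcoma", ["location"]))
print(suggest_pattern_name("leiomyosarcoma", ["gene"]))
```

Because the name is derived from the compound key, a later pattern subtyping the same genus by a different var cannot collide with the first.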
Some recommended changes for mondo patterns
use consistent vocabulary
e.g "adult form of disease" is OK as a name. "adult variant of disease" is not good if we use variant to mean a non-subclass variant
long names are not necessarily bad
we don't pay for characters, don't worry too much about length, within reason
use the term specific as appropriate (TBD)
consider a pattern name 'cancer by location'. This is ambiguous. Do we mean:
consider prefixing with "specific"; e.g. the first would be called "specific cancer by location"; alternatively "cancer subtype by location"
Perhaps we should even call the 2nd "cancer (general) by location" (TBD.. this is awkward)
avoid X in name
always use a meaningful name
use the same filename as pattern name
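A minimal sketch of keeping filename and pattern name in sync, following the underscore/space rule stated earlier. The helper names and the `.yaml` extension are assumptions for illustration.

```python
def pattern_filename(pattern_name: str, extension: str = ".yaml") -> str:
    """Derive a filename from a human-readable pattern name (spaces -> underscores).

    The '.yaml' default is an assumption about how pattern files are stored.
    """
    return "_".join(pattern_name.split()) + extension


def label_from_filename(filename: str) -> str:
    """Recover the human-readable label from a filename (underscores -> spaces)."""
    stem = filename.rsplit(".", 1)[0]  # drop the extension if there is one
    return stem.replace("_", " ")


print(pattern_filename("leiomyosarcoma by location"))
print(label_from_filename("leiomyosarcoma_by_location.yaml"))
```

A check like `pattern_filename(name) == actual_filename` could run in CI to flag patterns whose filename has drifted from the name.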
description should describe the pattern instances not the class instances
E.g.
mondo leiomyosarcoma
https://mondo.readthedocs.io/en/latest/editors-guide/patterns/leiomyosarcoma/
An uncommon, aggressive malignant smooth muscle neoplasm, usually occurring in post-menopausal women that is characterized by a proliferation of neoplastic spindle cells that is located in a specific anatomical location.
This is not a good pattern description: it describes leiomyosarcomas, not leiomyosarcoma classes
Instead:
This pattern is for classes representing leiomyosarcomas differentiated by where they are found in the body. Leiomyosarcomas are uncommon, aggressive malignant smooth muscle neoplasms.
include motivation
E.g. leiomyosarcomas can occur in different sites in the body so we include this pattern to...
include examples
As well as auto-examples, include manually selected examples that highlight key aspects
TODO: we should have a specific field for listing this. These should then be used as unit tests
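One way curated examples could double as unit tests is sketched below. Everything here is hypothetical: the `PatternExamples` structure, the positive/negative split, the `matches` callback (standing in for whatever actually decides whether a class conforms to a pattern), and the `EX:` placeholder identifiers.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PatternExamples:
    """Hypothetical container for manually selected examples of one pattern."""
    pattern_name: str
    positive: list[str] = field(default_factory=list)  # classes that should match
    negative: list[str] = field(default_factory=list)  # classes that should not match


def check_examples(examples: PatternExamples,
                   matches: Callable[[str], bool]) -> list[str]:
    """Return a list of human-readable failures; an empty list means all pass."""
    failures = []
    for cls in examples.positive:
        if not matches(cls):
            failures.append(f"{cls} should match '{examples.pattern_name}' but does not")
    for cls in examples.negative:
        if matches(cls):
            failures.append(f"{cls} should not match '{examples.pattern_name}' but does")
    return failures


# Placeholder IDs; a real matcher would consult the ontology, not a fixed set.
examples = PatternExamples("leiomyosarcoma by location",
                           positive=["EX:0000001"], negative=["EX:0000002"])
conforming = {"EX:0000001"}
print(check_examples(examples, lambda cls: cls in conforming))
```

Passing the matcher in as a callback keeps the check independent of where it runs, which matters if the pattern library does not have the axioms needed to evaluate a given example.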
include minimal metadata
document rules
some patterns may be associated with rules: sparql, regexes, python, ... document these
be specific with range constraints
avoid owl:Thing
consider unions rather than going up the hierarchy if a specific class doesn't exist
challenges: for upper level terms we want to use cob but it is not yet ready
be careful not to make range constraints too specific, accidentally preventing some classes from being matched. This is why examples / unit tests (see above) are vital
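A simple lint for the "avoid owl:Thing" rule might look like the following sketch. It assumes var ranges are available as a plain mapping from var name to a range expression string; the function name and the set of labels treated as too broad are illustrative assumptions.

```python
# Range expressions treated as too broad in this sketch (an assumption).
TOO_BROAD = {"owl:Thing", "Thing"}


def overly_broad_ranges(var_ranges: dict[str, str]) -> list[str]:
    """Return the names of vars whose range is effectively unconstrained."""
    flagged = []
    for var, range_expr in var_ranges.items():
        # Normalise surrounding whitespace and single quotes before comparing.
        if range_expr.strip().strip("'") in TOO_BROAD:
            flagged.append(var)
    return flagged


var_ranges = {
    "location": "'anatomical structure'",
    "feature": "owl:Thing",
    "site": "'cell' or 'anatomical structure'",  # a union instead of going up the hierarchy
}
print(overly_broad_ranges(var_ranges))
```

This only catches ranges that are too broad; ranges that are too narrow still need examples to detect.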
patterns should be disjoint
this is more of an aspiration at the moment
consider 2 patterns
any class that conforms to the 2nd will also conform to the first. Ideally we could extend dosdp to be able to say: the range of this class is a proper subclass of cancer