Support for qualifiers #325

Closed
wants to merge 2 commits into from

Collaborator

@vdancik vdancik commented Apr 7, 2022

I propose to separate the qualifiers from other attributes to make it easier to express the full semantics of complex statements in the Biolink model.

Collaborator

@edeutsch edeutsch left a comment


I think we need some more information from the Data Modeling Group on exactly what qualifiers look like before we can design something. Looking at the proposal, I'm thinking that qualifiers have an "aspect" and a "direction" which are both ontology terms. And they don't have values. So this seems rather unlike Attributes. So I am hesitant to model qualifiers as attributes. See: https://docs.google.com/document/d/1LTzwoRoHQOWtpr23q_CzWzJZHgBhEoTS2iCy5Mn5t8Q/edit#heading=h.cxntx0nr2fjg
subject: Methionine
subject_aspect: abundance
subject_direction: decreased
predicate: affects
qualified_predicate: causes
object: ADRB2
object_aspect: expression
object_direction: increased

@sierra-moxon
Member

where "qualifiers" are a collection of attribute-objects, just separated from other TRAPI attributes. (but following the same structured representation of a TRAPI attribute object).

@edeutsch
Collaborator

@sierra-moxon are the characteristics "subject_aspect" and "subject_direction" fairly rigid and complete, or are there potentially many such characteristics? (and I use the word "characteristic" in a general sense instead of a Biolink term of art, if there is one)

@sierra-moxon
Member

I imagine there will be a few more besides aspect, specialization, and direction, but we'll avoid an over-proliferation of qualifier types. Is the question targeted at figuring out which model components to separate into the "qualifier" section in TRAPI?

I do imagine we'll make a grouping slot in Biolink Model for all "qualifier types" with a relatively flat set of children. Something like:

qualifier:
     subject_aspect:
     object_aspect:
     subject_direction:
     object_direction:
     subject...

The qualifier collection in TRAPI could then be defined as all attributes with the Biolink Model type 'qualifier' or one of its children.
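
A minimal sketch of that selection rule, assuming a toy slot hierarchy held in a plain dictionary. The CURIEs, the `SLOT_PARENT` table, and the helper names are hypothetical placeholders, not identifiers confirmed by Biolink Model or TRAPI:

```python
# Hypothetical slot hierarchy: each slot maps to its parent (None for a root).
# The CURIEs below are placeholders, not confirmed Biolink Model identifiers.
SLOT_PARENT = {
    "biolink:qualifier": None,
    "biolink:subject_aspect": "biolink:qualifier",
    "biolink:object_aspect": "biolink:qualifier",
    "biolink:subject_direction": "biolink:qualifier",
    "biolink:object_direction": "biolink:qualifier",
    "biolink:p_value": None,  # stands in for an ordinary, non-qualifier attribute type
}


def is_qualifier(slot):
    """Return True if slot is 'biolink:qualifier' or descends from it."""
    seen = set()
    while slot is not None and slot not in seen:
        if slot == "biolink:qualifier":
            return True
        seen.add(slot)  # guards against accidental cycles in the table
        slot = SLOT_PARENT.get(slot)
    return False


def split_attributes(attributes):
    """Partition attribute dicts into (qualifiers, others) by their type."""
    qualifiers = [a for a in attributes if is_qualifier(a["attribute_type_id"])]
    others = [a for a in attributes if not is_qualifier(a["attribute_type_id"])]
    return qualifiers, others


edge_attributes = [
    {"attribute_type_id": "biolink:subject_aspect", "value": "abundance"},
    {"attribute_type_id": "biolink:object_direction", "value": "increased"},
    {"attribute_type_id": "biolink:p_value", "value": 0.01},
]
qualifiers, others = split_attributes(edge_attributes)
```

Under these assumptions, the two aspect/direction entries land in `qualifiers` and the remaining entry stays in `others`.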

@edeutsch
Collaborator

I suppose my main point of musing is whether it makes sense to try to re-use Attribute. The proposal says yes, but I'm thinking no. Attribute has the following properties:

  • attribute_type_id
  • original_attribute_name
  • value
  • value_type_id
  • attribute_source
  • value_url
  • description
  • attributes

This seems like overkill for qualifiers. I suppose I'm trying to establish whether we need any more than two properties for qualifiers. Maybe all we should have is two properties:
Qualifier:

  • qualifier_type (subject_aspect, subject_specialization, subject_direction, qualified_predicates, etc. selected from a set of allowed values in Biolink)
  • qualifier_term (abundance, expression, increases, causes, etc. selected from a set of allowed values in Biolink)

Is that perhaps all we should have?
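
To make the two-property idea concrete, here is a rough sketch using the Methionine/ADRB2 example from earlier in the thread. The `Edge` class is a toy stand-in, not the TRAPI Edge schema, and the `qualifiers` field name is an assumption made for illustration:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Qualifier:
    qualifier_type: str  # e.g. "subject_aspect"
    qualifier_term: str  # e.g. "abundance"


@dataclass
class Edge:
    # Toy edge for illustration only; field names are assumptions.
    subject: str
    predicate: str
    object: str
    qualifiers: List[Qualifier] = field(default_factory=list)


edge = Edge(
    subject="Methionine",
    predicate="affects",
    object="ADRB2",
    qualifiers=[
        Qualifier("subject_aspect", "abundance"),
        Qualifier("subject_direction", "decreased"),
        Qualifier("qualified_predicate", "causes"),
        Qualifier("object_aspect", "expression"),
        Qualifier("object_direction", "increased"),
    ],
)
```

Each qualifier is just a type/term pair, with none of the extra Attribute fields.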

Alternatively, maybe it would be easier or better to bake a rigid Qualifier class into TRAPI with all the properties we envision:
Qualifier:

  • predicate_qualifier:
  • subject_aspect:
  • object_aspect:
  • subject_direction:
  • object_direction:
  • subject_specialization:
  • object_specialization:

and each Edge may only have one Qualifier object. But the structure is locked down. This prevents some semantically invalid things like having two predicate_qualifiers or two subject aspects, which is legal in a list of Qualifier(qualifier_type, qualifier_term) but nonsensical.
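
The trade-off can be sketched as follows. `RigidQualifier` and `duplicated_types` are hypothetical names; the field list is copied from the bullets above. With fixed fields a duplicate cannot be expressed at all, whereas the list form needs an explicit check:

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class RigidQualifier:
    """Fixed-structure variant: one named field per qualifier type,
    so each type can hold at most one value."""
    predicate_qualifier: Optional[str] = None
    subject_aspect: Optional[str] = None
    object_aspect: Optional[str] = None
    subject_direction: Optional[str] = None
    object_direction: Optional[str] = None
    subject_specialization: Optional[str] = None
    object_specialization: Optional[str] = None


def duplicated_types(pairs):
    """For the list form: return qualifier types that occur more than once."""
    counts = Counter(qualifier_type for qualifier_type, _ in pairs)
    return sorted(t for t, n in counts.items() if n > 1)


# The list form accepts two subject aspects unless something checks for it.
loose = [
    ("subject_aspect", "abundance"),
    ("subject_aspect", "expression"),
    ("object_direction", "increased"),
]
duplicates = duplicated_types(loose)

# The rigid form has exactly one slot per type; unused slots stay None.
rigid = RigidQualifier(subject_aspect="abundance", object_direction="increased")
```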

A key question is how rigid is the list of qualifier types going to be?

@ehinderer
Contributor

I think the idea of the qualifier object having a set structure is a good idea, but I'm wondering how it would get complicated in the "onion" model of the qualifier proposal. I think the plan was to also have qualifiers at the statement and metadata level. Would the same "qualifier" object be used for all levels? If so, there would need to be additional properties to cover those as well--e.g., statement_qualifier, statement_quantifier, etc. I suppose once all properties are defined, a generic qualifier object could be created, but would have null values for those instances where the qualifier property isn't applicable. But other semantically invalid things would emerge, like a subject node having a non-null statement qualifier.

@mbrush
Collaborator

mbrush commented May 3, 2022

@ehinderer makes a good point. The qualifiers Eric included in his Qualifier object above are just the set that refine/extend the meaning of the subject or object concept in an Association. The set of Statement-level qualifiers will be more numerous, as they will be tailored specifically for different types of Associations. This set will continue to expand as we come across new Statement types and use cases, whereas the subject/object qualifiers are intended to be a small set of more general purpose slots, and should not need to be expanded to accommodate new sources/types of Statements.

@ehinderer
Contributor

I mentioned this in a discussion with some of our engineering contractors. I made the analogy that these collections of properties on each qualifier type are like an abstract base class. The suggestion was that each type of qualifier should contain only those properties that are necessary and sufficient for its use. Maybe the way we do this is by enumerating ALL types of qualifier classes in the qualifier model and then enumerating the properties each MUST contain and the properties each MAY contain? At that point we would have a "rigid" qualifier model that could be incorporated into TRAPI.
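
One way to picture the MUST/MAY idea is a small lookup table plus a validator. Everything here is hypothetical: the class names, and in particular which properties are marked "must" versus "may", are invented purely for illustration and were not decided anywhere in this discussion:

```python
# Hypothetical specification: which properties each qualifier class
# MUST carry and which it MAY carry. The split is invented for illustration.
QUALIFIER_SPEC = {
    "subject_qualifier": {
        "must": {"subject_aspect"},
        "may": {"subject_direction", "subject_specialization"},
    },
    "statement_qualifier": {
        "must": set(),
        "may": {"control_mechanism_qualifier", "context_qualifier"},
    },
}


def validate(qualifier_class, properties):
    """Return a list of problems; an empty list means the qualifier conforms."""
    spec = QUALIFIER_SPEC[qualifier_class]
    present = set(properties)
    errors = []
    for name in sorted(spec["must"] - present):
        errors.append(f"missing required property: {name}")
    for name in sorted(present - spec["must"] - spec["may"]):
        errors.append(f"property not allowed: {name}")
    return errors


ok = validate("subject_qualifier", {"subject_aspect": "abundance"})
bad = validate(
    "subject_qualifier",
    {"subject_direction": "decreased", "context_qualifier": "some context"},
)
```

A check like this would also catch the case raised earlier, where a subject-level qualifier carries a statement-level property.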

@sierra-moxon
Member

Are our association classes analogous here with respect to the "rigid qualifier model"?

ChemicalAffectsGeneAssociation
subject: Chemical Entity 
subject_specialization: enum <<ChemicalSpecializationEnum>>
subject_aspect:  enum <<ChemicalAspectEnum>>
subject_direction: enum <<DirectionEnum>>
predicate: affects
qualified_predicate: causes
object: Gene or Gene Product
object_specialization: enum <<GeneSpecializationEnum>>
object_aspect:  enum <<GeneAspectEnum>>
object_direction: enum <<DirectionEnum>>
control_mechanism_qualifier: enum <<ControlMechanism>>
context_qualifier: enum <<Context>>

@ehinderer
Contributor

Well, from that example (and my experience with the data modeling group discussions) I would infer that the following qualifier types/classes (and their properties) exist:

subject_qualifier (points to a subject node in a statement)
- subject_specialization
- subject_aspect
- subject_direction
predicate_qualifier (points to the predicate in a statement)
- qualified_predicate
object_qualifier (points to an object node in a statement)
- object_specialization
- object_aspect
- object_direction
statement_qualifier (points to a statement triple)
- control_mechanism_qualifier
- context_qualifier
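
As a rough illustration, the grouping above can be written as a lookup table that routes each property to the part of the statement it qualifies. The table mirrors the list in this comment; the names `QUALIFIER_CLASSES`, `PROPERTY_TO_CLASS`, and `group_by_class` are hypothetical:

```python
# Qualifier class -> (what it points to, the properties it carries),
# transcribed from the list above.
QUALIFIER_CLASSES = {
    "subject_qualifier": (
        "subject",
        ["subject_specialization", "subject_aspect", "subject_direction"],
    ),
    "predicate_qualifier": ("predicate", ["qualified_predicate"]),
    "object_qualifier": (
        "object",
        ["object_specialization", "object_aspect", "object_direction"],
    ),
    "statement_qualifier": (
        "statement",
        ["control_mechanism_qualifier", "context_qualifier"],
    ),
}

# Reverse index: property name -> qualifier class.
PROPERTY_TO_CLASS = {
    prop: cls
    for cls, (_, props) in QUALIFIER_CLASSES.items()
    for prop in props
}


def group_by_class(properties):
    """Group a flat {property: value} mapping by qualifier class.

    Raises KeyError for a property that is not in the table.
    """
    grouped = {}
    for name, value in properties.items():
        grouped.setdefault(PROPERTY_TO_CLASS[name], {})[name] = value
    return grouped


grouped = group_by_class(
    {
        "subject_aspect": "abundance",
        "qualified_predicate": "causes",
        "object_direction": "increased",
    }
)
```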

@mbrush
Collaborator

mbrush commented May 4, 2022

@ehinderer I'll admit I don't quite follow the analogy you draw between qualifier types/properties and the notion of an abstract base class, and your proposal based on this - but am keen to follow up about it.

As for my current preference for representing qualifiers in TRAPI, I would favor a dedicated qualifier property on the Edge object, as @vdancik proposes in this PR, and defining a Qualifier type to use as the value of this property, which holds just the two properties (Qualifier.qualifier_type, and Qualifier.qualifier_value).

We could of course use the existing Attribute object instead, as it has at its core a key-value pair of properties - but I agree with @edeutsch that the additional Attribute fields are not relevant for qualifiers. I'd also prefer not to characterize critical qualifier semantics that contribute to the core Statement put forth in an Association, as mere 'Attributes' - IMO qualifiers are special and distinct enough from other types of info captured in Attributes to warrant their own 'type' in the model.

In summary, this approach would require the following:

  • a new property called qualifier be added to the Edge class in the TRAPI spec
  • a new Qualifier class, with two properties of its own (qualifier_type, and qualifier_value) be added to the TRAPI spec
  • Qualifier.qualifier_type values would be drawn from a hierarchy of qualifier slots to be added to the Biolink model (e.g. subject_aspect, subject_direction, object_aspect, object_direction, etc.).
  • Qualifier.qualifier_value values would be drawn from enumerations defined in the Biolink model.
  • These enums get bound to qualifier slots in the context of a specific Association type - e.g. the ChemicaltoGeneAssociation.object_aspect slot would be bound to an enum that holds terms representing aspects of Genes/GeneProducts.
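
The binding of enums to slots per Association type could look roughly like this. It is only a sketch: the enum members are limited to terms that appear in this thread, the association name is taken from the ChemicalAffectsGeneAssociation example above, and `BINDINGS` and `is_valid` are hypothetical names rather than part of Biolink or TRAPI:

```python
from enum import Enum


# Placeholder enums; real Biolink enumerations would hold many more terms.
class ChemicalAspectEnum(Enum):
    ABUNDANCE = "abundance"


class GeneAspectEnum(Enum):
    EXPRESSION = "expression"


class DirectionEnum(Enum):
    INCREASED = "increased"
    DECREASED = "decreased"


# (association type, qualifier_type) -> enum of permitted qualifier_value terms.
BINDINGS = {
    ("ChemicalAffectsGeneAssociation", "subject_aspect"): ChemicalAspectEnum,
    ("ChemicalAffectsGeneAssociation", "subject_direction"): DirectionEnum,
    ("ChemicalAffectsGeneAssociation", "object_aspect"): GeneAspectEnum,
    ("ChemicalAffectsGeneAssociation", "object_direction"): DirectionEnum,
}


def is_valid(association, qualifier_type, qualifier_value):
    """Check a qualifier value against the enum bound for this association."""
    enum_cls = BINDINGS.get((association, qualifier_type))
    if enum_cls is None:
        return False  # no binding known for this slot in this association
    return qualifier_value in {member.value for member in enum_cls}


gene_ok = is_valid("ChemicalAffectsGeneAssociation", "object_aspect", "expression")
gene_bad = is_valid("ChemicalAffectsGeneAssociation", "object_aspect", "abundance")
```

The same slot name can thus accept different terms depending on the Association type, while the Qualifier object itself stays a plain type/value pair.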

This is not a rigid model - as the semantics of all qualifier slots are still defined in the data, and not baked into the TRAPI spec. Accordingly, it requires minimal change to the TRAPI spec. But as our use of this flexible approach matures, and patterns of use emerge, it may be that we can move toward integrating more semantics into the spec itself where it is useful to do so.

@ehinderer
Contributor

ehinderer commented May 4, 2022

@mbrush It's entirely possible that I'm misunderstanding something, or it's simply a bad analogy, so no worries. I think I'm leaning toward it being a bad analogy, because your proposal is in line with what I have in mind.

edeutsch added a commit that referenced this pull request May 5, 2022
Alternative to #325, based on discussion in #325
@edeutsch
Collaborator

At the 2022-05-05 call it was decided that the #326 proposal is favored. This PR proposal will be CLOSED without merging. But leaving it open for a little while longer for visibility.

@edeutsch
Collaborator

Closing this proposal in favor of #330

@edeutsch edeutsch closed this Jun 15, 2022
@edeutsch edeutsch deleted the qualifiers branch July 6, 2022 03:57