Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update modeling to reflect qualifier changes for chemical -> gene associations #512

Closed
andrewsu opened this issue Oct 4, 2022 · 33 comments
Assignees

Comments

@andrewsu
Copy link
Member

andrewsu commented Oct 4, 2022

This is a new issue to track our team's implementation of this issue on qualifiers for chemical-gene associations: NCATSTranslator/ReasonerAPI#348

From that issue:

The schema changes are here: https://github.com/NCATSTranslator/ReasonerAPI/pull/330/files
and there are some examples here: https://github.com/NCATSTranslator/ReasonerAPI/tree/master/examples/Message

According to the progress tracker, the target completion date for this issue is 2022-10-25

related/broader issue on constraints: #482

@andrewsu
Copy link
Member Author

andrewsu commented Oct 5, 2022

Steps

  1. update SmartAPI annotations (Target completion as of 10/11 is 11/1)
    • Minimal solution: find/replace examples of "targets", "upregulates", etc.
    • TODO: @colleenXu to find/create a single example of an API edge that needs to be changed; check a modified version into a branch so we can test using API overrides
  2. make sure BTE passes qualifiers (including qualified predicates, subject/object qualifiers) through in TRAPI results (Target completion as of 10/11 is 11/15)
    • need to make sure we handle edge reversal/inversion to match query direction
  3. BTE respects qualifiers in queries (Target completion as of 10/11 is 11/15)
    • check on reverse operations (if we have A - [upregulates] -> B, need to be able to query B - [upregulated_by] -> A) (SmartAPI annotations are explicitly declared both ways)
    • confirm whether Biolink 3.0 contains a hierarchy of qualifier_values (e.g., localization -> [tissue_localization, subcellular_localization])

@andrewsu
Copy link
Member Author

andrewsu commented Oct 7, 2022

Additional notes from my conversation with Sierra and Matt B.:

subject: Bisphenol A
predicate: affects 
qualified_predicate: causes
object: ESR1
object_aspect_qualifier: degradation
object_direction_qualifier: decreased

One interpretation of this record is the simple SPO triple -- Bisphenol A affects ESR1 -- and one is the fully qualified triple -- Bisphenol A causes decreased degradation of ESR1

@colleenXu
Copy link
Collaborator

@colleenXu
Copy link
Collaborator

colleenXu commented Oct 10, 2022

Biolink model lays out "chemical-gene associations" starting line 10301-10399, 11467-11485 (physically_interacts_with, affects, regulates)

For the possible values, it's not clear to me what the case is (snake_case?) UPDATE: they are supposed to be snake_case.

Subject (and object qualifiers):

  • subject_aspect_qualifier: see possible values in the list below
  • subject_form_or_variant_qualifier. Possible values:
    • modified form
    • mutated form
    • polymorphism
    • SNP
    • analog
  • subject_part_qualifier (line 1250)
    • 3' UTR
    • 5' UTR
    • polyA tail
    • promoter
    • enhancer
    • exon
    • intron
  • subject_derivative_qualifier (line 1240). Possible value: metabolite
  • subject_context_qualifier (line 1260)
  • subject_direction_qualifier (line 1270, 11648). Possible values:
    • increased
      • upregulated
    • decreased
      • downregulated

qualified_predicate

  • causes (for affects)? Inverse is caused_by

statement qualifier?

  • causal_mechanism_qualifier (line 1324, 11772). Possible values:
    • binding
      • inhibition
        • antibody inhibition
        • antagonism
        • molecular channel blockage
        • inverse agonism
        • negative allosteric modulation
      • activation
        • agonism
        • molecular channel opening
        • positive allosteric modulation
      • potentiation
      • inducer
      • transcriptional regulation
      • signaling-mediated control
  • anatomical_context_qualifier (line 1331, 11639). use UBERON.
  • species_context_qualifier (line 1341)

Possible aspect values:

  • activity or abundance
    • abundance
      • expression
      • synthesis
      • degradation
        • cleavage
        • hydrolysis
    • activity
  • metabolic processing
  • mutation rate
  • stability
  • localization
  • transport
    • secretion
    • uptake
  • modification
    • acetylation
    • acylation
    • alkylation
    • amination
    • carbamoylation
    • ethylation
    • glutathionylation
    • glycation
    • glycosylation
    • glucuronidation
    • N-linked glycosylation (not a child of glycosylation?)
    • O-linked glycosylation (not a child of glycosylation?)
    • hydroxylation
    • lipidation
    • farnesylation (not a child of another term?)
    • geranoylation (not a child of another term?)
    • myristoylation (not a child of another term?)
    • palmitoylation (not a child of another term?)
    • prenylation (not a child of another term?)
    • methylation
    • nitrosation
    • nucleotidylation
    • phosphorylation
    • ribosylation
    • ADP-ribosylation
    • sulfation
    • sumoylation
    • ubiquitination
    • oxidation
    • reduction
    • carboxylation

Migration guide should have full list of deprecated predicates

My thoughts:

  • Hmmm they kept "affects response to" stuff?
  • Big change for us:
    • the "regulates" stuff.
  • small change:
    • go through code to remove predicates that are deprecated from tests / src stuff? (approved_to_treat is one example -> it was deprecated and is now removed from biolink 3)

@colleenXu
Copy link
Collaborator

Pasting from Slack:

Also I realized that what we still have a "reversing" issue for Part B: we query in whatever direction we need to retrieve records using IDs. And then we "invert" the predicate so the record/Edge matches the QEdge direction...

But now we would also need to "invert" some qualifiers (change all the "subject" to "object" and vice versa) o_0.

@tokebe
Copy link
Member

tokebe commented Oct 14, 2022

I've pushed the branch add-qualifiers with experimental support:

Todo list:

  • Qualifiers from x-bte passed into BTE's TRAPI response
  • Qualifiers from TRAPI KPs confirmed working/supported
  • Qualifiers stored in Records and are reversible
  • BTE supports Qualifier constraints in queries
  • Write tests to ensure all new behavior is tested

@andrewsu
Copy link
Member Author

Also to clarify, the that the predicate hierarchy (with deprecated predicates viewable at http://tree-viz-biolink.herokuapp.com/predicates/deprecated-predicates) will define the allowed values for both the predicate and qualified_predicate slots.

@colleenXu
Copy link
Collaborator

Will likely want to update BTE to biolink-model 3.0.3 for Part 3 of qualifiers, so will want to be aware of this issue #514

@colleenXu
Copy link
Collaborator

Noting: qualifiers currently aren't included in the TRAPI /meta_knowledge_graph (which could affect Parts 2 and 3) and discussion is still active on the format for querying with qualifiers (Part 3).

Related discussions (TRAPI 1.4 upcoming work and release):

  • TRAPI 1.4 prioritizing topics
  • topic on qualifiers and TRAPI /meta_knowledge_graph endpoint
  • topic on format for querying with qualifiers (particular qualified_predicate)

@tokebe
Copy link
Member

tokebe commented Oct 20, 2022

Support for TRAPI KP responses complete, tested via debug value injection.

On to part 3...

@tokebe
Copy link
Member

tokebe commented Oct 20, 2022

With regard to qualifier constraints from queries, I'm a little concerned that current implementation will be both complicated and somewhat fuzzy:

  • I'm not currently aware of a concrete metakg structure to communicate what qualifiers might be entailed in an operation. As we have an x-bte example, I can at least filter BTE's metakg by that to limit unnecessary querying, however I don't see documentation nor examples for TRAPI metakg representations, so I currently can only apply constraints post-subquery for TRAPI KPs.
    • If it's fine to only pre-filter x-bte and post-filter TRAPI, then I should be able to proceed with this.
  • I'll need to upgrade BTE to work with Biolink 3 to at least some extent to properly make use of the qualifier hierarchy. This much should be a little more straightforward, though there may be some less-than-predictable issues

Tagging @andrewsu @colleenXu for input/opinions.

@colleenXu
Copy link
Collaborator

For your first point, yeah that's....not ideal. I'm okay with you going forward with a mix, since the other two types of constraints (at least edge-attribute source stuff) may have a similar mix of pre- and post-query filters.

I imagine that they will want qualifiers added to the /meta_knowledge_graph in future TRAPI releases...so then we'd have to change that a bit in the future.


For the second point, thanks for the heads-up. I'll keep in mind that there's the rest of biolink-model 3 stuff to do after/with the qualifiers for my x-bte annotation work.

@colleenXu
Copy link
Collaborator

colleenXu commented Oct 27, 2022

@tokebe @andrewsu

Step 1 (update x-bte annotations for qualifier-refactor) is done as of this commit. I did modify a few of Text-Mining's operations.

I ended up doing find-replace for semmeddb operations, and I haven't refactored the semmeddb-operation-generating code yet.

Next I'll work on updating x-bte annotations for the biolink 3.0 migration (not-qualifier-related). #514

EDIT: from this commit to this commit, the APIs with adjustments for the qualifier-refactor are:

  • DGIdb
  • go_bp
  • semmeddb
  • text-mining targeted

@colleenXu
Copy link
Collaborator

colleenXu commented Nov 2, 2022

Update on Step 1

Note that the format (aka case) of qualifier values is still unclear. Internal Slack link.


Noting that qualifier-refactor discussions haven't happened with Multiomics Provider for their APIs:

  • Multiomics BigGIM-DrugResponse KP API
  • Multiomics ClinicalTrials KP
  • Multiomics EHR Risk KP API
  • Multiomics Wellness KP API

Their APIs didn't come up as I was doing my find-replace for all the deprecated predicates.


And there's discussion with Text-Mining Provider + Translator Data Modeling group on what predicates/qualifiers to use for "chemical-regulates-gene" relationships (internal Slack link). Depending on how the conversation goes, we may want to revise those operations for Text-Mining's API and other APIs that have "regulates" relationships

@colleenXu
Copy link
Collaborator

colleenXu commented Nov 4, 2022

News:

  • Translator data modeling group confirmed that "regulates" should only be used in specific cases
    • many gene-gene relationships will count. Many chemical-gene relationships will be "affects" instead.
    • Quoting Mike Bada: "regulators as biologically evolved things, particularly (endogenous) genes/gene products, as we've defined regulation as biologically evolved things affecting other things for their proper functioning"
    • Internal Slack post links here, here
  • Translator data modeling group confirmed (DM with Sierra Moxon) that qualifier-values should be snake_case (and all "enum" values)
  • Informed Multiomics + Text-Mining Providers of the qualifier-refactor situation, how BioThings-APIs + BTE-as-TRAPI-wrapper works. Planning to leave these teams to update their x-bte annotation for the qualifiers refactor
  • Text-Mining Provider has their own branch for doing qualifier-refactor work. Jackson has connected the dev instance to this branch's yaml
    • They've opened a PR for an update ("regulates" vs "affects") and I've provided initial feedback

X-bte annotation updates:

Update 2022-11-10: modified the code to generate semmeddb operations with the qualifier/biolink3 mappings we're currently doing.

@colleenXu
Copy link
Collaborator

@tokebe Added a commit to query-handler (add-qualifiers branch) to adjust templates for the predicate changes biothings/bte_trapi_query_graph_handler@3a0ca1a. I haven't tested this change yet.

I'm also not sure about adding qualifier-constraints

@tokebe
Copy link
Member

tokebe commented Nov 7, 2022

Once I have the implementation, I see no reason that you couldn't use qualifier-constraints. You can add them now if you want, since they won't break anything.

@tokebe
Copy link
Member

tokebe commented Nov 7, 2022

@colleenXu I've pushed an experimental implementation of qualifier constrain pre-filtering. Below is the rundown of the behavior:

  • The following pre-filtering is only applied if an operation doesn't have the bte-trapi tag, which is added by BTE when TRAPI ingesting operations. If the operation has the bte-trapi tag, it is kept (because the qualifier constraints will be passed in as part of the TRAPI subquery)
  • For each operation, if at least one of the qualifier constraint sets has all of its qualifiers present/matching in the operation, the operation is kept.
  • If the given qualifier constraint set all have matches in the operation, but the operation has additional qualifiers, the operation is kept
  • All operations 'kept' are subject to subsequent filtering passes, such as predicate, input/output IDs, etc.

If this behavior looks good, could you run a few tests to make sure the implementation is behaving as expected (I've run some preliminary tests and it looks fine)? I'll be working on ensuring everything's set for adding in biolink 3.

@colleenXu
Copy link
Collaborator

@tokebe
Copy link
Member

tokebe commented Nov 7, 2022

I'm aware of this set of rules. My implementation follows these; however these rules are very broad and do not cover the specifics of qualifier-constraint matching for a KP, which we must follow for our non-TRAPI KPs.

These implementation rules only specify "is compatible with", which I've interpreted both to the best of my ability and with my understanding from previous conversations. I'm asking if you could look over the rundown I wrote and confirm that that specific set of cases looks fine.

Biolink-3 implementation seems relatively straightforward with only some test fixes. @andrewsu does it make sense to leave hierarchy traversal for qualifiers (and qualifier-constraint pre-filtering) for another issue?

@andrewsu
Copy link
Member Author

andrewsu commented Nov 7, 2022

@andrewsu does it make sense to leave hierarchy traversal for qualifiers (and qualifier-constraint pre-filtering) for another issue?

Yes, I think that makes sense. This issue is already getting a little bit unwieldy, so breaking things down to a series of more focused issues seems like a good plan...

@tokebe
Copy link
Member

tokebe commented Nov 7, 2022

@colleenXu I'm rolling the biolink 3.0.3 rollout into the qualifiers issue given that they're both very related and the biolink 3 rollout won't require a lot of work.

What's left (some may be put in separate issues):

  • additional testing for qualifier constraints
  • write tests for qualifier support
  • implement biolink 3 qualifier hierarchy traversal for qualifier constraints -> new issue

Do we want to somehow add qualifiers to our /meta_knowledge_graph endpoint? There hasn't been any guidance on this from Translator that I'm aware of but I imagine it's something that will be expected eventually...

@colleenXu
Copy link
Collaborator

@tokebe I think it makes sense to roll the #514 and this qualifier issue into the same set of branches/PRs.

I plan to write up how I manually test qualifier-constraints - which may help for writing tests

I think the qualifier-value hierarchy traversal would definitely be a separate issue.

I don't think we need to add qualifiers to our /meta_knowledge_graph endpoint for this issue because we don't know the format yet (from TRAPI) and we're not leading the effort there...

@colleenXu
Copy link
Collaborator

colleenXu commented Nov 9, 2022

Feedback Part 1

Update 2022-11-15: with this code fix, the behavior with 1 set of 1 constraint looks mostly fixed. There is still an issue with qualified_predicate (described in the later comment).

Qualifier constraints doesn't seem to work: compare the first and second queries below

Testing setup

  • add-qualifier branches for
    • biolink-model
    • api-response-transform
    • bte-trapi (probably optional - I remember making this branch)
    • call-apis
    • query_graph_handler
    • smartapi-kg
  • checkout bte-trapi's dev branch's src/config/smartapi_overrides.json file. It has all the API overrides to use biolink3/qualifier-updated SmartAPI annotations
  • run API_OVERRIDE=true npm run smartapi_sync to get all the API overrides

First set of testing

I used DGIdb's x-bte annotation that has qualifiers as a source. And I ran my queries restricted to using DGIdb only (link has v1/smartapi/e3edd325c76f2992a111b43a907a4870/query).

First: as a baseline - no qualifier_constraints
query
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids":["NCBIGene:9132"],
                    "categories":["biolink:Gene"]
                },
                "n1": {
                    "categories":["biolink:SmallMolecule"]
               }
            },
            "edges": {
                "e1": {
                    "subject": "n0",
                    "object": "n1"
                }
            }
        }
    }
}

KCNQ4.txt

Currently, I get 9 results with 11 Edges:

  • 1 result Celecoxib has only 1 Edge that doesn't have qualifiers (predicate is interacts_with)
  • 5 results have Edges with qualifiers decreased activity molecular_channel_blockage: Guanidine hydrochloride, Nerispirdine, 4-Aminopyridine, TEDISAMIL, Bepridil
  • 1 result Linopirdine has Edge with qualifiers decreased activity_or_abundance inhibition
  • 1 result Flindokalner has Edge with qualifiers increased activity activation
  • 1 result Retigabine has 3 Edges (one with interacts_with, one with the decreased activity molecular_channel_blockage qualifiers and one with the increased activity activation).
Second: 1 set of 1 constraint
query
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids":["NCBIGene:9132"],
                    "categories":["biolink:Gene"]
                },
                "n1": {
                    "categories":["biolink:SmallMolecule"]
               }
            },
            "edges": {
                "e1": {
                    "subject": "n0",
                    "object": "n1",
                    "qualifier_constraints": [
                        {
                            "qualifier_set": [
                                {
                                    "qualifier_type_id": "biolink:subject_direction_qualifier",
                                    "qualifier_value": "increased"
                                }
                            ]
                        }
                    ]
                }
            }
        }
    }
}

Compared to the first query, I'd expect to get only 2 of the results, Flindokalner and Retigabine, since both had edges with the qualifiers increased activity activation.

Instead, I'm getting 8 results (all the results with "affected_by" as the predicate).

@colleenXu
Copy link
Collaborator

colleenXu commented Nov 11, 2022

Feedback Part 2

Update 2022-11-15: with this code fix, the behavior looks fixed. Yay!

The "reversing" of qualifiers to match the QEdge direction isn't working as-expected. I'm running my queries restricted to using DGIdb only (link has v1/smartapi/e3edd325c76f2992a111b43a907a4870/query).

If I run a query like Gene -(affected_by)-> SmallMolecule X, I expect BTE to "reverse" the edge for execution (so it can start with the SmallMolecule X ID).

BTE will then run the sub-query SmallMolecule X -(affects)-> Gene. I know DGIdb's x-bte annotation will have the "causes" qualified_predicate and "object_*" qualifiers in the sub-query responses.

query to test correct reversal
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids":["CHEMBL.COMPOUND:CHEMBL730"],
                    "categories":["biolink:SmallMolecule"],
                    "name": "NITROGLYCERIN"
                },
                "n1": {
                    "categories":["biolink:Gene"]
               }
            },
            "edges": {
                "e1": {
                    "subject": "n1",
                    "object": "n0",
                    "predicates": ["biolink:affected_by"]
                }
            }
        }
    }
}

Afterwards, BTE should reverse the predicate (affects) and the qualifiers for the Edges, so the Edges in the response have the predicate "affected_by" (same direction as QEdge), "caused_by" as the qualified_predicate, and "subject_*" qualifiers.

However, the current code doesn't correctly reverse the qualifiers. It does correctly reverse the predicate.

Here's the response I get to the query:
example_reverse.txt

@colleenXu
Copy link
Collaborator

colleenXu commented Nov 11, 2022

Feedback Part 3

Update 2022-11-15: with this code fix, (2) seems to be working. (1) isn't fully working - there are still issues with qualified_predicate (listed in a later comment)

I deleted a post and edited headings for the two posts above, to make these main concerns clearer. I'm also organizing the various things that have come up, in this post.

Here are some smaller features we want after recent discussions:

  1. when qualifier_type_id is qualified_predicate, the qualifier_value should have a biolink-prefix ("biolink:causes" or "biolink:caused_by"). This is for qualifiers in Edges and qualifier_constraints in QEdges.
    • @tokebe 2022-11-11 said we can do this in code, so we don't have to add the biolink-prefix to x-bte annotation (similar to how we treat predicates + node categories)
  2. query-validation:
    • UPDATE 2022-11-14 6pm PT, after discussion with Andrew: this is optional. We don't have to worry about query-validation because we should expect queries to the guidelines set by Data-Modeling/TRAPI groups.
    • "within 1 qualifier_set, each object must have a unique value for the qualifier_type_id field". This is what I think point 1i of this doc means.
      • If there are objects in a set with the same qualifier_type_id, I guess we'd want BTE to not execute further and return an informative error (there are two qualifier_constraints in the same set that have the same qualifier_type_id; please modify the query so each qualifier_type_id in a set is unique)

UPDATE 2022-11-14 6pm PT, after discussion with Andrew:

Don't worry about TRAPI KPs and how they'd handle "multiple curies" + "qualifier constraints":

  • TRAPI KPs should fail gracefully if they can't handle this, and BTE should be able to handle failures from sub-queries to TRAPI KPs

We're also asking for clarification...
According to the TRAPI 1.3.0 spec, qualifier_constraints are not supposed to be on a QEdge when the QNodes of that specific QEdge have multiple curies. BTE's implementation will likely handle multiple curies + qualifier_constraints without issues...but this may have consequences for how we do TRAPI-API sub-querying with qualifier_constraints:

  • Do we have to send IDs one-at-a-time when that hop has qualifier_constraints?
  • Do we have to send the query without qualifier_constraints, and then do post-filtering with the qualifier_constraints?

Some notes:

  • Right now, qualifier_value values that are enums from biolink-model DO NOT need the biolink-prefix
  • Theoretically you could query with qualifier_constraints and without specifying a predicate. I haven't seen any requirement forbidding this. BTE's implementation will likely be able to handle this.
  • According to the TRAPI 1.3.0 spec, qualifier_constraints are not supposed to be on a QEdge in the following situations. However, @tokebe and I think BTE can actually handle these. We don't see "implementation rules" on how to respond to these situations (like stopping execution / error messages).
    • that specific QEdge has multiple predicates specified.
      • This isn't an issue for TRAPI-API sub-querying because we only query 1 operation at a time (which has only 1 predicate)
    • the QNodes of that specific QEdge have multiple categories specified.
      • This situation comes up regularly due to the difference between the categories returned by NodeNormalizer and the categories specified on the QNodes (by the user).
      • And this isn't an issue for TRAPI-API sub-querying because we only query 1 operation at a time (which has only 1 category for subject and 1 for object)

Future issues from this "requirements" doc:

  • qualifier_value's value can be an enum from the biolink-model, and we're expected to "expand" qualifier_constraint terms using the enum hierarchy (for example, querying for activity_or_abundance should bring back Edges that have expression)
  • qualifier_value's value can also be an ontology-term (curie) and we're expected to "expand" qualifier_constraint terms using the ontology...
    • It's not clear if only specific ontologies can be the value, for specific qualifier_type_ids...

@tokebe
Copy link
Member

tokebe commented Nov 15, 2022

@colleenXu I've pushed fixes to each respective package.

  • Fixed constraints. For some reason one change required to pass constraints into the filter criteria didn't get committed
  • Fixed reversing. Reversed records weren't building using the newly reversed qualifiers.
  • Fixed biolink: prefix missing
  • Added validation for duplicate qualifier types within a set.

@colleenXu
Copy link
Collaborator

colleenXu commented Nov 16, 2022

Feedback round 2

EDIT: done retesting. From my testing POV, there were only 2 small things with qualified_predicate, which are probably related and easily fixed. Once these are addressed, this issue (Steps 1-3) may be complete...

Qualified predicates

Note: Testing setup is the same as above (aka using only DGIdb at the moment).)

  1. Qualified_predicate in a qualifier_constraint is not working when the value (predicate) includes the biolink-prefix (see example query below), which is the format we're supposed to follow. Currently, BTE finds no matching metaKG edges.
    I notice that if I provide the qualified predicate without the biolink-prefix, it works.
    ATP_1.txt
correct qualified_predicate-format example
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids":["CHEMBL.COMPOUND:CHEMBL14249"],
                    "categories":["biolink:SmallMolecule"],
                    "name": "ATP"
                },
                "n1": {
                    "categories":["biolink:Gene"]
               }
            },
            "edges": {
                "e1": {
                    "subject": "n0",
                    "object": "n1",
                    "qualifier_constraints": [
                        {
                            "qualifier_set": [
                                {
                                    "qualifier_type_id": "biolink:qualified_predicate",
                                    "qualifier_value": "biolink:causes"
                                }
                            ]
                        }
                    ]
                }
            }
        }
    }
}
  1. BTE isn't finding metaKG edges when I test the qualified_predicate constraint with a QEdge that'll need reversal, even when the predicate biolink-prefix is removed:
"reversed QEdge" version of the query above

Look carefully at what the QEdge subject and object are (Gene -> ATP) and the qualified predicate.

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids":["CHEMBL.COMPOUND:CHEMBL14249"],
                    "categories":["biolink:SmallMolecule"],
                    "name": "ATP"
                },
                "n1": {
                    "categories":["biolink:Gene"]
               }
            },
            "edges": {
                "e1": {
                    "subject": "n1",
                    "object": "n0",
                    "qualifier_constraints": [
                        {
                            "qualifier_set": [
                                {
                                    "qualifier_type_id": "biolink:qualified_predicate",
                                    "qualifier_value": "biolink:caused_by"
                                }
                            ]
                        }
                    ]
                }
            }
        }
    }
}

Like the query above, there should be 15 results when the metaKG edges are found.


Correction #1: Having two constraints in the same set is working.

It's expected to work as "AND" (both conditions must be met on one metaKG edge).

compare query with 1 constraint vs 2 constraints

First, the query w/ 1 constraint has 3 results. One of the results, TRPM4, has two edges - one for "activity" and one for "activity_or_abundance".
ATP_decreased.txt

1 constraint
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids":["CHEMBL.COMPOUND:CHEMBL14249"],
                    "categories":["biolink:SmallMolecule"],
                    "name": "ATP"
                },
                "n1": {
                    "categories":["biolink:Gene"]
               }
            },
            "edges": {
                "e1": {
                    "subject": "n0",
                    "object": "n1",
                    "qualifier_constraints": [
                        {
                            "qualifier_set": [
                                {
                                    "qualifier_type_id": "biolink:object_direction_qualifier",
                                    "qualifier_value": "decreased"
                                }
                            ]
                        }
                    ]
                }
            }
        }
    }
}

Compare it to the response with 2 constraints (AND). While it also has 3 results, TRPM4, only has 1 edge - the one for "activity" to match the 2nd constraint.
ATP_decreased_activity.txt

1 constraint
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids":["CHEMBL.COMPOUND:CHEMBL14249"],
                    "categories":["biolink:SmallMolecule"],
                    "name": "ATP"
                },
                "n1": {
                    "categories":["biolink:Gene"]
               }
            },
            "edges": {
                "e1": {
                    "subject": "n0",
                    "object": "n1",
                    "qualifier_constraints": [
                        {
                            "qualifier_set": [
                                {
                                    "qualifier_type_id": "biolink:object_direction_qualifier",
                                    "qualifier_value": "decreased"
                                },
                                {
                                    "qualifier_type_id": "biolink:object_aspect_qualifier",
                                    "qualifier_value": "activity"
                                }
                            ]
                        }
                    ]
                }
            }
        }
    }
}

Correction #2: Having 2 sets of constraints (1 in each) is working.

It's expected to work as "OR" (at least one condition must be met on one metaKG edge).

example with 2 sets of constraints
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids":["CHEMBL.COMPOUND:CHEMBL14249"],
                    "categories":["biolink:SmallMolecule"],
                    "name": "ATP"
                },
                "n1": {
                    "categories":["biolink:Gene"]
               }
            },
            "edges": {
                "e1": {
                    "subject": "n0",
                    "object": "n1",
                    "qualifier_constraints": [
                        {
                            "qualifier_set": [
                                {
                                    "qualifier_type_id": "biolink:object_direction_qualifier",
                                    "qualifier_value": "increased"
                                }
                            ]
                        },
                        {
                            "qualifier_set": [
                                {
                                    "qualifier_type_id": "biolink:object_direction_qualifier",
                                    "qualifier_value": "decreased"
                                }
                            ]
                        }
                    ]
                }
            }
        }
    }
}

This will return 15 results - some with increased and some with decreased qualifiers.


Useful basic queries for qualifier-constraints:

  • KCNQ4 NCBIGene:9132 -(affected_by)-> chem
  • IGF1R NCBIGene:3480 -(affected_by)-> chem
  • RYR2 NCBIGene:6262 -(affected_by)-> chem
  • ATP CHEMBL14249 -(affects)-> Gene
  • Quetiapine CHEMBL716 -(affects)-> Gene

Each of these has at least 3 kinds of qualifier-combos (increased activity, decreased activity, decreased activity_or_abundance)

@tokebe
Copy link
Member

tokebe commented Dec 7, 2022

@colleenXu I pushed fixes for these, currently working on fixing any broken tests (most failing tests on the query_graph_handler are due to changes causing qEdge hashes to change).

Edit: there are still a few more minor fixes to be made, will update when everything's ready for another round of testing.

@colleenXu
Copy link
Collaborator

colleenXu commented Dec 9, 2022

Tested the latest code for qualified_predicate handling. Now it works, whether the value has the biolink prefix or not.

Did some quick testing of other queries with qualified predicates -> these still work as well.

@tokebe
Copy link
Member

tokebe commented Dec 9, 2022

Alright, I think I can leave it to a future validation issue for not supporting non-prefixed values, if we care about that.

@colleenXu
Copy link
Collaborator

Don't think we care about it :P

@tokebe
Copy link
Member

tokebe commented Mar 28, 2023

Deployed to prod 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants