-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update modeling to reflect qualifier changes for chemical -> gene associations #512
Comments
Steps
|
Additional notes from my conversation with Sierra and Matt B.:
One interpretation of this record is the simple SPO triple --
|
Biolink model lays out "chemical-gene associations" starting line 10301-10399, 11467-11485 (physically_interacts_with, affects, regulates) For the possible values, it's not clear to me what the case is (snake_case?) UPDATE: they are supposed to be snake_case. Subject (and object qualifiers):
qualified_predicate
statement qualifier?
Possible aspect values:
Migration guide should have full list of deprecated predicates My thoughts:
|
Pasting from Slack: Also I realized that what we still have a "reversing" issue for Part B: we query in whatever direction we need to retrieve records using IDs. And then we "invert" the predicate so the record/Edge matches the QEdge direction... But now we would also need to "invert" some qualifiers (change all the "subject" to "object" and vice versa) o_0. |
I've pushed the branch Todo list:
|
Also to clarify, the that the predicate hierarchy (with deprecated predicates viewable at http://tree-viz-biolink.herokuapp.com/predicates/deprecated-predicates) will define the allowed values for both the |
Will likely want to update BTE to biolink-model 3.0.3 for Part 3 of qualifiers, so will want to be aware of this issue #514 |
Noting: qualifiers currently aren't included in the TRAPI /meta_knowledge_graph (which could affect Parts 2 and 3) and discussion is still active on the format for querying with qualifiers (Part 3). Related discussions (TRAPI 1.4 upcoming work and release): |
Support for TRAPI KP responses complete, tested via debug value injection. On to part 3... |
With regard to qualifier constraints from queries, I'm a little concerned that current implementation will be both complicated and somewhat fuzzy:
Tagging @andrewsu @colleenXu for input/opinions. |
For your first point, yeah that's....not ideal. I'm okay with you going forward with a mix, since the other two types of constraints (at least edge-attribute source stuff) may have a similar mix of pre- and post-query filters. I imagine that they will want qualifiers added to the /meta_knowledge_graph in future TRAPI releases...so then we'd have to change that a bit in the future. For the second point, thanks for the heads-up. I'll keep in mind that there's the rest of biolink-model 3 stuff to do after/with the qualifiers for my x-bte annotation work. |
Step 1 (update x-bte annotations for qualifier-refactor) is done as of this commit. I did modify a few of Text-Mining's operations. I ended up doing find-replace for semmeddb operations, and I haven't refactored the semmeddb-operation-generating code yet. Next I'll work on updating x-bte annotations for the biolink 3.0 migration (not-qualifier-related). #514 EDIT: from this commit to this commit, the APIs with adjustments for the qualifier-refactor are:
|
Update on Step 1Note that the format (aka case) of qualifier values is still unclear. Internal Slack link. Noting that qualifier-refactor discussions haven't happened with Multiomics Provider for their APIs:
Their APIs didn't come up as I was doing my find-replace for all the deprecated predicates. And there's discussion with Text-Mining Provider + Translator Data Modeling group on what predicates/qualifiers to use for "chemical-regulates-gene" relationships (internal Slack link). Depending on how the conversation goes, we may want to revise those operations for Text-Mining's API and other APIs that have "regulates" relationships |
News:
X-bte annotation updates:
Update 2022-11-10: modified the code to generate semmeddb operations with the qualifier/biolink3 mappings we're currently doing. |
@tokebe Added a commit to query-handler (add-qualifiers branch) to adjust templates for the predicate changes biothings/bte_trapi_query_graph_handler@3a0ca1a. I haven't tested this change yet. I'm also not sure about adding qualifier-constraints |
Once I have the implementation, I see no reason that you couldn't use qualifier-constraints. You can add them now if you want, since they won't break anything. |
@colleenXu I've pushed an experimental implementation of qualifier constrain pre-filtering. Below is the rundown of the behavior:
If this behavior looks good, could you run a few tests to make sure the implementation is behaving as expected (I've run some preliminary tests and it looks fine)? I'll be working on ensuring everything's set for adding in biolink 3. |
@tokebe I found this set of implementation rules on qualifier_constraints https://github.com/NCATSTranslator/ReasonerAPI/blob/v1.3.0/ImplementationRules.md#qedgequalifier_constraints |
I'm aware of this set of rules. My implementation follows these; however these rules are very broad and do not cover the specifics of qualifier-constraint matching for a KP, which we must follow for our non-TRAPI KPs. These implementation rules only specify "is compatible with", which I've interpreted both to the best of my ability and with my understanding from previous conversations. I'm asking if you could look over the rundown I wrote and confirm that that specific set of cases looks fine. Biolink-3 implementation seems relatively straightforward with only some test fixes. @andrewsu does it make sense to leave hierarchy traversal for qualifiers (and qualifier-constraint pre-filtering) for another issue? |
Yes, I think that makes sense. This issue is already getting a little bit unwieldy, so breaking things down to a series of more focused issues seems like a good plan... |
@colleenXu I'm rolling the biolink 3.0.3 rollout into the qualifiers issue given that they're both very related and the biolink 3 rollout won't require a lot of work. What's left (some may be put in separate issues):
Do we want to somehow add qualifiers to our /meta_knowledge_graph endpoint? There hasn't been any guidance on this from Translator that I'm aware of but I imagine it's something that will be expected eventually... |
@tokebe I think it makes sense to roll the #514 and this qualifier issue into the same set of branches/PRs. I plan to write up how I manually test qualifier-constraints - which may help for writing tests I think the qualifier-value hierarchy traversal would definitely be a separate issue. I don't think we need to add qualifiers to our /meta_knowledge_graph endpoint for this issue because we don't know the format yet (from TRAPI) and we're not leading the effort there... |
Feedback Part 1Update 2022-11-15: with this code fix, the behavior with 1 set of 1 constraint looks mostly fixed. There is still an issue with qualified_predicate (described in the later comment).
|
Feedback Part 2Update 2022-11-15: with this code fix, the behavior looks fixed. Yay!
|
Feedback Part 3Update 2022-11-15: with this code fix, (2) seems to be working. (1) isn't fully working - there are still issues with qualified_predicate (listed in a later comment)
UPDATE 2022-11-14 6pm PT, after discussion with Andrew: Don't worry about TRAPI KPs and how they'd handle "multiple curies" + "qualifier constraints":
Some notes:
Future issues from this "requirements" doc:
|
@colleenXu I've pushed fixes to each respective package.
|
Feedback round 2EDIT: done retesting. From my testing POV, there were only 2 small things with qualified_predicate, which are probably related and easily fixed. Once these are addressed, this issue (Steps 1-3) may be complete... Qualified predicatesNote: Testing setup is the same as above (aka using only DGIdb at the moment).)
correct qualified_predicate-format example
"reversed QEdge" version of the query aboveLook carefully at what the QEdge subject and object are (Gene -> ATP) and the qualified predicate.
Like the query above, there should be 15 results when the metaKG edges are found. Correction #1: Having two constraints in the same set is working.It's expected to work as "AND" (both conditions must be met on one metaKG edge). compare query with 1 constraint vs 2 constraintsFirst, the query w/ 1 constraint has 3 results. One of the results, TRPM4, has two edges - one for "activity" and one for "activity_or_abundance". 1 constraint
Compare it to the response with 2 constraints (AND). While it also has 3 results, TRPM4, only has 1 edge - the one for "activity" to match the 2nd constraint. 1 constraint
Correction #2: Having 2 sets of constraints (1 in each) is working.It's expected to work as "OR" (at least one condition must be met on one metaKG edge). example with 2 sets of constraints
This will return 15 results - some with Useful basic queries for qualifier-constraints:
Each of these has at least 3 kinds of qualifier-combos (increased activity, decreased activity, decreased activity_or_abundance) |
@colleenXu I pushed fixes for these, currently working on fixing any broken tests (most failing tests on the query_graph_handler are due to changes causing qEdge hashes to change). Edit: there are still a few more minor fixes to be made, will update when everything's ready for another round of testing. |
Tested the latest code for qualified_predicate handling. Now it works, whether the value has the biolink prefix or not. Did some quick testing of other queries with qualified predicates -> these still work as well. |
Alright, I think I can leave it to a future validation issue for not supporting non-prefixed values, if we care about that. |
Don't think we care about it :P |
Deployed to prod 🚀 |
This is a new issue to track our team's implementation of this issue on qualifiers for chemical-gene associations: NCATSTranslator/ReasonerAPI#348
From that issue:
According to the progress tracker, the target completion date for this issue is 2022-10-25
related/broader issue on constraints: #482
The text was updated successfully, but these errors were encountered: