phase 1: processing TRAPI-1.4 KP sub-query responses (aux-graph/result.analyses refactor) #614

colleenXu · 2023-04-12T05:55:14Z

Background

info on TRAPI 1.4 KP aux-graph / result.analyses expectations in phase 1: aux-graph/result.analyses refactor for basic querying #603
provenance refactoring when ingesting TRAPI KP edges is covered in phase 1: provenance refactor for edges from some multiomics KPs, text-mining KP, TRAPI KPs #617
For now, we can continue our practice of ignoring the KP's scoring. Because a KP result's analyses.support_graphs should be related to its scoring, we can ignore that too.
we can ignore result.analyses.resource_id

Overview

I use this sub-query for my example results below

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids":["MONDO:0005377"],
                    "categories":["biolink:DiseaseOrPhenotypicFeature"],
                    "name": "noonan"
                },
                "n1": {
                    "categories":["biolink:Gene"]
                }
            },
            "edges": {
                "eA": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:caused_by"]
                }
            }
        }
    }
}

Our sub-queries to TRAPI KPs are 1-hop Predict "style" TRAPI queries with batches of IDs sent in each request.
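As a rough illustration of "batches of IDs sent in each request", here is a minimal Python sketch that builds a 1-hop query graph like the example above from a batch of IDs. The function names (`build_batched_subquery`, `chunk`) and the batch size are hypothetical and are not taken from BTE's code.

```python
from typing import Dict, Iterable, List


def build_batched_subquery(
    ids: Iterable[str],
    subject_category: str,
    object_category: str,
    predicate: str,
) -> Dict:
    """Build a 1-hop TRAPI query whose subject QNode carries a batch of IDs."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "n0": {"ids": list(ids), "categories": [subject_category]},
                    "n1": {"categories": [object_category]},
                },
                "edges": {
                    "eA": {
                        "subject": "n0",
                        "object": "n1",
                        "predicates": [predicate],
                    }
                },
            }
        }
    }


def chunk(ids: List[str], size: int) -> List[List[str]]:
    """Split a list of IDs into batches of at most `size` (hypothetical helper)."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


# One request body per batch of IDs.
batches = chunk(["MONDO:0005377", "MONDO:0000001", "MONDO:0000002"], 2)
queries = [
    build_batched_subquery(
        batch,
        "biolink:DiseaseOrPhenotypicFeature",
        "biolink:Gene",
        "biolink:caused_by",
    )
    for batch in batches
]
```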

We expect two kinds of results in the TRAPI-1.4 responses (mirrors the two scenarios of #603)...

no ID/node-expansion was involved

we expect 1 analysis object in result.analyses, so BTE can take that object's edge_bindings and they should be just like the result.edge_bindings in TRAPI 1.3
the edge(s?) there should be "flat", meaning they won't reference an auxiliary graph (aka they won't have an element in the attributes array where the attribute_type_id is "biolink:support_graphs")

example result

This is a "fake" expected result that isn't based on a real response from a TRAPI-1.4 KP...

            {
                "node_bindings": {
                    "n0": [
                        {
                            "id": "MONDO:0005377"
                        }
                    ],
                    "n1": [
                        {
                            "id": "NCBIGene:3315"
                        }
                    ]
                },
                "analyses": [
                    {
                        "resource_id": "infores:automat-biolink",
                        "edge_bindings": {
                            "eA": [
                                {
                                    "id": "54d9ed32bec4d12369592709e20c997f"
                                }
                            ]
                        },
                        "score": 0.8
                    }
                ]
            }
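The "no expansion" expectation above could be checked roughly like this. This is a Python sketch for illustration only: `edge_support_graphs` and `flat_edge_bindings` are hypothetical names, and treating an edge that is missing from the knowledge graph as "flat" is an assumption made to keep the example short.

```python
from typing import Dict, List

SUPPORT_GRAPHS = "biolink:support_graphs"


def edge_support_graphs(edge: Dict) -> List[str]:
    """Return the auxiliary-graph keys an edge references (empty list if flat)."""
    keys: List[str] = []
    for attribute in edge.get("attributes") or []:
        if attribute.get("attribute_type_id") == SUPPORT_GRAPHS:
            value = attribute.get("value") or []
            # Tolerate a single key given as a bare string.
            keys.extend([value] if isinstance(value, str) else value)
    return keys


def flat_edge_bindings(result: Dict, kg_edges: Dict[str, Dict]) -> Dict[str, List[Dict]]:
    """Return the single analysis's edge_bindings (like TRAPI 1.3 result.edge_bindings).

    Raises ValueError if there isn't exactly 1 analysis or a bound edge is not flat.
    Assumption: an edge ID absent from kg_edges is treated as flat.
    """
    analyses = result.get("analyses") or []
    if len(analyses) != 1:
        raise ValueError(f"expected 1 analysis, got {len(analyses)}")
    edge_bindings = analyses[0].get("edge_bindings") or {}
    for bindings in edge_bindings.values():
        for binding in bindings:
            if edge_support_graphs(kg_edges.get(binding["id"], {})):
                raise ValueError(f"edge {binding['id']} references an auxiliary graph")
    return edge_bindings
```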

ID/node-expansion was involved

READ THIS FIRST:

This is still being discussed by the TRAPI team / Translator
the info below is based on TRAPI 1.4.0-beta3 and the discussions Jackson and I had on this topic

Notes:

we still expect 1 analysis object in result.analyses. But the edge(s)? in that object's edge_bindings may reference an auxiliary graph. When this happens, we expect 1 element in the attributes array where the attribute_type_id is "biolink:support_graphs". The value of that Attribute object should be 1 or more keys for auxiliary-graphs...
we could decide to drop these results! (if we want data for descendant IDs, we'll include them in the batch of IDs we send and ideally get edges back with no auxiliary graph references)
UPDATED 2023-04-26 discussion: if we want to keep the edges, but ignore/remove the edge-attribute with the aux-graph (since we'll "drop" that), we may want to generate a warning-level log so we know this is happening.
If we want to process these edges with auxiliary graph references, we'll want to:
- get the referenced auxiliary-graph objects. We want those in the auxiliary_graphs section of our TRAPI response
  - implementation musing: may need to rename aux-graph key to keep it unique?
- get the edges listed in those auxiliary-graph objects. We want those in the knowledge_graph.edges section of our TRAPI response
- check if this set of edges reference auxiliary-graphs (will be in their attributes, same as before). If they do...repeat the two steps above (implementation musing: recursive behavior?)
when doing the next parts of query-execution, I imagine we'd use the main edge(s) from the result.analyses.edge_bindings. So we'd basically ignore the nested auxiliary-graphs and their edges...
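If we did process edges with auxiliary graph references, the "get the aux-graphs, get their edges, repeat" steps in the notes could look roughly like the sketch below. It is written in Python for illustration, and the names (`support_graph_keys`, `collect_support`) are hypothetical. It walks the references with an explicit stack instead of recursion and tracks what it has already seen, so a cycle of references cannot loop forever. It assumes each auxiliary-graph object has an `edges` list of knowledge-graph edge keys.

```python
from typing import Dict, Iterable, List, Set, Tuple

SUPPORT_GRAPHS = "biolink:support_graphs"


def support_graph_keys(edge: Dict) -> List[str]:
    """Return the auxiliary-graph keys listed in an edge's support_graphs attribute."""
    keys: List[str] = []
    for attribute in edge.get("attributes") or []:
        if attribute.get("attribute_type_id") == SUPPORT_GRAPHS:
            value = attribute.get("value") or []
            keys.extend([value] if isinstance(value, str) else value)
    return keys


def collect_support(
    edge_ids: Iterable[str],
    kg_edges: Dict[str, Dict],
    aux_graphs: Dict[str, Dict],
) -> Tuple[Set[str], Set[str]]:
    """Follow support_graphs references starting from the bound edges.

    Returns (auxiliary-graph keys to copy into auxiliary_graphs,
             edge IDs to copy into knowledge_graph.edges, starting edges included).
    """
    seen_graphs: Set[str] = set()
    seen_edges: Set[str] = set()
    stack = list(edge_ids)
    while stack:
        edge_id = stack.pop()
        if edge_id in seen_edges:
            continue
        seen_edges.add(edge_id)
        for key in support_graph_keys(kg_edges.get(edge_id, {})):
            if key in seen_graphs:
                continue
            seen_graphs.add(key)
            # Edges inside this auxiliary graph may reference further graphs.
            stack.extend(aux_graphs.get(key, {}).get("edges") or [])
    return seen_graphs, seen_edges
```

Renaming auxiliary-graph keys to keep them unique across KPs (the implementation musing above) is not covered by this sketch.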

Examples:

slides from phase 1: aux-graph/result.analyses refactor for basic querying #603 (ignore the second-hop aka QEdge eB)
slides from this post


colleenXu · 2023-04-12T06:00:40Z

Specifically interested in @tokebe's view of this idea of dropping results when the analyses.edge_binding edges reference aux-graphs...

we could decide to drop these results! (if we want data for descendant IDs, we'll include them in the batch of IDs we send and ideally get edges back with no auxiliary graph references)

colleenXu · 2023-04-13T20:01:49Z

Note that COHD's dev instance seems to be on TRAPI 1.4 (we can access it through the registration we currently use, but they also registered a separate yaml for TRAPI 1.4)

However, I haven't checked their /query responses to see if they are using the aux-graph/result.analyses as we expect, and whether we can use it to develop and test our code for this issue...

From my post here: #597 (comment)

don't deploy this until #614 is addressed aka BTE can handle TRAPI 1.4 KP responses other code may be needed as well

text-mining targeted and multiomics clinicaltrials. to support trapi 1.4 sources data ingest

colleenXu · 2023-05-11T06:42:18Z

Deleted the previous comment (oops? should have edited or hidden it instead?).

@tokebe and I agreed to adjust the API_LIST config file rather than use SmartAPI overrides, because the names of the APIs were different between registrations.

The adjustments are in this branch main...trapi1-4-overrides and include...

COHD has a second registration for TRAPI 1.4 instances (2023-05-19: created a new one with dev + CI)
Automat KPs have second registrations for TRAPI 1.4 instances (only dev right now). However, some tools that we previously used are missing or don't have TRAPI 1.4 instances. Those have been removed in the API_LIST config file for the main branch (TRAPI 1.3 instances) and this branch
2023-05-19: Connections Hypothesis Provider has a second registration for TRAPI 1.4 instances (only dev right now)

tokebe · 2023-05-11T20:18:06Z

Above linked PR currently drops all KP result edges that have support graphs, per 1-on-1 discussion with @colleenXu. This might change if there's a good case in which we'd want to keep support graphs.

Note that support graphs on the result analysis are also not kept, but not used as criteria for dropping a result edge (these support graphs would typically explain result scoring, which we also don't use from TRAPI KPs).
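The behavior described in this comment could be sketched as follows. This is not the linked PR's code: it is a Python illustration with hypothetical names (`has_support_graphs`, `kept_edge_ids`). It skips bound edges that carry a support_graphs attribute, emits a warning-level log when it does so (the idea from the 2023-04-26 discussion), and never reads the analysis-level support_graphs, so those have no effect on whether an edge is kept.

```python
import logging
from typing import Dict, List

logger = logging.getLogger("kp_response_sketch")
SUPPORT_GRAPHS = "biolink:support_graphs"


def has_support_graphs(edge: Dict) -> bool:
    """True if the edge has an attribute of type biolink:support_graphs."""
    return any(
        attribute.get("attribute_type_id") == SUPPORT_GRAPHS
        for attribute in edge.get("attributes") or []
    )


def kept_edge_ids(message: Dict) -> List[str]:
    """Return the bound KG edge IDs that survive the drop rule, in result order."""
    kg_edges = (message.get("knowledge_graph") or {}).get("edges") or {}
    kept: List[str] = []
    for result in message.get("results") or []:
        for analysis in result.get("analyses") or []:
            # analysis["support_graphs"] and analysis["score"] are deliberately ignored.
            for bindings in (analysis.get("edge_bindings") or {}).values():
                for binding in bindings:
                    edge = kg_edges.get(binding["id"])
                    if edge is None:
                        continue  # binding points at an edge missing from the KG
                    if has_support_graphs(edge):
                        logger.warning(
                            "dropping edge %s: it references support graphs",
                            binding["id"],
                        )
                        continue
                    kept.append(binding["id"])
    return kept
```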

colleenXu · 2023-05-31T05:29:25Z

Note that I'm not sure that CHP's TRAPI 1.4 instance is working (dev only; http://chp.thayer.dartmouth.edu/query). When querying it directly, I'm getting either an empty response or a malformed error response. Looks like BTE is handling this somewhat...but I dunno if it could handle it more gracefully / intelligently?

BTE log example:

        {
            "timestamp": "2023-05-30T20:10:35.710Z",
            "level": "ERROR",
            "message": "call-apis: Failed POST http://chp.thayer.dartmouth.edu (1 ID): Gene > expressed_in > GrossAnatomicalStructure: (TypeError: Cannot read properties of undefined (reading 'id'))",
            "code": null
        },

query 1

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["NCBIGene:672"],
                    "categories": ["biolink:Gene"]
                },
                "n1": {
                    "categories": [
                        "biolink:GrossAnatomicalStructure"
                    ]
                }
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:expressed_in"]
                }
            }
        }
    }
}

response to query 1: empty KG/results

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": [
                        "NCBIGene:672"
                    ],
                    "categories": [
                        "biolink:Gene"
                    ],
                    "is_set": false,
                    "constraints": []
                },
                "n1": {
                    "ids": null,
                    "categories": [
                        "biolink:GrossAnatomicalStructure"
                    ],
                    "is_set": false,
                    "constraints": []
                }
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1",
                    "knowledge_type": null,
                    "predicates": [
                        "biolink:expressed_in"
                    ],
                    "attribute_constraints": [],
                    "qualifier_constraints": []
                }
            }
        },
        "knowledge_graph": null,
        "results": [
            {
                "node_bindings": {
                    "n0": [],
                    "n1": []
                },
                "analyses": [
                    {
                        "resource_id": "infores:connections-hypothesis",
                        "edge_bindings": {
                            "e0": []
                        },
                        "score": null,
                        "support_graphs": null,
                        "scoring_method": null,
                        "attributes": null
                    }
                ]
            }
        ],
        "auxiliary_graphs": null
    },
    "logs": [
        {
            "timestamp": "2023-05-30T20:14:12.898845",
            "level": "INFO",
            "message": "Running message.",
            "code": null
        },
        {
            "timestamp": "2023-05-30T20:14:12.898853",
            "level": "INFO",
            "message": "Getting message templates.",
            "code": null
        },
        {
            "timestamp": "2023-05-30T20:14:12.898917",
            "level": "INFO",
            "message": "Checking template matches for gene_specificity",
            "code": null
        },
        {
            "timestamp": "2023-05-30T20:14:12.900685",
            "level": "INFO",
            "message": "Detected 1 matches for gene_specificity",
            "code": null
        },
        {
            "timestamp": "2023-05-30T20:14:12.900690",
            "level": "INFO",
            "message": "Constructing queries on matching templates",
            "code": null
        },
        {
            "timestamp": "2023-05-30T20:14:12.900972",
            "level": "INFO",
            "message": "Sending 1 consistent queries",
            "code": null
        },
        {
            "timestamp": "2023-05-30T20:14:12.905156",
            "level": "INFO",
            "message": "Wildcard detected",
            "code": null
        },
        {
            "timestamp": "2023-05-30T20:14:12.905312",
            "level": "INFO",
            "message": "Received responses from gene_specificity",
            "code": null
        }
    ],
    "trapi_version": "1.4",
    "biolink_version": "3.1.2",
    "status": "Success",
    "id": "2adeb7ba-6b70-429b-97b1-384f8e9c80f1",
    "workflow": [
        {
            "id": "lookup"
        }
    ]
}
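One possible way to handle a response like this more gracefully is to filter out results that have nothing usable before any transformation runs. The Python sketch below is an illustration under assumptions: `usable_results` is a hypothetical name, and it is only a guess that the empty bindings and null knowledge_graph are what trigger the TypeError in the BTE log above.

```python
from typing import Dict, List, Optional


def usable_results(response: Dict) -> List[Dict]:
    """Keep only results whose bindings are non-empty and resolve in the KG."""
    message = response.get("message") or {}
    knowledge_graph = message.get("knowledge_graph") or {}
    kg_nodes = knowledge_graph.get("nodes") or {}
    kg_edges = knowledge_graph.get("edges") or {}

    def bindings_ok(bindings_by_key: Optional[Dict], known: Dict) -> bool:
        # Every query-graph key needs at least one binding with a known ID.
        if not bindings_by_key:
            return False
        return all(
            bindings and all(b.get("id") in known for b in bindings)
            for bindings in bindings_by_key.values()
        )

    usable: List[Dict] = []
    for result in message.get("results") or []:
        if not bindings_ok(result.get("node_bindings"), kg_nodes):
            continue
        analyses = result.get("analyses") or []
        if any(bindings_ok(a.get("edge_bindings"), kg_edges) for a in analyses):
            usable.append(result)
    return usable
```

With the response above (null knowledge_graph, empty binding lists), this returns an empty list, which a caller could log as "no usable results" rather than failing.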

query 2

uses an ID they list in the example response of the /curies endpoint

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["ENSEMBL:ENSG00000106665"],
                    "categories": ["biolink:Gene"]
                },
                "n1": {
                    "categories": [
                        "biolink:GrossAnatomicalStructure"
                    ]
                }
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:expressed_in"]
                }
            }
        }
    }
}

query 2 response: malformed error

very long, only including snippets that seem useful

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/django/core/handlers/exception.py", line 55, in inner
    response = get_response(request)
  File "/usr/local/lib/python3.8/site-packages/django/core/handlers/base.py", line 197, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/local/lib/python3.8/site-packages/django/views/decorators/csrf.py", line 56, in wrapper_view
    return view_func(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/django/views/generic/base.py", line 104, in view
    return self.dispatch(request, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/rest_framework/views.py", line 509, in dispatch
    response = self.handle_exception(exc)
  File "/usr/local/lib/python3.8/site-packages/rest_framework/views.py", line 469, in handle_exception
    self.raise_uncaught_exception(exc)
  File "/usr/local/lib/python3.8/site-packages/rest_framework/views.py", line 480, in raise_uncaught_exception
    raise exc
  File "/usr/local/lib/python3.8/site-packages/rest_framework/views.py", line 506, in dispatch
    response = handler(request, *args, **kwargs)
  File "/home/chp_api/web/dispatcher/views.py", line 45, in post
    return dispatcher.get_response(message)
  File "/home/chp_api/web/dispatcher/base.py", line 167, in get_response
    responses = get_app_response_fn(consistent_app_queries, self.logger)
  File "/usr/local/lib/python3.8/site-packages/gene_specificity/app_interface.py", line 25, in get_response
    response = interface.get_response(consistent_query, logger)
  File "/usr/local/lib/python3.8/site-packages/gene_specificity/trapi_interface.py", line 142, in get_response
    self._add_results(message, subject_mapping, qg_subject_id, [curie], subject_category, predicate, qg_edge_id, object_mapping, qg_object_id, object_curies, object_category, vals)

Exception Type: TypeError at /query
Exception Value: _add_results() missing 2 required positional arguments: 'object_category' and 'vals'

	<div id="explanation">
		<p>
			You’re seeing this error because you have <code>DEBUG = True</code> in your
			Django settings file. Change that to <code>False</code>, and Django will
			display a standard page generated by the handler for this status code.
		</p>
	</div>

</body>

</html>

EDIT: I found a query that works. However, (a) BTE wouldn't send a sub-query like this (where the ID is the object) and (b) BTE may not be able to process the response (only 1 result that contains all 30 "answers", as if is_set: true was on the Gene QNode...)

query that works

This is the example given for their /query endpoint

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "categories": ["biolink:Gene"]
                },
                "n1": {
                    "ids": ["UBERON:0009835"],
                    "categories": ["biolink:GrossAnatomicalStructure"]
                }
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:expressed_in"]
                }
            }
        }
    }
}

query response:
response2.txt

tokebe · 2023-08-03T17:11:19Z

Marking this one as done -- we'll treat the above as a new issue (tracked in #685)

colleenXu · 2023-08-03T18:07:39Z

Note that other tools in Translator aren't doing subclassing w/ aux-graphs right now (it's an after-Sept goal). So...we'll open a new issue if we notice any issues processing their KP responses or we want to change our behavior...

colleenXu added the trapi 1.4 label Apr 12, 2023

colleenXu changed the title ~~phase 2: processing TRAPI-1.4 KP sub-query responses~~ phase 2: processing TRAPI-1.4 KP sub-query responses (aux-graph/result.analyses refactor) Apr 12, 2023

colleenXu changed the title ~~phase 2: processing TRAPI-1.4 KP sub-query responses (aux-graph/result.analyses refactor)~~ phase 1: processing TRAPI-1.4 KP sub-query responses (aux-graph/result.analyses refactor) Apr 12, 2023

colleenXu mentioned this issue Apr 12, 2023

overview and management of TRAPI 1.4 features #613

Closed

15 tasks

colleenXu mentioned this issue Apr 12, 2023

phase 1: provenance refactor for edges from some multiomics KPs, text-mining KP, TRAPI KPs #617

Closed

colleenXu mentioned this issue May 10, 2023

text_mining_targeted_association data and parser update biothings/pending.api#110

Closed

colleenXu added a commit that referenced this issue May 11, 2023

fix: trapi kp overrides

67aacb5

don't deploy this until #614 is addressed aka BTE can handle TRAPI 1.4 KP responses other code may be needed as well

colleenXu referenced this issue May 11, 2023

fix: trapi 1.4 overrides

fb1a2c7

text-mining targeted and multiomics clinicaltrials. to support trapi 1.4 sources data ingest

colleenXu referenced this issue May 11, 2023

fix: only ingest trapi 1.4 kps during cron job

74af72c

tokebe mentioned this issue May 11, 2023

Handle TRAPI 1.4 response format from TRAPI KPs biothings/api-respone-transform.js#50

Merged

tokebe mentioned this issue Aug 3, 2023

BTE failing some subqueries with code error #685

Closed

tokebe closed this as completed Aug 3, 2023
