[dataquery] Define new swagger schema for data query API #8219

driusan · 2022-11-07T16:47:56Z

This defines the schema I've been using to design a QueryEngine (rather than CouchDB) based data query tool.

The meanings of the 3 types of queries in the /queries endpoints are:

recent: There are no more "saved queries". Instead, all recent queries are "saved" and can be reloaded. Users can star or name a query to mark it as important so that it shouldn't be purged, which duplicates the functionality of the old saved queries. (Currently, no queries are ever purged, but if we do change that in the future, starred ones are the equivalent of saved ones.)
shared: Shared queries, similar to existing shared queries. These queries are public for everyone.
topqueries: Admin queries that were pinned to the top of the DQT. Similar to shared queries, but pinned by an admin who thinks they're important.

I believe the rest should be explained by the swagger schema descriptions.

modules/dataquery/static/schema.yml

driusan · 2022-12-13T16:45:36Z

I'm planning on sending a PR with the PHP-side implementation of this to make it easier to test, but it's going to be blocked by #8260 and #8216 (or at least one other module implementing the interface).

This defines the schema I've been using to design a QueryEngine (rather than CouchDB) based data query tool.

modules/dataquery/static/schema.yml

xlecours · 2022-12-14T16:31:09Z

Following my comments, I think the recent, shared and top "tags" are concerns of the client and should not dictate the shape of the response body. /queries should return all the queries accessible by the requesting user.

That said, Shared, Starred and potentially Pinned are properties that the backend should include in the responses.
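
To illustrate the idea, a client could derive its own groupings from per-query flags. This is only a minimal sketch: the field names `Shared`, `Starred`, `Pinned` and `QueryID` are assumptions here and may not match what ends up in schema.yml.

```python
from typing import Dict, List

# Hypothetical flag names; the real schema may name these differently.
FLAGS = ("Shared", "Starred", "Pinned")


def group_by_flag(queries: List[dict]) -> Dict[str, List[int]]:
    """Return the QueryIDs carrying each flag; a query may appear under several."""
    groups: Dict[str, List[int]] = {flag: [] for flag in FLAGS}
    for query in queries:
        for flag in FLAGS:
            # A missing flag is treated as False.
            if query.get(flag, False):
                groups[flag].append(query["QueryID"])
    return groups


queries = [
    {"QueryID": 1, "Starred": True},
    {"QueryID": 2, "Shared": True, "Pinned": True},
    {"QueryID": 3},
]
groups = group_by_flag(queries)
```

With this shape the backend returns a flat list and the client decides which tabs or sections to show.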

driusan · 2022-12-16T15:38:26Z

@xlecours I think I've incorporated all the changes you requested but haven't had a chance to update the backend yet.

I didn't change the wording of the endpoint "queries" to say "all" because that might not scale if we get to a point where the recent queries displayed need to be limited.

modules/dataquery/static/schema.yml

xlecours · 2022-12-16T16:36:50Z

modules/dataquery/static/schema.yml

+        style: simple
+        schema:
+          type: integer  
+    get:


I think POST would make more sense. This creates a queryRun, so I don't think it is safe (in the RFC 2616 sense of "safe").

Returning 201 Created with the Location header set to /queries/{QueryID}/run/{QueryRunID} would be good as well.

How do you denote "Location header" in swagger?

Actually, that can't be done right now because that endpoint is not implemented, so it would be impossible to get results. Once it is implemented, the schema could be updated.

'201':
  description: The visit was created successfully
  headers:
    Content-Location:
      description: The URL of the new visit.
      style: simple
      explode: false
      schema:
        type: string
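
For what the handler side of that could look like once the run endpoint exists, here is a rough sketch. The function name and the way IDs are passed in are hypothetical; only the 201 status and the /queries/{QueryID}/run/{QueryRunID} path come from the suggestion above.

```python
from typing import Dict, Tuple


def created_run_response(query_id: int, run_id: int) -> Tuple[int, Dict[str, str]]:
    """Build the status code and headers for a newly created query run."""
    # Path layout follows the suggestion in this comment; the endpoint
    # itself is not implemented yet, so this is an assumption.
    location = f"/queries/{query_id}/run/{run_id}"
    return 201, {"Location": location}


status, headers = created_run_response(42, 7)
```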

xlecours · 2022-12-16T16:47:09Z

I didn't change the wording of the endpoint "queries" to say "all" because that might not scale if we get to a point where the recent queries displayed need to be limited.

I suggest you add a /queryRun endpoint to move the runs out of /queries. If there are too many runs, implementing pagination would then be an option.

GET /queries
{
  "Queries": []
}

GET /queryRun
{
  "QueryRun": []
}
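
If pagination is ever needed for /queryRun, it could be as simple as the sketch below. The `offset`, `limit` and `NextOffset` names are made up for illustration and are not part of the proposed schema.

```python
from typing import List, Optional


def paginate_runs(runs: List[dict], offset: int = 0, limit: int = 2) -> dict:
    """Return one page of runs plus the offset of the next page (None when done)."""
    page = runs[offset:offset + limit]
    next_offset: Optional[int] = offset + limit if offset + limit < len(runs) else None
    return {"QueryRun": page, "NextOffset": next_offset}


all_runs = [{"QueryRunID": i} for i in range(1, 6)]  # five runs
first_page = paginate_runs(all_runs)
last_page = paginate_runs(all_runs, offset=4)
```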

This describes what is POSTed and also goes directly into a subfield of a Query object, not directly into the query.

driusan · 2022-12-16T18:51:24Z

I don't think a QueryRuns endpoint would solve anything related to performance. Checking the access is expensive because it needs to go through every field and module of the query's fields and criteria. Sending it over the wire is cheap. A separate endpoint (or pagination) would just add overhead to loading the list.

xlecours · 2022-12-16T20:04:04Z

I don't think a QueryRuns endpoint would solve anything related to performance.

It would remove the burden of loading the runs along with the queries when the client requests the list of existing queries.

driusan · 2022-12-16T20:27:46Z

Sending query runs isn't a burden. It's sending a small amount of data directly from the database over the wire. Checking if a query is accessible by a user is slow, however, so, "all queries accessible by the user" is much slower than "queries which there is some reason to send to the user". The performance hit comes from the access check for queries, not the inclusion of a query run list.

xlecours · 2022-12-16T20:38:01Z

I don't know about the implementation.
Getting the queries and the queryRun must take A + B microseconds. Getting only the queries can't take longer than getting both.

driusan · 2022-12-16T20:58:15Z

Getting all queries accessible by the user takes n*accesschecktime seconds. There is no reason to include queries other users have run in the queries list just because the requesting user happens to have access to everything used in the query. That increases it to n*accesschecktime*numusers (roughly) seconds. The user only needs a subset of queries: the ones they've run, the ones that have been shared because they're interesting, the ones that an admin pinned, etc., not all queries on the server. Including all accessible queries makes the response unusably slow. Including the times that they've been run has negligible impact on the performance.
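
To put rough numbers on that, here is the same arithmetic as a small sketch. The figures are purely illustrative, not measurements from any real instance.

```python
def access_check_cost_ms(num_queries: int, access_check_ms: int) -> int:
    """Total time spent on access checks when each candidate query is checked once."""
    return num_queries * access_check_ms


# Illustrative values only.
ACCESS_CHECK_MS = 50   # cost of one access check
QUERIES_PER_USER = 20  # n: queries a single user has run
NUM_USERS = 100

# Only the requesting user's own queries are candidates.
own_only_ms = access_check_cost_ms(QUERIES_PER_USER, ACCESS_CHECK_MS)

# Every query run by any user is a candidate ("all accessible queries").
all_users_ms = access_check_cost_ms(QUERIES_PER_USER * NUM_USERS, ACCESS_CHECK_MS)
```

Under these assumptions the cost grows by a factor of numusers, while the run list adds no access checks at all.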

xlecours · 2022-12-16T21:59:00Z

Ok I get this. But I don't see how including queryRun in the results of /queries would solve any of that.

driusan · 2022-12-16T22:06:46Z

It doesn't. Leaving the description "Get a list of a recent, shared, and study (top) queries to display." instead of "All queries accessible to the user" solves that.

Thinking about it a little more, I see how pagination could be useful for run queries. I'll separate them out on Monday.

driusan · 2022-12-19T15:39:59Z

@xlecours the queries and runs are split into two different endpoints now.

xlecours

Good!
The only thing I wasn't sure about is QueryRun having a link to the Query. At this point, it will be a suggestion, not a change request.
I am eager to see the implementation.

driusan · 2022-12-19T16:02:47Z

Added QueryURI to the QueryRun object.
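
As a sketch of how a client might use that link, runs can be grouped back under their query without a second lookup. The `QueryRunID` field and the /queries/{QueryID} form of the URI are assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, List


def runs_by_query(runs: List[dict]) -> Dict[str, List[int]]:
    """Group QueryRunIDs under the QueryURI each run points to."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for run in runs:
        grouped[run["QueryURI"]].append(run["QueryRunID"])
    return dict(grouped)


# Hypothetical payload shape for illustration.
runs = [
    {"QueryRunID": 10, "QueryURI": "/queries/1"},
    {"QueryRunID": 11, "QueryURI": "/queries/2"},
    {"QueryRunID": 12, "QueryURI": "/queries/1"},
]
grouped = runs_by_query(runs)
```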

driusan · 2022-12-19T16:16:54Z

@xlecours the implementation is already in #8268; I just updated it to the latest changes.

driusan · 2022-12-19T16:19:24Z

@maltheism you reviewed an earlier draft. Are you okay with this version of the schema?

driusan added the Area: API PR or issue related to the API label Nov 7, 2022

maltheism reviewed Nov 21, 2022

laemtl assigned xlecours Dec 13, 2022

[dataquery] Define new swagger schema for data query API

cb186bd

This defines the schema I've been using to design a QueryEngine (rather than CouchDB) based data query tool.

driusan force-pushed the DQTSwagger branch from d48f379 to cb186bd Compare December 13, 2022 20:38

driusan mentioned this pull request Dec 13, 2022

[DataQuery] New DQT backend implementation #8268

Merged

xlecours requested changes Dec 14, 2022

Update schema after Xaviers review

4f6401a

xlecours reviewed Dec 16, 2022

xlecours reviewed Dec 16, 2022

driusan added 2 commits December 16, 2022 13:36

Create QueryObject

bdda21b

This describes what is POSTed and also goes directly into a subfield of a Query object, not directly into the query.

Change to POST, add QueryRunID to query run

b742752

Split queries and queryruns into two different endpoints

911bd8c

xlecours approved these changes Dec 19, 2022

Add QueryURI to QueryRun

ec75e95

driusan closed this Dec 19, 2022

driusan reopened this Dec 19, 2022

maltheism approved these changes Dec 19, 2022

driusan merged commit 019db73 into aces:main Dec 19, 2022

ridz1208 added this to the 25.0.0 milestone Mar 6, 2023
