whole-system-search: use global parameter tables in whole system search join #1922
Signed-off-by: Mike Schroeder <mschroed@us.ibm.com>
Issue #1922 - implement whole-system search in new query builder
Issue #1922 - don't run reindex tests during search tests
There are four different scenarios for whole-system search, each of which will generate different SQL:
Global search parameters are search parameters defined for all resources, and for which we have system-wide tables in which to store the indexed values. We currently support the following global search parameters:

For scenarios 1 and 2, where only global search parameters are specified, we'll execute a "filter" query to get the list of logical resource IDs which match the search criteria, and then we'll execute a "data" query to get a page's worth of resource data, which will be returned to the caller. Here's a filter query for scenario 1:
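As a rough illustration of the "filter" step only (not the SQL the server actually generates), the sketch below runs a filter query against an in-memory SQLite database. The table names, column names, and the use of `_lastUpdated` as the global search parameter are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified schema; the real tables and columns may differ.
    CREATE TABLE logical_resources (
        logical_resource_id INTEGER PRIMARY KEY,
        resource_type       TEXT NOT NULL
    );
    -- System-wide values table holding indexed values for a global parameter.
    CREATE TABLE date_values (
        logical_resource_id INTEGER NOT NULL,
        parameter_name      TEXT NOT NULL,
        date_value          TEXT NOT NULL
    );
    INSERT INTO logical_resources VALUES
        (1, 'Patient'), (2, 'Claim'), (3, 'Observation');
    INSERT INTO date_values VALUES
        (1, '_lastUpdated', '2021-01-10'),
        (2, '_lastUpdated', '2021-02-15'),
        (3, '_lastUpdated', '2020-12-01');
""")

# The filter query touches only system-wide tables, so no per-type UNION is needed.
FILTER_SQL = """
    SELECT lr.logical_resource_id, lr.resource_type
      FROM logical_resources lr
     WHERE EXISTS (SELECT 1
                     FROM date_values dv
                    WHERE dv.logical_resource_id = lr.logical_resource_id
                      AND dv.parameter_name = '_lastUpdated'
                      AND dv.date_value >= ?)
     ORDER BY lr.logical_resource_id
     LIMIT ? OFFSET ?
"""


def filter_page(conn, since, page_size, offset=0):
    """Return one page of (logical_resource_id, resource_type) matches."""
    return conn.execute(FILTER_SQL, (since, page_size, offset)).fetchall()


matches = filter_page(conn, "2021-01-01", page_size=10)
print(matches)
```

The result carries the resource type alongside each ID so that a later data query can decide which type-specific tables to read.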
For scenario 2, the filter query is the same as above, only with an additional where clause to select only the resource types specified on the
In both scenarios 1 and 2, the data query will be a UNION of type-specific resource tables based on the resource types of the logical resource IDs returned by the filter query. Since we're only getting one page worth of data, this will typically be a small number of resource types. Here's an example:
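A minimal sketch of how such a data query might be assembled, assuming hypothetical type-specific tables named `<type>_resources` with a `data` column. It builds one UNION branch per resource type present in the filter result, so the number of branches follows the page contents rather than the full list of resource types.

```python
import sqlite3

# Hypothetical mapping from resource type to its type-specific resource table.
RESOURCE_TABLES = {
    "Patient": "patient_resources",
    "Claim": "claim_resources",
    "Observation": "observation_resources",
}


def fetch_page_data(conn, filter_rows):
    """Fetch payloads for the (logical_resource_id, resource_type) rows
    returned by a filter query, using one UNION branch per type present."""
    ids_by_type = {}
    for resource_id, resource_type in filter_rows:
        ids_by_type.setdefault(resource_type, []).append(resource_id)
    if not ids_by_type:
        return []

    parts, params = [], []
    for resource_type, ids in ids_by_type.items():
        # Table names come from a fixed whitelist, never from raw input.
        table = RESOURCE_TABLES[resource_type]
        marks = ", ".join("?" for _ in ids)
        parts.append(
            f"SELECT ? AS resource_type, logical_resource_id, data "
            f"FROM {table} WHERE logical_resource_id IN ({marks})"
        )
        params.append(resource_type)
        params.extend(ids)

    sql = " UNION ALL ".join(parts) + " ORDER BY logical_resource_id"
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
for table in RESOURCE_TABLES.values():
    conn.execute(
        f"CREATE TABLE {table} (logical_resource_id INTEGER PRIMARY KEY, data TEXT)"
    )
conn.execute("INSERT INTO patient_resources VALUES (1, 'patient-1')")
conn.execute("INSERT INTO claim_resources VALUES (2, 'claim-2')")
conn.execute("INSERT INTO observation_resources VALUES (3, 'observation-3')")

# Pretend the filter query returned a page containing two resource types.
page = fetch_page_data(conn, [(1, "Patient"), (2, "Claim")])
print(page)
```

Here the page contains only Patient and Claim rows, so the Observation table is never read.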
For scenarios 3 and 4, we'll build one large UNION'd query to get the data, like we do today with the old query builder. We have to do it that way since the search parameter index values will be in the resource type-specific values tables rather than system-wide values tables. Here is an example query for scenario 4:
For scenario 3, the query would be the same, except there would be one UNION'd sub-query for every supported resource type.
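The difference between scenarios 3 and 4 can be sketched as a single query builder that takes an optional list of resource types. Everything here is an assumption for illustration: the `<type>_resources` and `<type>_str_values` tables, the three-entry `SUPPORTED_TYPES` stand-in, and the `_profile` parameter used in the usage lines (which is only an example value, not a claim about which parameters are stored per type).

```python
import sqlite3

# Stand-in for the full list of supported resource types.
SUPPORTED_TYPES = ("Claim", "Observation", "Patient")


def build_union_query(param_name, value, types=None):
    """Build one UNION'd query with a sub-query per resource type.

    types=None models scenario 3 (all supported types);
    an explicit list models scenario 4.
    """
    selected = SUPPORTED_TYPES if types is None else tuple(types)
    if not selected:
        raise ValueError("at least one resource type is required")
    parts, params = [], []
    for resource_type in selected:
        if resource_type not in SUPPORTED_TYPES:
            raise ValueError(f"unsupported resource type: {resource_type}")
        prefix = resource_type.lower()
        # Index values live in the type-specific values table, so each
        # resource type needs its own sub-query.
        parts.append(
            f"SELECT '{resource_type}' AS resource_type, "
            f"r.logical_resource_id, r.data "
            f"FROM {prefix}_resources r "
            f"WHERE EXISTS (SELECT 1 FROM {prefix}_str_values v "
            f"WHERE v.logical_resource_id = r.logical_resource_id "
            f"AND v.parameter_name = ? AND v.str_value = ?)"
        )
        params += [param_name, value]
    return " UNION ALL ".join(parts) + " ORDER BY 1, 2", params


conn = sqlite3.connect(":memory:")
for resource_type in SUPPORTED_TYPES:
    p = resource_type.lower()
    conn.execute(
        f"CREATE TABLE {p}_resources (logical_resource_id INTEGER PRIMARY KEY, data TEXT)"
    )
    conn.execute(
        f"CREATE TABLE {p}_str_values "
        f"(logical_resource_id INTEGER, parameter_name TEXT, str_value TEXT)"
    )
conn.execute("INSERT INTO patient_resources VALUES (1, 'patient-1')")
conn.execute("INSERT INTO patient_str_values VALUES (1, '_profile', 'http://example.com/p')")
conn.execute("INSERT INTO claim_resources VALUES (2, 'claim-2')")
conn.execute("INSERT INTO claim_str_values VALUES (2, '_profile', 'http://example.com/p')")
conn.execute("INSERT INTO observation_resources VALUES (3, 'observation-3')")
conn.execute("INSERT INTO observation_str_values VALUES (3, '_profile', 'http://example.com/other')")

# Scenario 4: only the requested types get a sub-query.
sql4, params4 = build_union_query(
    "_profile", "http://example.com/p", types=["Patient", "Observation"]
)
scenario4 = conn.execute(sql4, params4).fetchall()

# Scenario 3: every supported type gets a sub-query.
sql3, params3 = build_union_query("_profile", "http://example.com/p")
scenario3 = conn.execute(sql3, params3).fetchall()
```

With the full set of resource types, the scenario 3 statement grows to one branch per type, which is why it is reserved for searches that cannot be answered from the system-wide tables.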
Ran the search bucket against this latest code as suggested. Some searches that used to be supported now hit new limitations. I'm not sure whether that was intentional and suspect a bug, so I'm moving this back to in progress.
now gets
My mistake, I ran an outdated search bucket. I reran the current search bucket, where those statements had been updated, and the tests pass now. Closing issue.
Related to #264. Once resource ingestion populates the global parameter tables (parameter tables which are tied directly to the global LOGICAL_RESOURCES table), the whole-system search query can be rewritten to greatly simplify the join.
The best implementation to retrieve the data payload for each resource type needs to be investigated. Options include a WITH+UNION or perhaps an outer join with a large coalesce statement. To simplify the join as much as possible, it may be necessary to hold the payload at the global layer, which would help to align the design with the ability to offload the payload to external storage. These options should be evaluated for their impact on performance before any changes are made to the search query build code.
For example, the following structure may perform reasonably well:
However, this form requires the common table expression (CTE) result to be scanned multiple times, even though it is evaluated only once. The resource_type = 'Claim' predicate in the outer join helps to avoid any logical reads for rows not matching the particular resource type for that sub-query, which should keep CPU usage lower.
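One possible shape for the WITH+UNION option is sketched below, run against SQLite for convenience. The schema, the `driver` CTE name, the `_lastUpdated` predicate, and the join style are assumptions; the point is only that the search runs once in the CTE and each branch restricts the CTE rows by resource type before touching its type-specific table.

```python
import sqlite3

TYPES = ("Claim", "Observation", "Patient")  # stand-in for the full type list


def build_cte_query(types):
    """Build a WITH+UNION query: one search CTE, one branch per type."""
    branches = []
    for t in types:
        if t not in TYPES:
            raise ValueError(f"unsupported resource type: {t}")
        branches.append(
            f"SELECT d.resource_type, d.logical_resource_id, r.data "
            f"FROM driver d "
            f"JOIN {t.lower()}_resources r "
            f"ON r.logical_resource_id = d.logical_resource_id "
            # The type predicate skips CTE rows that cannot match this table.
            f"WHERE d.resource_type = '{t}'"
        )
    return (
        "WITH driver AS ("
        " SELECT lr.logical_resource_id, lr.resource_type"
        " FROM logical_resources lr"
        " JOIN date_values dv ON dv.logical_resource_id = lr.logical_resource_id"
        " WHERE dv.parameter_name = '_lastUpdated' AND dv.date_value >= ?"
        ") " + " UNION ALL ".join(branches) + " ORDER BY 2"
    )


conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified schema.
    CREATE TABLE logical_resources (logical_resource_id INTEGER PRIMARY KEY, resource_type TEXT);
    CREATE TABLE date_values (logical_resource_id INTEGER, parameter_name TEXT, date_value TEXT);
    CREATE TABLE claim_resources (logical_resource_id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE observation_resources (logical_resource_id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE patient_resources (logical_resource_id INTEGER PRIMARY KEY, data TEXT);
    INSERT INTO logical_resources VALUES (1, 'Patient'), (2, 'Claim'), (3, 'Observation');
    INSERT INTO date_values VALUES
        (1, '_lastUpdated', '2021-01-10'),
        (2, '_lastUpdated', '2021-02-15'),
        (3, '_lastUpdated', '2020-12-01');
    INSERT INTO patient_resources VALUES (1, 'patient-1');
    INSERT INTO claim_resources VALUES (2, 'claim-2');
    INSERT INTO observation_resources VALUES (3, 'observation-3');
""")

cte_sql = build_cte_query(TYPES)
cte_rows = conn.execute(cte_sql, ("2021-01-01",)).fetchall()
print(cte_rows)
```

How often the CTE is materialized or rescanned depends on the database engine, so the performance characteristics described above would need to be measured on the target database rather than inferred from this sketch.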
An alternative is to outer-join each of the resource-specific tables to the main whole-system-search subquery:
The downside of this is that we make one logical read on every resource-specific logical-resources table for every matching row from the search (the driver sub-query). There are 139 resource types in the FHIR R4 model, requiring 139 logical reads per matching row (only one of which will actually match).
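A minimal sketch of the outer-join alternative, again with an assumed schema and an assumed `_lastUpdated` search predicate. Every type-specific table is LEFT OUTER JOINed to the driver sub-query and the payload is picked with COALESCE. The small `estimated_probes` helper just restates the arithmetic above: one probe per type-specific table per matching row.

```python
import sqlite3

TYPES = ("Claim", "Observation", "Patient")  # stand-in for the full type list


def build_outer_join_query(types):
    """Outer-join each type-specific table to the driver sub-query."""
    if len(types) < 2:
        raise ValueError("COALESCE needs at least two payload columns")
    joins, payloads = [], []
    for i, t in enumerate(types):
        if t not in TYPES:
            raise ValueError(f"unsupported resource type: {t}")
        alias = f"r{i}"
        joins.append(
            f"LEFT OUTER JOIN {t.lower()}_resources {alias} "
            f"ON {alias}.logical_resource_id = d.logical_resource_id"
        )
        payloads.append(f"{alias}.data")
    return (
        "SELECT d.resource_type, d.logical_resource_id, "
        f"COALESCE({', '.join(payloads)}) AS data "
        "FROM (SELECT lr.logical_resource_id, lr.resource_type"
        " FROM logical_resources lr"
        " JOIN date_values dv ON dv.logical_resource_id = lr.logical_resource_id"
        " WHERE dv.parameter_name = '_lastUpdated' AND dv.date_value >= ?) d "
        + " ".join(joins)
        + " ORDER BY d.logical_resource_id"
    )


def estimated_probes(matching_rows, type_count):
    """One probe per type-specific table for every driver row."""
    return matching_rows * type_count


conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified schema; IDs are unique across all types.
    CREATE TABLE logical_resources (logical_resource_id INTEGER PRIMARY KEY, resource_type TEXT);
    CREATE TABLE date_values (logical_resource_id INTEGER, parameter_name TEXT, date_value TEXT);
    CREATE TABLE claim_resources (logical_resource_id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE observation_resources (logical_resource_id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE patient_resources (logical_resource_id INTEGER PRIMARY KEY, data TEXT);
    INSERT INTO logical_resources VALUES (1, 'Patient'), (2, 'Claim'), (3, 'Observation');
    INSERT INTO date_values VALUES
        (1, '_lastUpdated', '2021-01-10'),
        (2, '_lastUpdated', '2021-02-15'),
        (3, '_lastUpdated', '2020-12-01');
    INSERT INTO patient_resources VALUES (1, 'patient-1');
    INSERT INTO claim_resources VALUES (2, 'claim-2');
    INSERT INTO observation_resources VALUES (3, 'observation-3');
""")

join_sql = build_outer_join_query(TYPES)
join_rows = conn.execute(join_sql, ("2021-01-01",)).fetchall()
print(join_rows)
print(estimated_probes(matching_rows=len(join_rows), type_count=139))
```

Because logical resource IDs are assumed to be unique across types, at most one joined table contributes a non-null payload per row, and COALESCE returns it.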