Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate / improve performance on specific Provider Directory query #1871

Closed
lmsurpre opened this issue Jan 18, 2021 · 3 comments
Closed
Labels
cms-interop This issue is associated with the CMS interoperability rule P2 Priority 2 - Should Have performance performance triage-performance triage-search

Comments

@lmsurpre
Copy link
Member

lmsurpre commented Jan 18, 2021

The HL7 DaVinci PlanNet reference client has a page that sequentially retrieves each page of results for this complicated query:

https://cluster1-573846-250babbbe4c3000e15508cd07c1d282b-0000.us-east.containers.appdomain.cloud/open/Location?_count=10&_has:OrganizationAffiliation:location:network=Organization/plannet-network-HPID010000&_has:OrganizationAffiliation:location:role=pharmacy&_revinclude=OrganizationAffiliation:location:Location

We are doing two reverse chaining searches in the same query. OrgAffiliation.paramA=A and OrgAffiliation.paramB=B. Unfortunately, these are completely independent criteria. So, our latest code retrieves all of the candidates, filters them down to OrganizationAffiliation that satisfy both criteria, and then identifies the associated locations. It is slow. Makes a strong case for downloading the entire directory in FHIR Bulk, loading it into a proper database, and querying against that. Basically, FHIR query interface has its limits. See the code around https://github.com/HL7-DaVinci/plan-net_client/blob/fixPharmacyQuery/app/controllers/pharmacies_controller.rb#L111-L129

The system currently has the planNet sample data loaded, which is only about:

  • 158 OrganizationAffiliations
  • 795 Locations

But the count query is taking over 2 seconds:

Successfully retrieved count. SQL=SELECT COUNT(DISTINCT RESOURCE_ID) FROM (SELECT R.RESOURCE_ID, R.LOGICAL_RESOURCE_ID, R.VERSION_ID, R.LAST_UPDATED, R.IS_DELETED, R.DATA, LR.LOGICAL_ID  FROM Location_LOGICAL_RESOURCES LR  JOIN Location_RESOURCES R ON R.RESOURCE_ID = LR.CURRENT_RESOURCE_ID  AND R.IS_DELETED = 'N'  JOIN ( SELECT CLR0.LOGICAL_ID FROM Location_LOGICAL_RESOURCES AS CLR0 JOIN Location_RESOURCES AS CR0 ON CR0.RESOURCE_ID = CLR0.CURRENT_RESOURCE_ID AND CR0.IS_DELETED = 'N' WHERE  EXISTS (SELECT 1 FROM OrganizationAffiliation_TOKEN_VALUES_V AS CP1 JOIN OrganizationAffiliation_LOGICAL_RESOURCES AS CLR1 ON CLR1.LOGICAL_RESOURCE_ID = CP1.LOGICAL_RESOURCE_ID JOIN OrganizationAffiliation_RESOURCES AS CR1 ON CR1.RESOURCE_ID = CLR1.CURRENT_RESOURCE_ID AND CR1.IS_DELETED = 'N' JOIN OrganizationAffiliation_TOKEN_VALUES_V  AS CP2 ON CP2.LOGICAL_RESOURCE_ID = CLR1.LOGICAL_RESOURCE_ID AND (CP2.PARAMETER_NAME_ID = 1249 AND (CP2.TOKEN_VALUE = ? OR CP2.TOKEN_VALUE = ? OR CP2.TOKEN_VALUE = ? AND CP2.CODE_SYSTEM_ID = 45060)) WHERE CP1.TOKEN_VALUE = CLR0.LOGICAL_ID AND CP1.PARAMETER_NAME_ID = 1224 AND CP1.CODE_SYSTEM_ID = 42073)) AS param0 ON LR.LOGICAL_ID = param0.LOGICAL_ID JOIN ( SELECT CLR0.LOGICAL_ID FROM Location_LOGICAL_RESOURCES AS CLR0 JOIN Location_RESOURCES AS CR0 ON CR0.RESOURCE_ID = CLR0.CURRENT_RESOURCE_ID AND CR0.IS_DELETED = 'N' WHERE  EXISTS (SELECT 1 FROM OrganizationAffiliation_TOKEN_VALUES_V AS CP1 JOIN OrganizationAffiliation_LOGICAL_RESOURCES AS CLR1 ON CLR1.LOGICAL_RESOURCE_ID = CP1.LOGICAL_RESOURCE_ID JOIN OrganizationAffiliation_RESOURCES AS CR1 ON CR1.RESOURCE_ID = CLR1.CURRENT_RESOURCE_ID AND CR1.IS_DELETED = 'N' JOIN OrganizationAffiliation_TOKEN_VALUES_V  AS CP2 ON CP2.LOGICAL_RESOURCE_ID = CLR1.LOGICAL_RESOURCE_ID AND (CP2.PARAMETER_NAME_ID = 1355 AND ((CP2.TOKEN_VALUE = ?))) WHERE CP1.TOKEN_VALUE = CLR0.LOGICAL_ID AND CP1.PARAMETER_NAME_ID = 1224 AND CP1.CODE_SYSTEM_ID = 42073)) AS param1 ON LR.LOGICAL_ID = param1.LOGICAL_ID) COMBINED_RESULTS
  searchArgs=[https://cluster1-573846-250babbbe4c3000e15508cd07c1d282b-0000.us-east.containers.appdomain.cloud/Organization/plannet-network-HPID010000, plannet-network-HPID010000, plannet-network-HPID010000, pharmacy]
  count=419 executionTime=2781.096929ms

And the retrieval is more that 5:

Successfully retrieved FHIR objects. SQL=SELECT RESOURCE_ID, LOGICAL_RESOURCE_ID, VERSION_ID, LAST_UPDATED, IS_DELETED, DATA, LOGICAL_ID FROM (SELECT RESOURCE_ID, LOGICAL_RESOURCE_ID, VERSION_ID, LAST_UPDATED, IS_DELETED, DATA, LOGICAL_ID FROM (SELECT R.RESOURCE_ID, R.LOGICAL_RESOURCE_ID, R.VERSION_ID, R.LAST_UPDATED, R.IS_DELETED, R.DATA, LR.LOGICAL_ID  FROM Location_LOGICAL_RESOURCES LR  JOIN Location_RESOURCES R ON R.RESOURCE_ID = LR.CURRENT_RESOURCE_ID  AND R.IS_DELETED = 'N'  JOIN ( SELECT CLR0.LOGICAL_ID FROM Location_LOGICAL_RESOURCES AS CLR0 JOIN Location_RESOURCES AS CR0 ON CR0.RESOURCE_ID = CLR0.CURRENT_RESOURCE_ID AND CR0.IS_DELETED = 'N' WHERE  EXISTS (SELECT 1 FROM OrganizationAffiliation_TOKEN_VALUES_V AS CP1 JOIN OrganizationAffiliation_LOGICAL_RESOURCES AS CLR1 ON CLR1.LOGICAL_RESOURCE_ID = CP1.LOGICAL_RESOURCE_ID JOIN OrganizationAffiliation_RESOURCES AS CR1 ON CR1.RESOURCE_ID = CLR1.CURRENT_RESOURCE_ID AND CR1.IS_DELETED = 'N' JOIN OrganizationAffiliation_TOKEN_VALUES_V  AS CP2 ON CP2.LOGICAL_RESOURCE_ID = CLR1.LOGICAL_RESOURCE_ID AND (CP2.PARAMETER_NAME_ID = 1249 AND (CP2.TOKEN_VALUE = ? OR CP2.TOKEN_VALUE = ? OR CP2.TOKEN_VALUE = ? AND CP2.CODE_SYSTEM_ID = 45060)) WHERE CP1.TOKEN_VALUE = CLR0.LOGICAL_ID AND CP1.PARAMETER_NAME_ID = 1224 AND CP1.CODE_SYSTEM_ID = 42073)) AS param0 ON LR.LOGICAL_ID = param0.LOGICAL_ID JOIN ( SELECT CLR0.LOGICAL_ID FROM Location_LOGICAL_RESOURCES AS CLR0 JOIN Location_RESOURCES AS CR0 ON CR0.RESOURCE_ID = CLR0.CURRENT_RESOURCE_ID AND CR0.IS_DELETED = 'N' WHERE  EXISTS (SELECT 1 FROM OrganizationAffiliation_TOKEN_VALUES_V AS CP1 JOIN OrganizationAffiliation_LOGICAL_RESOURCES AS CLR1 ON CLR1.LOGICAL_RESOURCE_ID = CP1.LOGICAL_RESOURCE_ID JOIN OrganizationAffiliation_RESOURCES AS CR1 ON CR1.RESOURCE_ID = CLR1.CURRENT_RESOURCE_ID AND CR1.IS_DELETED = 'N' JOIN OrganizationAffiliation_TOKEN_VALUES_V  AS CP2 ON CP2.LOGICAL_RESOURCE_ID = CLR1.LOGICAL_RESOURCE_ID AND (CP2.PARAMETER_NAME_ID = 1355 AND ((CP2.TOKEN_VALUE = ?))) WHERE CP1.TOKEN_VALUE = CLR0.LOGICAL_ID AND CP1.PARAMETER_NAME_ID = 1224 AND CP1.CODE_SYSTEM_ID = 42073)) AS param1 ON LR.LOGICAL_ID = param1.LOGICAL_ID ORDER BY R.LOGICAL_RESOURCE_ID ASC  OFFSET 30 ROWS FETCH NEXT 10 ROWS ONLY) RESULT  UNION ALL SELECT R.RESOURCE_ID, R.LOGICAL_RESOURCE_ID, R.VERSION_ID, R.LAST_UPDATED, R.IS_DELETED, R.DATA, LR.LOGICAL_ID FROM OrganizationAffiliation_RESOURCES R JOIN OrganizationAffiliation_LOGICAL_RESOURCES LR ON R.LOGICAL_RESOURCE_ID=LR.LOGICAL_RESOURCE_ID AND R.RESOURCE_ID = LR.CURRENT_RESOURCE_ID JOIN OrganizationAffiliation_TOKEN_VALUES_V P1 ON P1.LOGICAL_RESOURCE_ID = R.LOGICAL_RESOURCE_ID WHERE R.IS_DELETED = 'N' AND P1.PARAMETER_NAME_ID=1224 AND P1.CODE_SYSTEM_ID=42073 AND P1.TOKEN_VALUE IN (SELECT LR.LOGICAL_ID  FROM Location_LOGICAL_RESOURCES LR  JOIN Location_RESOURCES R ON R.RESOURCE_ID = LR.CURRENT_RESOURCE_ID  AND R.IS_DELETED = 'N'  JOIN ( SELECT CLR0.LOGICAL_ID FROM Location_LOGICAL_RESOURCES AS CLR0 JOIN Location_RESOURCES AS CR0 ON CR0.RESOURCE_ID = CLR0.CURRENT_RESOURCE_ID AND CR0.IS_DELETED = 'N' WHERE  EXISTS (SELECT 1 FROM OrganizationAffiliation_TOKEN_VALUES_V AS CP1 JOIN OrganizationAffiliation_LOGICAL_RESOURCES AS CLR1 ON CLR1.LOGICAL_RESOURCE_ID = CP1.LOGICAL_RESOURCE_ID JOIN OrganizationAffiliation_RESOURCES AS CR1 ON CR1.RESOURCE_ID = CLR1.CURRENT_RESOURCE_ID AND CR1.IS_DELETED = 'N' JOIN OrganizationAffiliation_TOKEN_VALUES_V  AS CP2 ON CP2.LOGICAL_RESOURCE_ID = CLR1.LOGICAL_RESOURCE_ID AND (CP2.PARAMETER_NAME_ID = 1249 AND (CP2.TOKEN_VALUE = ? OR CP2.TOKEN_VALUE = ? OR CP2.TOKEN_VALUE = ? AND CP2.CODE_SYSTEM_ID = 45060)) WHERE CP1.TOKEN_VALUE = CLR0.LOGICAL_ID AND CP1.PARAMETER_NAME_ID = 1224 AND CP1.CODE_SYSTEM_ID = 42073)) AS param0 ON LR.LOGICAL_ID = param0.LOGICAL_ID JOIN ( SELECT CLR0.LOGICAL_ID FROM Location_LOGICAL_RESOURCES AS CLR0 JOIN Location_RESOURCES AS CR0 ON CR0.RESOURCE_ID = CLR0.CURRENT_RESOURCE_ID AND CR0.IS_DELETED = 'N' WHERE  EXISTS (SELECT 1 FROM OrganizationAffiliation_TOKEN_VALUES_V AS CP1 JOIN OrganizationAffiliation_LOGICAL_RESOURCES AS CLR1 ON CLR1.LOGICAL_RESOURCE_ID = CP1.LOGICAL_RESOURCE_ID JOIN OrganizationAffiliation_RESOURCES AS CR1 ON CR1.RESOURCE_ID = CLR1.CURRENT_RESOURCE_ID AND CR1.IS_DELETED = 'N' JOIN OrganizationAffiliation_TOKEN_VALUES_V  AS CP2 ON CP2.LOGICAL_RESOURCE_ID = CLR1.LOGICAL_RESOURCE_ID AND (CP2.PARAMETER_NAME_ID = 1355 AND ((CP2.TOKEN_VALUE = ?))) WHERE CP1.TOKEN_VALUE = CLR0.LOGICAL_ID AND CP1.PARAMETER_NAME_ID = 1224 AND CP1.CODE_SYSTEM_ID = 42073)) AS param1 ON LR.LOGICAL_ID = param1.LOGICAL_ID ORDER BY R.LOGICAL_RESOURCE_ID ASC  OFFSET 30 ROWS FETCH NEXT 10 ROWS ONLY)) COMBINED_RESULTS  searchArgs=[https://cluster1-573846-250babbbe4c3000e15508cd07c1d282b-0000.us-east.containers.appdomain.cloud/Organization/plannet-network-HPID010000, plannet-network-HPID010000, plannet-network-HPID010000, pharmacy, https://cluster1-573846-250babbbe4c3000e15508cd07c1d282b-0000.us-east.containers.appdomain.cloud/Organization/plannet-network-HPID010000, plannet-network-HPID010000, plannet-network-HPID010000, pharmacy] executionTime=5406.138173ms

Anything we can do to speed these up?

@lmsurpre lmsurpre changed the title Investigate / improve performance on specific query from Investigate / improve performance on specific Provider Directory query Jan 18, 2021
@lmsurpre lmsurpre added the performance performance label Jan 18, 2021
@lmsurpre lmsurpre added the cms-interop This issue is associated with the CMS interoperability rule label Jan 27, 2021
@lmsurpre
Copy link
Member Author

Might benefit from #1385

@lmsurpre lmsurpre added this to the Sprint 2021-04 milestone Mar 15, 2021
@punktilious
Copy link
Collaborator

Also the work Mike is doing on #1158 is likely to help here by removing the revinclude from the core query.

@lmsurpre
Copy link
Member Author

After deploying the latest changes, I ran the search from the description:

https://cluster1-573846-250babbbe4c3000e15508cd07c1d282b-0000.us-east.containers.appdomain.cloud/open/Location?_count=10&_has:OrganizationAffiliation:location:network=Organization/plannet-network-HPID010000&_has:OrganizationAffiliation:location:role=pharmacy&_revinclude=OrganizationAffiliation:location:Location

This now results in 3 subsequent queries to the database:

2021-05-11 15:27:55.986 wh-cmsiop-cthon-fhir-internal-6754c44b7-g27cb fhir-internal FINE      query string: 
      SELECT COUNT(*) 
        FROM Location_LOGICAL_RESOURCES AS LR0
       WHERE LR0.IS_DELETED = 'N'
         AND EXISTS (
      SELECT 1 
        FROM Location_LOGICAL_RESOURCES AS LR1 
  INNER JOIN OrganizationAffiliation_TOKEN_VALUES_V AS P2 ON LR1.LOGICAL_ID = P2.TOKEN_VALUE
         AND LR1.VERSION_ID = COALESCE(P2.REF_VERSION_ID,LR1.VERSION_ID)
         AND P2.PARAMETER_NAME_ID = 1224
         AND P2.CODE_SYSTEM_ID = 42073 
  INNER JOIN OrganizationAffiliation_LOGICAL_RESOURCES AS LR2 ON LR2.LOGICAL_RESOURCE_ID = P2.LOGICAL_RESOURCE_ID 
  INNER JOIN OrganizationAffiliation_RESOURCE_TOKEN_REFS AS P3 ON P3.LOGICAL_RESOURCE_ID = LR2.LOGICAL_RESOURCE_ID
         AND P3.PARAMETER_NAME_ID = 1249
         AND (P3.COMMON_TOKEN_VALUE_ID = 36874651) 
  INNER JOIN OrganizationAffiliation_TOKEN_VALUES_V AS P4 ON LR1.LOGICAL_ID = P4.TOKEN_VALUE
         AND LR1.VERSION_ID = COALESCE(P4.REF_VERSION_ID,LR1.VERSION_ID)
         AND P4.PARAMETER_NAME_ID = 1224
         AND P4.CODE_SYSTEM_ID = 42073 
  INNER JOIN OrganizationAffiliation_LOGICAL_RESOURCES AS LR4 ON LR4.LOGICAL_RESOURCE_ID = P4.LOGICAL_RESOURCE_ID 
  INNER JOIN OrganizationAffiliation_RESOURCE_TOKEN_REFS AS P5 ON P5.LOGICAL_RESOURCE_ID = LR4.LOGICAL_RESOURCE_ID
         AND P5.PARAMETER_NAME_ID = 1355
         AND ((P5.COMMON_TOKEN_VALUE_ID IN (40438943,36874605)))
       WHERE LR1.IS_DELETED = 'N'
         AND LR1.LOGICAL_RESOURCE_ID = LR0.LOGICAL_RESOURCE_ID)
2021-05-11 15:27:56.028 wh-cmsiop-cthon-fhir-internal-6754c44b7-g27cb fhir-internal FINE Successfully retrieved count; count=419 [took 41.962133 ms]
2021-05-11 15:27:56.041 wh-cmsiop-cthon-fhir-internal-6754c44b7-g27cb fhir-internal FINE      query string: 
      SELECT R.RESOURCE_ID, R.LOGICAL_RESOURCE_ID, R.VERSION_ID, R.LAST_UPDATED, R.IS_DELETED, R.DATA, LR.LOGICAL_ID 
        FROM (
      SELECT LR0.LOGICAL_RESOURCE_ID, LR0.LOGICAL_ID, LR0.CURRENT_RESOURCE_ID 
        FROM Location_LOGICAL_RESOURCES AS LR0
       WHERE LR0.IS_DELETED = 'N'
         AND EXISTS (
      SELECT 1 
        FROM Location_LOGICAL_RESOURCES AS LR1 
  INNER JOIN OrganizationAffiliation_TOKEN_VALUES_V AS P2 ON LR1.LOGICAL_ID = P2.TOKEN_VALUE
         AND LR1.VERSION_ID = COALESCE(P2.REF_VERSION_ID,LR1.VERSION_ID)
         AND P2.PARAMETER_NAME_ID = 1224
         AND P2.CODE_SYSTEM_ID = 42073 
  INNER JOIN OrganizationAffiliation_LOGICAL_RESOURCES AS LR2 ON LR2.LOGICAL_RESOURCE_ID = P2.LOGICAL_RESOURCE_ID 
  INNER JOIN OrganizationAffiliation_RESOURCE_TOKEN_REFS AS P3 ON P3.LOGICAL_RESOURCE_ID = LR2.LOGICAL_RESOURCE_ID
         AND P3.PARAMETER_NAME_ID = 1249
         AND (P3.COMMON_TOKEN_VALUE_ID = 36874651) 
  INNER JOIN OrganizationAffiliation_TOKEN_VALUES_V AS P4 ON LR1.LOGICAL_ID = P4.TOKEN_VALUE
         AND LR1.VERSION_ID = COALESCE(P4.REF_VERSION_ID,LR1.VERSION_ID)
         AND P4.PARAMETER_NAME_ID = 1224
         AND P4.CODE_SYSTEM_ID = 42073 
  INNER JOIN OrganizationAffiliation_LOGICAL_RESOURCES AS LR4 ON LR4.LOGICAL_RESOURCE_ID = P4.LOGICAL_RESOURCE_ID 
  INNER JOIN OrganizationAffiliation_RESOURCE_TOKEN_REFS AS P5 ON P5.LOGICAL_RESOURCE_ID = LR4.LOGICAL_RESOURCE_ID
         AND P5.PARAMETER_NAME_ID = 1355
         AND ((P5.COMMON_TOKEN_VALUE_ID IN (40438943,36874605)))
       WHERE LR1.IS_DELETED = 'N'
         AND LR1.LOGICAL_RESOURCE_ID = LR0.LOGICAL_RESOURCE_ID)) AS LR 
  INNER JOIN Location_RESOURCES AS R ON LR.CURRENT_RESOURCE_ID = R.RESOURCE_ID
    ORDER BY LR.LOGICAL_RESOURCE_ID
 LIMIT 10
2021-05-11 15:27:56.101 wh-cmsiop-cthon-fhir-internal-6754c44b7-g27cb fhir-internal FINE Successfully retrieved FHIR objects [took 57.189576 ms]
2021-05-11 15:27:56.107 wh-cmsiop-cthon-fhir-internal-6754c44b7-g27cb fhir-internal FINE      query string: 
      SELECT LR.RESOURCE_ID, LR.LOGICAL_RESOURCE_ID, LR.VERSION_ID, LR.LAST_UPDATED, LR.IS_DELETED, R.DATA, LR.LOGICAL_ID 
        FROM (
      SELECT DISTINCT R0.RESOURCE_ID, R0.LOGICAL_RESOURCE_ID, R0.VERSION_ID, R0.LAST_UPDATED, R0.IS_DELETED, LR0.LOGICAL_ID 
        FROM OrganizationAffiliation_LOGICAL_RESOURCES AS LR0 
  INNER JOIN OrganizationAffiliation_TOKEN_VALUES_V AS P1 ON LR0.LOGICAL_RESOURCE_ID = P1.LOGICAL_RESOURCE_ID
         AND P1.PARAMETER_NAME_ID = 1224
         AND P1.CODE_SYSTEM_ID = 42073 
  INNER JOIN Location_LOGICAL_RESOURCES AS LR1 ON LR1.LOGICAL_ID = P1.TOKEN_VALUE
         AND COALESCE(P1.REF_VERSION_ID,LR1.VERSION_ID) = LR1.VERSION_ID
         AND LR1.LOGICAL_RESOURCE_ID IN (31550045,31550055,31550043,31550053,31550051,31550061,31550049,31550059,31550047,31550057)
         AND LR1.IS_DELETED = 'N' 
  INNER JOIN OrganizationAffiliation_RESOURCES AS R0 ON LR0.CURRENT_RESOURCE_ID = R0.RESOURCE_ID) AS LR 
  INNER JOIN OrganizationAffiliation_RESOURCES AS R ON LR.RESOURCE_ID = R.RESOURCE_ID
    ORDER BY LR.LOGICAL_RESOURCE_ID
 LIMIT 1001
2021-05-11 15:27:56.113 wh-cmsiop-cthon-fhir-internal-6754c44b7-g27cb fhir-internal FINE Successfully retrieved FHIR objects [took 4.896218 ms]

The entire request completes in about 0.2 seconds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cms-interop This issue is associated with the CMS interoperability rule P2 Priority 2 - Should Have performance performance triage-performance triage-search
Projects
None yet
Development

No branches or pull requests

2 participants