perf(sqla): avoid unnecessary type check on adhoc column #23491

Merged: villebro merged 3 commits into apache:master from villebro/base_axis_timegrain on Mar 31, 2023

@villebro villebro commented Mar 27, 2023

SUMMARY

PR #21163 introduced support for time grains on the base axis for both regular and adhoc columns. This caused a performance regression: a query is executed against the analytical database for every adhoc column to check whether it is a temporal expression.

This PR changes the logic slightly so that the query is executed only if needed, i.e. the base column is adhoc and has a time grain, or if the adhoc column is part of an adhoc filter. To review the PR, please use "Hide whitespace" to see the real changes more clearly.
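The condition described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Superset code: `AdhocColumn`, `needs_type_check`, `count_type_check_queries`, and the `probe` callback are all hypothetical names, and the real change lives in the SQLA model (the diff only shows that `adhoc_column_to_sqla` takes a `force_type_check` argument).

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Set


@dataclass
class AdhocColumn:
    """Hypothetical stand-in for an adhoc column definition."""

    label: str
    sql_expression: str
    is_base_axis: bool = False
    time_grain: Optional[str] = None


def needs_type_check(col: AdhocColumn, filter_labels: Set[str]) -> bool:
    """Return True only when the column's type actually has to be probed."""
    # Case 1: the adhoc column is used in an adhoc filter.
    if col.label in filter_labels:
        return True
    # Case 2: the adhoc column is the base axis and has a time grain.
    return col.is_base_axis and col.time_grain is not None


def count_type_check_queries(
    columns: Iterable[AdhocColumn],
    filter_labels: Set[str],
    probe: Callable[[str], None],
) -> int:
    """Call `probe` (assumed to hit the database) only where needed."""
    issued = 0
    for col in columns:
        if needs_type_check(col, filter_labels):
            probe(col.sql_expression)
            issued += 1
    return issued


# Two plain adhoc columns: no probe queries should be issued.
plain = [AdhocColumn("a", "UPPER(name)"), AdhocColumn("b", "LOWER(city)")]
sent_plain: list = []
plain_count = count_type_check_queries(plain, set(), sent_plain.append)

# A base-axis adhoc column with a time grain still triggers one probe.
with_axis = plain + [AdhocColumn("ts", "CAST(d AS DATE)", True, "P1D")]
sent_axis: list = []
axis_count = count_type_check_queries(with_axis, set(), sent_axis.append)
```

Under these assumptions, a chart with only plain adhoc columns issues no extra type-check queries, which matches the single query reported in the AFTER section.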

AFTER

On a chart with two normal adhoc columns, the query now executes in 7 seconds with only a single query being sent to the database:

image

The same chart previously executed in 22 seconds (roughly 3 × 7 seconds, with 3 queries sent to the database):

image

TESTING INSTRUCTIONS

  1. create a chart with multiple adhoc columns
  2. look at debug logs and notice an additional query being sent to the database for each adhoc column added

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@villebro villebro changed the title from "perf(sqla): avoid unnecessary type check on adhoc column" to "[WIP] perf(sqla): avoid unnecessary type check on adhoc column" Mar 27, 2023
@zhaoyongjie zhaoyongjie self-requested a review March 28, 2023 00:10
@zhaoyongjie zhaoyongjie left a comment

The change LGTM: the code logic is the same as before, but performance improves. I remember that unit tests cover this part, so the change is safe as well.

@villebro villebro force-pushed the villebro/base_axis_timegrain branch from 2f32401 to de0c2bf Compare March 31, 2023 11:25

codecov bot commented Mar 31, 2023

Codecov Report

Merging #23491 (5f2c3ce) into master (500d900) will decrease coverage by 0.14%.
The diff coverage is 85.10%.

❗ Current head 5f2c3ce differs from pull request most recent head 3a8b4db. Consider uploading reports for the commit 3a8b4db to get more accurate results

@@            Coverage Diff             @@
##           master   #23491      +/-   ##
==========================================
- Coverage   67.66%   67.53%   -0.14%     
==========================================
  Files        1914     1914              
  Lines       73936    73962      +26     
  Branches     8022     8029       +7     
==========================================
- Hits        50028    49949      -79     
- Misses      21864    21967     +103     
- Partials     2044     2046       +2     
Flag Coverage Δ
hive ?
mysql 78.47% <100.00%> (?)
postgres 78.55% <100.00%> (+<0.01%) ⬆️
presto ?
python 82.07% <100.00%> (-0.28%) ⬇️
sqlite 77.05% <100.00%> (+<0.01%) ⬆️
unit 52.62% <25.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...nd/src/dashboard/components/FiltersBadge/index.tsx 84.61% <0.00%> (-1.33%) ⬇️
.../nativeFilters/FilterBar/CrossFilters/Vertical.tsx 10.00% <0.00%> (-1.12%) ⬇️
.../nativeFilters/FilterBar/CrossFilters/selectors.ts 10.00% <0.00%> (ø)
...rc/dashboard/components/nativeFilters/selectors.ts 57.36% <0.00%> (-0.91%) ⬇️
superset/views/base.py 76.87% <ø> (ø)
...hboard/components/nativeFilters/FilterBar/utils.ts 68.18% <85.71%> (+8.18%) ⬆️
...frontend/src/views/CRUD/alert/AlertReportModal.tsx 54.07% <92.30%> (+0.76%) ⬆️
...ilters/FilterBar/FilterControls/FilterControls.tsx 69.73% <100.00%> (+0.40%) ⬆️
.../components/nativeFilters/FilterBar/Horizontal.tsx 96.00% <100.00%> (+0.16%) ⬆️
superset/config.py 91.86% <100.00%> (+0.07%) ⬆️
... and 2 more

... and 9 files with indirect coverage changes


@villebro villebro changed the title from "[WIP] perf(sqla): avoid unnecessary type check on adhoc column" to "perf(sqla): avoid unnecessary type check on adhoc column" Mar 31, 2023
@villebro villebro force-pushed the villebro/base_axis_timegrain branch from ec6fa0f to 3a8b4db Compare March 31, 2023 14:07
@villebro villebro requested a review from kgabryje March 31, 2023 14:08
sqla_col = self.adhoc_column_to_sqla(
    col=flt_col,
    force_type_check=True,
    template_processor=template_processor,

bycatch - the template processor was missing here

@@ -28,7 +28,7 @@


@validate_column_args("index", "columns")
-def pivot(  # pylint: disable=too-many-arguments,too-many-locals
+def pivot(  # pylint: disable=too-many-arguments

bycatch: a warning that pylint picked up but wasn't regarded as an error

@villebro villebro merged commit ee9ef24 into apache:master Mar 31, 2023
@villebro villebro deleted the villebro/base_axis_timegrain branch March 31, 2023 15:19
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 3.0.0 labels Mar 13, 2024