fix(chart-filter): Avoid column denormalization if not enabled #26199

Vitor-Avila · 2023-12-06T20:54:00Z

SUMMARY

Avoid de-normalizing column names when the engine supports it but column normalization is enabled at the dataset level. Normalization is enabled by default for datasets created prior to this feature, to make sure that syncing columns wouldn't break existing charts, etc.
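
The rule described above can be sketched as follows. This is a minimal, self-contained illustration rather than Superset's actual code: `FakeEngineSpec`, `Dataset`, and `column_name_for_query` are hypothetical stand-ins, and upper-casing is only an assumed example of what an engine's de-normalization might do.

```python
from dataclasses import dataclass


class FakeEngineSpec:
    """Hypothetical engine spec; assumes de-normalizing means upper-casing."""

    @staticmethod
    def denormalize_name(name: str) -> str:
        return name.upper()


@dataclass
class Dataset:
    """Hypothetical dataset carrying the dataset-level normalization setting."""

    normalize_columns: bool


def column_name_for_query(dataset: Dataset, column_name: str) -> str:
    """Return the column name to use when querying filter values."""
    # Only de-normalize when normalization is disabled for the dataset;
    # otherwise keep the name exactly as it is stored on the dataset.
    if dataset.normalize_columns:
        return column_name
    return FakeEngineSpec.denormalize_name(column_name)


print(column_name_for_query(Dataset(normalize_columns=True), "order_id"))
print(column_name_for_query(Dataset(normalize_columns=False), "order_id"))
```

With normalization enabled the stored name is passed through untouched; only datasets with normalization disabled go through the engine's de-normalization step.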

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Before

Chart.Filter.issue.-.new.mov

After

Chart.Filter.fixed.-.new.mov

TESTING INSTRUCTIONS

1. Create a dataset powered by an engine that supports column de-normalization (such as Snowflake). Note that:
   a. All columns are uppercase.
   b. Column normalization is disabled (under the SETTINGS tab).
2. Modify the dataset, and enable column normalization.
3. Save changes.
4. Modify the dataset again, and sync columns. Note that all columns are now lowercase.
5. Save changes.
6. Create a new chart using this dataset, and drop any column in the FILTERS section.
7. Validate the filter is showing available options in the dropdown.

ADDITIONAL INFORMATION

Has associated issue: Fixes Chart filter options are not populated for datasets with normalize column enabled #26198
Required feature flags:
Changes UI
Includes DB Migration (follow approval process in SIP-59)
- Migration is atomic, supports rollback & is backwards-compatible
- Confirm DB migration upgrade and downgrade tested
- Runtime estimates and downtime expectations provided
Introduces new feature or API
Removes existing feature or API

codecov · 2023-12-06T20:59:12Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (4d4b19e) 69.18% compared to head (7953e45) 69.18%.
Report is 2 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master   #26199   +/-   ##
=======================================
  Coverage   69.18%   69.18%           
=======================================
  Files        1944     1944           
  Lines       75925    75928    +3     
  Branches     8451     8451           
=======================================
+ Hits        52531    52534    +3     
  Misses      21209    21209           
  Partials     2185     2185

Flag	Coverage Δ
hive	`53.68% <25.00%> (-0.01%)`	⬇️
mysql	`78.10% <100.00%> (+<0.01%)`	⬆️
postgres	`78.19% <100.00%> (+<0.01%)`	⬆️
presto	`53.64% <25.00%> (-0.01%)`	⬇️
python	`82.88% <100.00%> (+<0.01%)`	⬆️
sqlite	`76.85% <100.00%> (+<0.01%)`	⬆️
unit	`55.81% <25.00%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown.

betodealmeida

This makes sense. The original naming could be better and definitely needs a cleanup; it seems like we're using "denormalize" for "don't normalize". Ideally we'd have a boolean feature called something like needs_normalization, and we'd do normalization/denormalization as needed; otherwise we wouldn't touch column names.
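
One way to picture the suggested cleanup is the sketch below. It is purely illustrative: `ColumnNaming` and its methods are hypothetical, `needs_normalization` is just the name floated in the comment above, and lower-casing/upper-casing is an assumed stand-in for whatever a given engine actually does.

```python
from dataclasses import dataclass


@dataclass
class ColumnNaming:
    """Hypothetical helper built around a single boolean flag."""

    needs_normalization: bool

    def to_stored(self, engine_name: str) -> str:
        """Convert an engine-reported name to the name stored on the dataset."""
        # Assumption: normalizing means lower-casing an all-uppercase name.
        if self.needs_normalization and engine_name.isupper():
            return engine_name.lower()
        return engine_name

    def to_engine(self, stored_name: str) -> str:
        """Convert a stored name back to the form the engine expects."""
        # Assumption: de-normalizing is the inverse, upper-casing a lowercase name.
        if self.needs_normalization and stored_name.islower():
            return stored_name.upper()
        return stored_name


naming = ColumnNaming(needs_normalization=True)
print(naming.to_stored("ORDER_ID"), naming.to_engine("order_id"))
```

When the flag is off, both directions leave the name untouched, which matches the "otherwise we wouldn't touch column names" part of the suggestion.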

Vitor-Avila · 2023-12-07T12:54:51Z

thanks @betodealmeida. @hughhhh @villebro I know you've worked recently with this feature. Do you have any concerns with this PR? thank you!

sadpandajoe · 2023-12-07T17:35:31Z

superset/models/helpers.py

@@ -1340,14 +1340,19 @@ def get_time_filter( # pylint: disable=too-many-arguments
 )
 return and_(*l)

- def values_for_column(self, column_name: str, limit: int = 10000) -> list[Any]:
- # always denormalize column name before querying for values
+ def values_for_column(


Do our unit tests already cover whether denormalize_column returns the correct values for both true and false? If not, I would add tests to confirm that the correct values are returned when the flag is true, since we're defaulting to false.

Vitor-Avila · 2023-12-08

@sadpandajoe @villebro I just added some basic tests in #26220 (since this one got merged). They don't test the logic implemented at the DB engine level to denormalize a column (I believe this is DB-specific and would require a more complex setup), but they should at least test the business logic.
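
For readers wondering what a business-logic test of this kind could look like, here is a rough sketch with the engine-level step mocked out. It is not the test added in #26220: `resolve_column_name` is a hypothetical stand-in for the code under test, and the mocked `denormalize_name` return value is arbitrary.

```python
from unittest.mock import MagicMock


def resolve_column_name(engine_spec, column_name, denormalize_column=False):
    """Hypothetical stand-in for the name handling under test."""
    if denormalize_column:
        return engine_spec.denormalize_name(column_name)
    return column_name


def test_denormalize_flag():
    # Cover both values of the flag, as requested in the review.
    for flag, expected in [(True, "COL"), (False, "col")]:
        engine_spec = MagicMock()
        engine_spec.denormalize_name.return_value = "COL"
        assert resolve_column_name(engine_spec, "col", flag) == expected
        # The engine hook should be called only when the flag is set.
        assert engine_spec.denormalize_name.called is flag


test_denormalize_flag()
```

Mocking the engine spec keeps the test independent of any particular database, at the cost of not exercising the DB-specific de-normalization itself.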

villebro

LGTM, thanks for the fix. My apologies for contributing to the confusing naming here; it seems I hadn't thought it through fully (is it really denormalizing, or just not normalizing, etc.).

villebro · 2023-12-07T17:39:14Z

superset/models/helpers.py

@@ -1340,14 +1340,19 @@ def get_time_filter( # pylint: disable=too-many-arguments
 )
 return and_(*l)

- def values_for_column(self, column_name: str, limit: int = 10000) -> list[Any]:
- # always denormalize column name before querying for values
+ def values_for_column(


This is a good idea. However, this part of the codebase is very tricky to add tests for, so it may be difficult. If it's easy to add the test I suggest doing it, otherwise LGTM

michael-s-molina · 2023-12-07T22:03:45Z

@Vitor-Avila Tip: If you include the text "Fixes: #26198" in the PR description, when the PR is merged, the issue is automatically closed. This only works for the description, not comments.

michael-s-molina · 2023-12-07T22:06:07Z

I don't know why, but when it's part of a checkbox (Has associated issue in the template), it does not work. To see if the link worked, you can check the original issue for a message saying that the issue will be closed by the PR.

Vitor-Avila · 2023-12-08T01:43:37Z

@michael-s-molina thanks for the tip! I remember a previous PR I created did automatically close the bug, but I never understood why. I'll make sure to include that outside of the checkbox next time 🙌

fix(chart-filter): Avoid column denormalization if not enabled

510af9d

pull-request-size bot added the size/S label Dec 6, 2023

betodealmeida approved these changes Dec 7, 2023

Fix pylint issues

7953e45

sadpandajoe reviewed Dec 7, 2023

villebro approved these changes Dec 7, 2023

eschutho merged commit 05d7060 into apache:master Dec 8, 2023
33 checks passed

Vitor-Avila mentioned this pull request Dec 8, 2023

chore(tests): Add tests to the column denormalization flow #26220

Merged

9 tasks

michael-s-molina added the v3.0 and v3.1 labels (added by the release manager to track PRs to be included in the 3.0 and 3.1 branches) Dec 8, 2023

michael-s-molina pushed a commit that referenced this pull request Dec 8, 2023

fix(chart-filter): Avoid column denormalization if not enabled (#26199)

8eb6bbb

(cherry picked from commit 05d7060)

michael-s-molina pushed a commit that referenced this pull request Dec 8, 2023

fix(chart-filter): Avoid column denormalization if not enabled (#26199)

b699df7

(cherry picked from commit 05d7060)

jinghua-qa pushed a commit to preset-io/superset that referenced this pull request Dec 8, 2023

fix(chart-filter): Avoid column denormalization if not enabled (apach…

47abaea

…e#26199) (cherry picked from commit 05d7060)

jinghua-qa added the preset:2023.49 label Dec 8, 2023

sadpandajoe pushed a commit to preset-io/superset that referenced this pull request Dec 11, 2023

fix(chart-filter): Avoid column denormalization if not enabled (apach…

3395460

…e#26199) (cherry picked from commit 05d7060)

josedev-union pushed a commit to Ortege-xyz/studio that referenced this pull request Jan 22, 2024

fix(chart-filter): Avoid column denormalization if not enabled (apach…

581877c

…e#26199) (cherry picked from commit 05d7060)

mistercrunch added the 🍒 3.0.3, 🍒 3.0.4, 🍒 3.1.0, 🍒 3.1.1, and 🏷️ bot labels Mar 8, 2024 (🏷️ bot is a label used by `supersetbot` to keep track of which PRs were auto-tagged with release labels)

sfirke pushed a commit to sfirke/superset that referenced this pull request Mar 22, 2024

fix(chart-filter): Avoid column denormalization if not enabled (apach…

0f75003

…e#26199)
