-
Notifications
You must be signed in to change notification settings - Fork 13.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(chart-filter): Avoid column denormalization if not enabled #26199
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #26199 +/- ##
=======================================
Coverage 69.18% 69.18%
=======================================
Files 1944 1944
Lines 75925 75928 +3
Branches 8451 8451
=======================================
+ Hits 52531 52534 +3
Misses 21209 21209
Partials 2185 2185
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense. The original naming could be better and definitely needs a cleanup, seems like we're using "denormalize" for "don't normalize". Ideally we'd have a boolean feature called something like needs_normalization
, and we'd do normalization/denormalization as needed, otherwise we wouldn't touch column names.
thanks @betodealmeida. @hughhhh @villebro I know you've worked recently with this feature. Do you have any concerns with this PR? thank you! |
@@ -1340,14 +1340,19 @@ def get_time_filter( # pylint: disable=too-many-arguments | |||
) | |||
return and_(*l) | |||
|
|||
def values_for_column(self, column_name: str, limit: int = 10000) -> list[Any]: | |||
# always denormalize column name before querying for values | |||
def values_for_column( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do our unit tests already have cases where it covers if denormalize_column
returns the correct values for both true and false already? If not, I would add tests to confirm that the correct things are returned for when the flag is true since we're defaulting to false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good idea. However, this part of the codebase is very tricky to add tests for, so it may be difficult. If it's easy to add the test I suggest doing it, otherwise LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sadpandajoe @villebro I just added some basic tests to #26220 (since this one got merged). It doesn't test the logic implemented in the DB engine level to denormalize a column (I believe this is DB-specific and would require a more complex setup), but should be at least testing the business logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the fix. My apologies for contributing to the confusing naming here, it seems I hadn't thought it fully through (is it really in fact denormalizing, or not normalizing etc)..
@@ -1340,14 +1340,19 @@ def get_time_filter( # pylint: disable=too-many-arguments | |||
) | |||
return and_(*l) | |||
|
|||
def values_for_column(self, column_name: str, limit: int = 10000) -> list[Any]: | |||
# always denormalize column name before querying for values | |||
def values_for_column( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good idea. However, this part of the codebase is very tricky to add tests for, so it may be difficult. If it's easy to add the test I suggest doing it, otherwise LGTM
@Vitor-Avila Tip: If you include the text "Fixes: #26198" in the PR description, when the PR is merged, the issue is automatically closed. This only works for the description, not comments. |
I don't know why but when it's part of a checkbox ( |
@michael-s-molina thanks for the tip! I remember a previous PR I created did automatically closed the bug, but I never understood why. I'll make sure to include that out of the checkbox next time 🙌 |
(cherry picked from commit 05d7060)
(cherry picked from commit 05d7060)
SUMMARY
Avoid de-normalizing column names in case the engine supports it but column normalization is enabled in the dataset level. This is enabled by default to datasets created prior to this feature, to make sure that syncing columns wouldn't break existing charts/etc.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Before
Chart.Filter.issue.-.new.mov
After
Chart.Filter.fixed.-.new.mov
TESTING INSTRUCTIONS
a. All columns are uppercase.
b. Column normalization is disabled (under the SETTINGS tab).
ADDITIONAL INFORMATION