Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(schema-compiler): fix FILTER_PARAMS to populate set and notSet filters in cube's SQL query #8937

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

sumeetgajjar
Copy link

@sumeetgajjar sumeetgajjar commented Nov 9, 2024

The presence of the following filters in a cube query: set, notSet, eq NULL or NOT eq NULL renders the corresponding FILTER_PARAMS as 1 = 1 irrespective of the cube or filter.

This is because the renderFilterParams method skips invoking filter.conditionSql when converting a filter to its corresponding SQL due to an empty filterParams array.

For set and notSet, the filter cannot have value(s).
For eq NULL or NOT eq NULL the null value is filtered by:

const params = this.valuesArray().filter(v => v != null);

This leads to an empty filterParams array, thus, skipping the filter SQL, thereby returning 1 = 1.


The issue becomes more serious when the query contains multiple filters on the same dimension.
Consider the following example:

Cube

sql: >
  SELECT
    *
  FROM
    my_partitioned_table
  WHERE
    {FILTER_PARAMS.MyCube.creation_time.filter('creation_time')}

Query

{
    "or":
    [
        {
            "member": "MyCube.creation_time",
            "operator": "afterDate",
            "values":
            [
                "2024-10-31T21:00:00.000Z"
            ]
        },
        {
            "member": "MyCube.creation_time",
            "operator": "notSet"
        }
    ]
}

In the above case, the rendered query would be of the form:

SELECT
  *
FROM
    my_partitioned_table
  WHERE
    (creation_time >=  '2024-10-31T21:00:00.000Z' or (1 = 1))

The right branch of the OR predicate will always evaluate TRUE, thus, the majority of the SQL optimizers will remove the LEFT branch altogether to prevent unnecessary evaluations, thereby defeating the purpose of the FILTER_PARAMS filter pushdown.


This PR aims to fix the bug by restructuring the nested if-else block, thereby also improving the readability of the same.

Check List

  • Tests has been run in packages where changes made if available
  • Linter has been run for changed code
  • Tests for the changes have been added if not covered yet
  • Docs have been added / updated if required

@sumeetgajjar sumeetgajjar requested a review from a team as a code owner November 9, 2024 00:58
Copy link

vercel bot commented Nov 9, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

8 Skipped Deployments
Name Status Preview Comments Updated (UTC)
examples-angular-dashboard ⬜️ Ignored (Inspect) Visit Preview Nov 11, 2024 0:16am
examples-react-d3 ⬜️ Ignored (Inspect) Visit Preview Nov 11, 2024 0:16am
examples-react-dashboard ⬜️ Ignored (Inspect) Visit Preview Nov 11, 2024 0:16am
examples-react-data-table ⬜️ Ignored (Inspect) Visit Preview Nov 11, 2024 0:16am
examples-react-highcharts ⬜️ Ignored (Inspect) Visit Preview Nov 11, 2024 0:16am
examples-react-material-ui ⬜️ Ignored (Inspect) Visit Preview Nov 11, 2024 0:16am
examples-react-pivot-table ⬜️ Ignored (Inspect) Visit Preview Nov 11, 2024 0:16am
examples-vue-query-builder ⬜️ Ignored (Inspect) Visit Preview Nov 11, 2024 0:16am

@github-actions github-actions bot added the pr:community Contribution from Cube.js community members. label Nov 9, 2024
@sumeetgajjar
Copy link
Author

Hi @KSDaemon, @RusovDmitriy, @paveltiunov - can you please review this PR?

Thanks in advance.

@KSDaemon
Copy link
Member

KSDaemon commented Nov 9, 2024

Hi @sumeetgajjar Thank you for contributing! I'll have a look!

@sumeetgajjar
Copy link
Author

sumeetgajjar commented Nov 9, 2024

3 tests under postgres/pre-aggregations.test.js are failing.

This is primarily because the visitors cube during the pre-aggregate generation uses the following SQL:

select * from visitors WHERE ((created_at >= undefined AND created_at <= undefined))\n' +
            '      AND (1 = 1)\n' +

Earlier (before this change) SQL was

select * from visitors WHERE ((1 = 1))\n' +
            '      AND (1 = 1)\n' +

cube(`visitors`, {
      sql: `
      select * from visitors WHERE ${FILTER_PARAMS.visitors.createdAt.filter('created_at')}
      AND ${FILTER_PARAMS.ReferenceOriginalSql.createdAt.filter('created_at')}
      `,

This makes sense as the time dimension filter values are may be unavailable during pre-aggregate generation.

@sumeetgajjar
Copy link
Author

3 tests under postgres/pre-aggregations.test.js are failing.

This is primarily because the visitors cube during the pre-aggregate generation uses the following SQL:

select * from visitors WHERE ((created_at >= undefined AND created_at <= undefined))\n' +
            '      AND (1 = 1)\n' +

Earlier (before this change) SQL was

select * from visitors WHERE ((1 = 1))\n' +
            '      AND (1 = 1)\n' +
cube(`visitors`, {
      sql: `
      select * from visitors WHERE ${FILTER_PARAMS.visitors.createdAt.filter('created_at')}
      AND ${FILTER_PARAMS.ReferenceOriginalSql.createdAt.filter('created_at')}
      `,

This makes sense as the time dimension filter values are may be unavailable during pre-aggregate generation.

Given each of the DATE_OPERATORS_Where methods (i.e. inDateRangeWhere, notInDateRangeWhere, etc) exactly knows its required parameters i.e.

  • for inDateRangeWhere both from and to are required
  • for beforeDateWhere only before is required
  • etc

One fix could be modifying each DATE_OPERATORS_Where method to return 1 = 1 when any required parameters are undefined.

@KSDaemon - Does the above proposal sound good to you?

@sumeetgajjar
Copy link
Author

Hi @sumeetgajjar Thank you for contributing! I'll have a look!

Hi @KSDaemon - gentle ping, can you please take a look?

This bug is directly impacting our production workflows and is leading to incorrect data being shown to the users.

@KSDaemon
Copy link
Member

KSDaemon commented Nov 13, 2024

Does the above proposal sound good to you?

That makes sense.

Seems to be good. I relaunched tests.

Copy link
Member

@KSDaemon KSDaemon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 LGTM!

@sumeetgajjar
Copy link
Author

👍🏻 LGTM!

Thanks for the review @KSDaemon.

A couple of questions,

  • can you please help us understand the release process and when will the above change be available for public consumption?
  • can this fix be applied to the LTS branch?

@sumeetgajjar
Copy link
Author

Hi @KSDaemon, @paveltiunov, @igorlukanin - can you please help us merge this PR given it is already approved 😄?

@KSDaemon
Copy link
Member

Hi @sumeetgajjar, Sorry for the late answers. Yeah, everything looks right. I think the one missing piece is real integration tests. Please add some tests to packages/cubejs-schema-compiler/test/integration/postgres/sql-generation.test.ts to ensure that real queries are created and run successfully. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr:community Contribution from Cube.js community members.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants