Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CubeSQL] Regression past 1.1.4 #8967

Closed
pauldheinrichs opened this issue Nov 18, 2024 · 5 comments
Closed

[CubeSQL] Regression past 1.1.4 #8967

pauldheinrichs opened this issue Nov 18, 2024 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@pauldheinrichs
Copy link

pauldheinrichs commented Nov 18, 2024

Describe the bug
Queries generated via metabase are breaking pretty frequently past 1.1.4 when containing time clauses and groupings on time.

To Reproduce
Steps to reproduce the behavior:
1: leverage the following cube in a postgres / redshift env

Minimally reproducible Cube Schema

cubes:
  - name: test_cube
    sql: SELECT CURRENT_DATE as day_one, 1 as some_val

    dimensions:
      - name: day_one
        sql: day_one
        type: time
        primary_key: true
        public: true

      - name: some_val
        sql: some_val
        type: number
  

The following queries fail

SELECT
  CEIL(EXTRACT(DOY FROM DATE_TRUNC('week', "public"."test_cube"."day_one")) / 7.0) AS week_of_year,
  count(distinct some_val)
FROM
  "public"."test_cube"
WHERE
  "public"."test_cube"."day_one" >= DATE_TRUNC('year', NOW())
  AND "public"."test_cube"."day_one" < DATE_TRUNC('year', NOW() + INTERVAL '1 year')
GROUP BY
  week_of_year
ORDER BY
  week_of_year ASC;

with

Internal error: Can't generate SQL for literal: TimestampNanosecond(1704067200000000000, Some("UTC")). This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
QUERY: SELECT CEIL(EXTRACT(DOY FROM DATE_TRUNC('week', "public"."test_cube"."day_one")) / 7.0) AS week_of_year, count(DISTINCT some_val) FROM "public"."test_cube" WHERE "public"."test_cube"."day_one" >= DATE_TRUNC('year', NOW()) AND "public"."test_cube"."day_one" < DATE_TRUNC('year', NOW() + INTERVAL '1 year') GROUP BY week_of_year ORDER BY week_of_year ASC

I have smoke tested the query in several ways and here are the outcomes when modifying

|---|---|

Adjustment Pass / fail
Remove count(distinct some_val) Pass
Remove WHERE clause Pass
adjust count distinct to count Fail
Remove ordering fail
Reduce time cast complexity DATE_TRUNC('week', "public"."test_cube"."day_one") fail

Some failing examples

SELECT
  DATE_TRUNC('week', "public"."test_cube"."day_one") AS week_of_year,
  count(distinct some_val)
FROM
  "public"."test_cube"
WHERE
  "public"."test_cube"."day_one" >= DATE_TRUNC('year', NOW())
  AND "public"."test_cube"."day_one" < DATE_TRUNC('year', NOW() + INTERVAL '1 year')
GROUP BY
  1

Some passing example queries from the above chart

SELECT
  DATE_TRUNC('week', "public"."test_cube"."day_one") AS week_of_year,
  count(distinct some_val)
FROM
  "public"."test_cube"
GROUP BY
  1
SELECT
  DATE_TRUNC('week', "public"."test_cube"."day_one") AS week_of_year
FROM
  "public"."test_cube"
@paveltiunov paveltiunov added the bug Something isn't working label Nov 18, 2024
@mcheshkov
Copy link
Member

Hi @pauldheinrichs. Big thanks for a detailed report.

I managed to simplify query to this in our data model for unit tests:

SELECT
    customer_gender
FROM KibanaSampleDataEcommerce
WHERE
    order_date >= NOW()
    AND
    LOWER(customer_gender) = 'foo'
;

It breaks with same Can't generate SQL for literal: TimestampNanosecond(..., Some(\"UTC\"))..
I need a bit more time to find the culprit.

@mcheshkov
Copy link
Member

mcheshkov commented Nov 19, 2024

Localized the issue, it's a combination of NOW() and SQL pushdown query.
Constant folding at late stage evaluates NOW() with DataFusion, and turns it into ScalarValue::TimestampNanosecond(..., Some("UTC")), and this breaks later during SQL generation for pushdown.

Both NOW and pushdown are important here, which is why when you removed either WHERE (that had NOW()) or count(distinct dimension) (that triggered pushdown) it worked. In my example LOWER(dimension) triggered pushdown.

@mcheshkov
Copy link
Member

mcheshkov commented Nov 20, 2024

I've just merged #8971, should be fixed in next release. I'll ping again when release is ready.

@mcheshkov
Copy link
Member

v1.1.7 was just released, it's already available on Docker Hub, soon will be on Cloud.

@pauldheinrichs
Copy link
Author

@mcheshkov Thanks for the quick turnaround here! Appreciate it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants