
DATE_TRUNC('day', date_day) not supported #49

Closed
2 tasks done
BigBerny opened this issue Jul 13, 2024 · 5 comments · Fixed by #34 or #95
Labels: bug (Something isn't working), good first issue (Good for newcomers), priority-high (High priority issue)

Comments

@BigBerny commented Jul 13, 2024

What happens?

When running this query I get an error:

SELECT DATE_TRUNC('day', date_day), AVG("queryLatency")
FROM predictions
GROUP BY DATE_TRUNC('day', date_day)

Error:
Query 1 ERROR at Line 1: : ERROR: Column date_trunc has Arrow data type Date32 but is mapped to the BuiltIn(TIMESTAMPOID) type in Postgres, which are incompatible. If you believe this conversion should be supported, please submit a request at https://github.com/paradedb/paradedb/issues.

Since DATE_TRUNC() is fundamental for many analyses, whenever you want to group by day, week, month, or year, it would be awesome if this could be fixed somehow.
Tools like Metabase often let users choose the granularity of an analysis.

To Reproduce

Do a GROUP BY on DATE_TRUNC(). It happens with date and timestamp columns and also for 'week' etc.
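
Until this is fixed, one possible workaround (a sketch, untested against 0.8.3; it assumes the explicit cast survives the DuckDB pushdown and resolves the Arrow-to-Postgres type mapping) would be to cast the DATE_TRUNC() result:

```sql
-- Hypothetical workaround: cast the Date32 result to a plain date so the
-- Arrow-to-Postgres type mapping is unambiguous. Untested sketch.
SELECT DATE_TRUNC('day', date_day)::date AS day, AVG("queryLatency")
FROM predictions
GROUP BY day;
```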

OS:

Ubuntu(?) with PostgreSQL 16.3

ParadeDB Version:

0.8.3

Full Name:

Janis

Affiliation:

Typewise

What is the latest build you tested with? If possible, we recommend testing by compiling the latest dev branch.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include the code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configurations (e.g., CPU architecture, PostgreSQL version, Linux distribution) to reproduce the issue?

  • Yes, I have
@philippemnoel philippemnoel added the bug Something isn't working label Jul 13, 2024
@philippemnoel (Collaborator)

Thank you for reporting, we'll get this fixed ASAP.

@Weijun-H (Contributor) commented Aug 6, 2024

Hi @BigBerny, could you provide more details for the reproduction? I am digging into this case but have found no error.

pg_lakehouse=# CREATE TEMP TABLE predictions (
  date_day DATE,
  queryLatency INT
);
CREATE TABLE

pg_lakehouse=# INSERT INTO predictions (date_day, queryLatency) VALUES
  ('2023-03-01', 100),
  ('2023-03-01', 150),
  ('2023-03-01', 200),
  ('2023-03-02', 50),
  ('2023-03-02', 100),
  ('2023-03-03', 250),
  ('2023-03-03', 300);
INSERT 0 7

pg_lakehouse=# SELECT DATE_TRUNC('day', date_day), AVG(queryLatency) 
FROM predictions 
GROUP BY DATE_TRUNC('day', date_day);
       date_trunc       |         avg          
------------------------+----------------------
 2023-03-03 00:00:00+08 | 275.0000000000000000
 2023-03-01 00:00:00+08 | 150.0000000000000000
 2023-03-02 00:00:00+08 |  75.0000000000000000
(3 rows)

@evanxg852000 (Contributor)

@Weijun-H Unless I missed something, that temp table sits directly in PostgreSQL instead of in pg_analytics' DuckDB. I used your example to create a parquet file, loaded it, and got the same error as reported.

-- LOAD PARQUET
CREATE FOREIGN DATA WRAPPER parquet_wrapper
HANDLER parquet_fdw_handler
VALIDATOR parquet_fdw_validator;

CREATE SERVER parquet_server
FOREIGN DATA WRAPPER parquet_wrapper;

CREATE FOREIGN TABLE latencies ()
SERVER parquet_server
OPTIONS (files '/datasets/latencies.parquet');

SELECT DATE_TRUNC('day', date_day), AVG(query_latency) 
FROM latencies 
GROUP BY DATE_TRUNC('day', date_day);
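
For completeness, the parquet file referenced above could be produced from Weijun-H's sample rows, e.g. in the DuckDB CLI (a sketch; the path /datasets/latencies.parquet and the column name query_latency are taken from the query above):

```sql
-- DuckDB: build the sample dataset and write it out as parquet (sketch).
CREATE TABLE latencies (date_day DATE, query_latency INT);
INSERT INTO latencies VALUES
  ('2023-03-01', 100), ('2023-03-01', 150), ('2023-03-01', 200),
  ('2023-03-02',  50), ('2023-03-02', 100),
  ('2023-03-03', 250), ('2023-03-03', 300);
COPY latencies TO '/datasets/latencies.parquet' (FORMAT PARQUET);
```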

@BigBerny (Author) commented Aug 6, 2024

I used parquet files read by pg_lakehouse/DuckDB, as @evanxg852000 said.

@philippemnoel (Collaborator)

@evanxg852000 I've reverted this PR, as the tests were not passing. We simply hadn't noticed that the CI was not running.

@philippemnoel philippemnoel added priority-high High priority issue and removed priority-medium Medium priority issue labels Aug 23, 2024