Closed
Labels
bug (Something isn't working)
Description
Describe the bug
The documentation describes datafusion.execution.parquet.coerce_int96 as follows:
"If true, parquet reader will read columns of physical type int96 as originating from a different resolution than nanosecond. This is useful for reading data from systems like Spark which stores microsecond resolution timestamps in an int96 allowing it to write values with a larger date range than 64-bit timestamps with nanosecond resolution."
However, when I set this option to ms, the column type is still reported as Timestamp(Nanosecond, None).
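For context on the "larger date range" mentioned in the option description, here is a rough sketch of the arithmetic, assuming timestamps are stored as signed 64-bit offsets from the Unix epoch. The helper name approx_years_each_side is made up for illustration and is not part of DataFusion.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
I64_MIN = -(2**63)
I64_MAX = 2**63 - 1
SECONDS_PER_YEAR = 365.2425 * 86400  # average Gregorian year


def approx_years_each_side(units_per_second: int) -> float:
    """Approximate span (in years) reachable on either side of the epoch
    by a signed 64-bit timestamp with the given resolution."""
    return I64_MAX / units_per_second / SECONDS_PER_YEAR


# Nanosecond resolution: only about 292 years before/after 1970.
ns_years = approx_years_each_side(1_000_000_000)
# Microsecond resolution: roughly a thousand times more.
us_years = approx_years_each_side(1_000_000)

# Concrete bounds for nanosecond timestamps. Python datetimes only carry
# microseconds, so the nanosecond count is floor-divided by 1000 first.
ns_lower = EPOCH + timedelta(microseconds=I64_MIN // 1000)
ns_upper = EPOCH + timedelta(microseconds=I64_MAX // 1000)

print(f"ns: ~{ns_years:.0f} years each side, {ns_lower.date()} to {ns_upper.date()}")
print(f"us: ~{us_years:.0f} years each side")
```

A timestamp outside the nanosecond bounds cannot be represented as Timestamp(Nanosecond, None), which is why the reported type matters when reading such files.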
To Reproduce
-- Enable coercion of int96 to milliseconds (ms)
set datafusion.execution.parquet.coerce_int96 = ms;
-- Create external table
CREATE EXTERNAL TABLE int96_from_spark
STORED AS PARQUET
LOCATION 'parquet-testing/data/int96_from_spark.parquet';
-- Print schema
describe int96_from_spark;
Results in
+-------------+-----------------------------+-------------+
| column_name | data_type | is_nullable |
+-------------+-----------------------------+-------------+
| a | Timestamp(Nanosecond, None) | YES |
+-------------+-----------------------------+-------------+
1 row(s) fetched.
Elapsed 0.001 seconds.
Expected behavior
I expect the output type to match the configured unit: Timestamp(Millisecond, None) for ms (Timestamp(Microsecond, None) would correspond to us).
Additional context