[Bug][Spark]: Spark writes timestamp values to parquet shifted to UTC+0. #686
Comments
The Spark connector writes the value to the file as (value - 8h). It seems Spark handles timestamp-without-zone by parsing the value as timestamp-with-zone in the configured Spark time zone, and then writes the data to the file as a UTC timestamp, so the written value ends up as (value - 8h).
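A minimal Spark SQL sketch of the behavior described above (the literal and the Asia/Shanghai zone are assumptions, not taken from the report); it shows Spark treating a zone-less literal as an instant in the session time zone:

```sql
-- Assume the session time zone is UTC+8 (e.g. Asia/Shanghai).
SET spark.sql.session.timeZone=Asia/Shanghai;

-- Spark parses the zone-less literal in the session time zone and keeps the
-- instant internally as UTC microseconds, i.e. 2022-07-03 11:11:00 UTC.
SELECT timestamp('2022-07-03 19:11:00') AS ts;

-- A writer that dumps the internal UTC value without converting back to the
-- local zone produces the observed (value - 8h) in the parquet file.
```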
It seems this is not only a problem for Arctic tables. For Iceberg tables, in my test, when Flink writes the table, the time read back by Spark is also shifted, appearing as (value + 8h).
Add the following configuration to spark-defaults.conf: `spark.driver.extraJavaOptions -Duser.timezone=UTC`. Making sure the same time zone is used for both writing and reading can temporarily work around this problem.
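For reference, the workaround above in spark-defaults.conf would look roughly like this (the executor option is an added assumption; the comment only mentions the driver side):

```
spark.driver.extraJavaOptions   -Duser.timezone=UTC
spark.executor.extraJavaOptions -Duser.timezone=UTC
```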
@hellojinsilei
Flink cannot write the timestamp type with a time zone. @lklhdu Can we do that?
In my understanding, TIMESTAMP in Flink is a type without a time zone attribute; if we need a time zone for the field, we should use TIMESTAMP WITH LOCAL TIME ZONE.
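A minimal Flink SQL sketch of that distinction (table and column names are hypothetical, and the connector clause is omitted):

```sql
CREATE TABLE t_demo (
  id     INT,
  -- no time zone attribute: the value is stored and read back literally
  ts_ntz TIMESTAMP(6),
  -- an instant: rendered in the session time zone on read
  ts_ltz TIMESTAMP(6) WITH LOCAL TIME ZONE
);
```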
@zhoujinsong The table was created from the terminal via Spark SQL. I think it should work like this: for a Hive-adapted table, the default semantics of the Spark timestamp field type should be timestamp without time zone; for a non-Hive-adapted table, the default should be timestamp with time zone.
What happened?
When the table schema contains a field of type timestamp without time zone, the value the Spark engine writes to the parquet file is the UTC+0 value, but it should be the value in the current time zone.
Affects Versions
0.4.0
What engines are you seeing the problem on?
Spark
How to reproduce
Checked:
- select via Flink
- select via Spark
- table schema
- table files

It seems to be a Spark problem.
Relevant log output
No response
Anything else
For the SQL `insert into test_db.test_table values (7, 'randy', timestamp('2022-07-03 19:11:00'));`:

The data written by the Flink connector: (screenshot)
The data written by the Spark connector: (screenshot)
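One way to make the shift visible (not from the original report; a sketch assuming a UTC+8 local zone) is to read the row back under different session time zones and compare the rendered values:

```sql
-- Read the row back with the local session time zone.
SET spark.sql.session.timeZone=Asia/Shanghai;
SELECT * FROM test_db.test_table;

-- Read it again with a UTC session time zone; if the rendered timestamp changes
-- by 8 hours, Spark is treating the column as an instant (timestamp with zone)
-- rather than as a literal timestamp-without-zone value.
SET spark.sql.session.timeZone=UTC;
SELECT * FROM test_db.test_table;
```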