Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lack of documentation clarity over datetime and time zone handling #218

Open
candlerb opened this issue Sep 6, 2024 · 1 comment
Open
Assignees

Comments

@candlerb
Copy link
Collaborator

candlerb commented Sep 6, 2024

Describe what's wrong
Using select to_time('foo','zone1') and select to_time('foo','zone2'), they both display as if they were the same local time foo, even though they are different times.

I had to refer to Clickhouse's documentation to understand what's going on.

How to reproduce

select to_time('6/9/2024 13:00','UTC') as t1, to_time('6/9/2024 13:00','Europe/London') as t2;
┌──────────────────────t1─┬──────────────────────t2─┐
│ 2024-06-09 13:00:00.000 │ 2024-06-09 13:00:00.000 │
└─────────────────────────┴─────────────────────────┘

They appear to be the same time. But they are not!

select to_time('6/9/2024 13:00','UTC') = to_time('6/9/2024 13:00','Europe/London');
┌─equals(to_time('6/9/2024 13:00', 'UTC'), to_time('6/9/2024 13:00', 'Europe/London'))─┐
│ false                                                                                │
└──────────────────────────────────────────────────────────────────────────────────────┘

select to_time('6/9/2024 13:00','UTC')::int as t1, to_time('6/9/2024 13:00','Europe/London')::int as t2;
┌─────────t1─┬─────────t2─┐
│ 1717938000 │ 1717934400 │
└────────────┴────────────┘

And they are clearly different if you store them to a stream and read them back:

create stream foo (id int, t datetime);
insert into foo (id, t) values (1, to_time('6/9/2024 13:00','UTC')), (2, to_time('6/9/2024 13:00','Europe/London'));
select id,t from table(foo);
┌─id─┬───────────────────t─┐
│  1 │ 2024-06-09 14:00:00 │
│  2 │ 2024-06-09 13:00:00 │
└────┴─────────────────────┘

Analysis

(I believe the following to be true, but please correct me if I'm wrong)

In Proton expressions, Datetime is not a single type, but a parameterized family of types: DateTime([timezone]). The parameter is optional, in which case the system timezone is implied.

select to_type_name(to_time('6/9/2024 13:00','UTC')) as tp1, to_type_name(to_time('6/9/2024 13:00','Europe/London')) as tp2;
┌─tp1──────────────────┬─tp2────────────────────────────┐
│ datetime64(3, 'UTC') │ datetime64(3, 'Europe/London') │
└──────────────────────┴────────────────────────────────┘

That is, the function to_time(...) returns a value of a dynamic type, depending on the timezone argument. Since select only shows the value and not the type, the timezone information is hidden from the user.

The fact that Datetime is a parameterized type is documented for Clickhouse.

But this is not documented for Proton as far as I can see. In the SQL Reference > Data Types page, it just shows:

image

However, the representation once the value is stored in the database is different (again, according to Clickhouse documentation). A Datetime column stores an absolute timestamp, in Unix seconds past epoch, as a 32-bit number; the timezone is calculated away. Hence when you read it back out, it gets displayed in whatever timezone is the default for the system (or the timezone selected in the type definition of the column).

select id,t,to_type_name(t) from table(foo);
┌─id─┬───────────────────t─┬─to_type_name(t)─┐
│  1 │ 2024-06-09 14:00:00 │ datetime        │
│  2 │ 2024-06-09 13:00:00 │ datetime        │
└────┴─────────────────────┴─────────────────┘

In this case, as no timezone was given in the column definition, I get just a plain datetime and not the more specific datetime('Europe/London') - although implicitly it must have used the system timezone when converting from the stored int32 to this representation.

I think that storing the absolute Unix time in the database is a good choice. Hence I would says it's just a documentation bug that this isn't made clear (especially since it differs from most other SQL databases)

Additional context
I am using proton 1.5.17 installed from homebrew under macOS 14.6.1. The system timezone is Europe/London

@jovezhong jovezhong self-assigned this Sep 6, 2024
@jovezhong
Copy link
Contributor

cross link, this is also discussed in timeplus-io/proton#835

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants