[Format] Define more generic Interval logical type #15623

Closed
asfimport opened this issue Aug 23, 2016 · 8 comments

Comments

@asfimport

asfimport commented Aug 23, 2016

Per discussion in e7e399d#commitcomment-18711366, we can create an Interval type with a unit to be more general.

Reporter: Wes McKinney / @wesm
Assignee: Julien Le Dem / @julienledem

Related issues:

PRs and other links:

Note: This issue was originally created as ARROW-270. Please see the migration documentation for further details.

Uwe Korn / @xhochy:
Given that there is discussion about changing the timestamp type to be based on Decimals, with the unit expressed through the scale, shouldn't we do the same for the Interval type?

Jacques Nadeau / @jacques-n:
IntervalUnit seems fine to me.

As far as timestamp/decimal, I'm not inclined to change. I think most of the processing engines and storage formats that we work with use epoch in either millis, micros or nanos.
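For illustration, here is a minimal sketch of what "epoch in millis, micros or nanos" means as plain integers. The chosen instant (midnight UTC on the day this issue was opened) and all variable names are assumptions made for the example, not part of any Arrow definition.

```python
from datetime import datetime, timezone

# An arbitrary instant chosen for illustration only.
instant = datetime(2016, 8, 23, tzinfo=timezone.utc)
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Whole seconds since the epoch, using exact integer arithmetic.
delta = instant - epoch
seconds = delta.days * 86400 + delta.seconds

# The same instant expressed in three common fixed units.
epoch_millis = seconds * 10**3
epoch_micros = seconds * 10**6
epoch_nanos = seconds * 10**9

# Even at nanosecond resolution the value fits in a signed 64-bit integer.
INT64_MAX = 2**63 - 1
fits_in_int64 = epoch_nanos <= INT64_MAX

print(epoch_millis, epoch_micros, epoch_nanos, fits_in_int64)
```

Only the unit changes between the three values; the storage is a single integer in every case, which is why a unit field alone is enough to distinguish them.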

Wes McKinney / @wesm:
Some other systems define an absolute "timedelta" type consisting of a particular number of days, seconds, milliseconds, microseconds, etc. The unit is fixed, and the timedelta is stored as an int64:

In [9]: import pandas as pd

In [10]: ts = pd.Timedelta(1000, unit='s')

In [11]: ts
Out[11]: Timedelta('0 days 00:16:40')

In [12]: ts.seconds
Out[12]: 1000

In [13]: ts.asm8 # Internal representation
Out[13]: numpy.timedelta64(1000000000000,'ns')

What do you think about this kind of data (it would share the same absolute time units as timestamp, basically)?

Jacques Nadeau / @jacques-n:
This matches DAY_TIME, I believe. The difference is that we are currently fixed to four bytes, right?

Wes McKinney / @wesm:
Right – some SQL implementations only have hours/minutes/seconds. Based on the unit metadata (which we need to add, I suppose), the storage could be int32 or int64.
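One way to picture the int32-versus-int64 choice is the sketch below. It assumes the interval is bounded (here by 24 hours, in the spirit of an hours/minutes/seconds type) and picks the narrowest signed integer that can hold that bound in the given unit. The unit names, the `storage_type` helper, and the 24-hour bound are all hypothetical and are not taken from the Arrow format.

```python
# Hypothetical unit names mapped to the number of units in one second.
UNITS_PER_SECOND = {"s": 1, "ms": 10**3, "us": 10**6, "ns": 10**9}

INT32_MAX = 2**31 - 1
SECONDS_PER_DAY = 24 * 60 * 60


def storage_type(unit, max_seconds=SECONDS_PER_DAY):
    """Return "int32" if `max_seconds` expressed in `unit` fits in a signed
    32-bit integer, otherwise "int64". The bound is an assumption of this
    sketch, not something defined by the thread."""
    max_value = max_seconds * UNITS_PER_SECOND[unit]
    return "int32" if max_value <= INT32_MAX else "int64"


widths = {unit: storage_type(unit) for unit in UNITS_PER_SECOND}
print(widths)
```

Under these assumptions, seconds and milliseconds fit in 32 bits while microseconds and nanoseconds need 64, since one day is 86,400,000 ms but 86,400,000,000 µs.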

Julien Le Dem / @julienledem:
PR: #144

Julien Le Dem / @julienledem:
Issue resolved by pull request 144
#144
