ENH: Timestamp/DTI. to epoch time #14772

Open
jreback opened this issue Nov 30, 2016 · 13 comments
Labels
Datetime (Datetime data dtype), Enhancement

Comments

@jreback
Contributor

jreback commented Nov 30, 2016

Add a .to_epoch(unit='s') method to Timestamp and DatetimeIndex that returns the time elapsed since the epoch in that unit. I think we would default this to s, as that seems pretty common, but allow any of our units.

In [19]: s = Series(pd.date_range('20160101',periods=3))

In [20]: s
Out[20]: 
0   2016-01-01
1   2016-01-02
2   2016-01-03
dtype: datetime64[ns]

In [21]: ((s-Timestamp(0)) / Timedelta('1s')).astype('i8')
Out[21]: 
0    1451606400
1    1451692800
2    1451779200
dtype: int64
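
A minimal sketch of what such a method might compute, written as a free function. The name `to_epoch` and its `unit` parameter follow the proposal above and are hypothetical; nothing here is an existing pandas API beyond the arithmetic it wraps. It assumes tz-naive input with no missing values.

```python
import pandas as pd


def to_epoch(values, unit="s"):
    """Hypothetical helper: whole `unit`s elapsed since 1970-01-01.

    Assumes tz-naive datetimes and no NaT values.
    """
    epoch = pd.Timestamp(0)  # 1970-01-01 00:00:00
    # Floor division by a one-unit Timedelta yields whole counts of that unit.
    return (values - epoch) // pd.Timedelta(1, unit=unit)


s = pd.Series(pd.date_range("20160101", periods=3))
print(to_epoch(s).tolist())  # [1451606400, 1451692800, 1451779200]
print(to_epoch(s, unit="ms").tolist())
```

Floor division avoids the float intermediate that `/` followed by `.astype('i8')` goes through.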

@jreback added the API Design, Needs Discussion, and Datetime labels Nov 30, 2016
@jreback
Contributor Author

jreback commented Nov 30, 2016

xref #11022, #6741

@jreback
Contributor Author

jreback commented Nov 30, 2016

This also works, but it exposes the internal implementation, is verbose, and is not user friendly.

In [36]: Series(s.values.astype('datetime64[s]').astype('i8'), index=s.index)
Out[36]: 
0    1451606400
1    1451692800
2    1451779200
dtype: int64

@jorisvandenbossche
Member

If we add user-facing functionality, I think I would like to_epoch() most (certainly not something like int64[s], IMO).

@jreback added this to the Next Major Release milestone Mar 8, 2017
@jreback changed the title from "ENH/DOC: Timestamp/DTI to epoch time" to "ENH: Timestamp/DTI. to epoch time" Mar 8, 2017
@jreback
Contributor Author

jreback commented Mar 8, 2017

I changed this to make this an enhancement for a simple .to_epoch() method on Timestamp/DTI.

@jreback added the Enhancement label and removed the Needs Discussion label Mar 8, 2017
@jbrockmendel
Member

Since Timestamp now has a timestamp() method, should we use the same name for DTI?

@jreback
Contributor Author

jreback commented Oct 31, 2017

Yes, this would be reasonable (though to be honest, the .timestamp() name is not very informative).

@jorisvandenbossche
Member

I also don't really like the name. It is rather confusing given that we already have a Timestamp class (for Timestamp itself it is OK to keep it, for consistency as a datetime subclass). So when adding such a method to DatetimeIndex / the dt accessor, I would consider not using the same name.

@jreback
Contributor Author

jreback commented Nov 1, 2017

I am partial to to_epoch; we use this term elsewhere.

@TomAugspurger
Contributor

TomAugspurger commented May 14, 2018

One question: what to do with NaTs?

In [5]: pd.DatetimeIndex(['2017', '2018', None]).values.astype('datetime64[s]').astype("i8")
Out[5]: array([          1483228800,           1514764800, -9223372036854775808])

Do we value having integer dtype more? I think so in this case.
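
To make the trade-off concrete, here is a small sketch of the two plain-NumPy outcomes: keep int64 and let the NaT sentinel leak through, or fall back to float64 with NaN. Full date strings are used instead of the bare years above purely to keep parsing unambiguous; that substitution is an assumption of this sketch.

```python
import numpy as np
import pandas as pd

dti = pd.DatetimeIndex(["2017-01-01", "2018-01-01", None])
seconds = dti.values.astype("datetime64[s]").astype("i8")

# NaT is stored as the minimum int64, so it survives the cast as a sentinel.
sentinel = np.iinfo(np.int64).min
print(seconds[-1] == sentinel)

# Alternative: give up the integer dtype and mark missing entries as NaN.
as_float = np.where(dti.isna(), np.nan, seconds.astype("f8"))
print(as_float)
```

Keeping int64 means callers must know about the sentinel; switching to float64 makes missing values explicit at the cost of the integer dtype.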

@TomAugspurger
Contributor

TomAugspurger commented May 14, 2018

Second question: timezones. Unix time is defined in UTC, so should we

  • assume that TZ-naive data is UTC, and just localize
  • convert TZ-aware data to UTC

This will necessitate an ambiguous parameter.
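
The two options could be sketched as a normalization step that runs before the epoch arithmetic. The helper name `to_utc` is hypothetical, and the choice to treat naive data as UTC is exactly the assumption under discussion, not settled behavior.

```python
import pandas as pd


def to_utc(dti):
    """Hypothetical normalization step before computing epoch values."""
    if dti.tz is None:
        # Option 1: treat naive wall times as already being in UTC.
        return dti.tz_localize("UTC")
    # Option 2: convert aware data to UTC (the instant itself is unchanged).
    return dti.tz_convert("UTC")


naive = pd.DatetimeIndex(["2017-01-01 00:00:00"])
aware = naive.tz_localize("America/New_York")

epoch = pd.Timestamp(0, tz="UTC")
one_second = pd.Timedelta("1s")
naive_epoch = ((to_utc(naive) - epoch) // one_second).tolist()
aware_epoch = ((to_utc(aware) - epoch) // one_second).tolist()
print(naive_epoch)  # [1483228800]
print(aware_epoch)  # [1483246800]
```

The same wall time yields values five hours (18000 seconds) apart, which is why the treatment of naive data has to be an explicit decision.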

@TomAugspurger
Contributor

Third question: how to handle higher-precision components?

In [6]: pd.DatetimeIndex(['2017-01-01T00:00:00.01', '2017-01-01T00:00:00.02']).to_epoch()
Out[6]: array([1483228800, 1483228800])

I don't think we should use floats and fractional components. So that leaves truncating or rounding to the nearest unit.
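
The two integer-only choices can be compared with existing methods, `DatetimeIndex.floor` and `DatetimeIndex.round`, applied before dividing. This is only an illustration of the difference; the second input value is changed to `.60` (an assumption of this sketch) so that the two strategies disagree.

```python
import pandas as pd

dti = pd.DatetimeIndex(["2017-01-01T00:00:00.01", "2017-01-01T00:00:00.60"])
epoch = pd.Timestamp(0)
one_second = pd.Timedelta("1s")

# Truncate: drop the sub-second component.
truncated = ((dti.floor("s") - epoch) // one_second).tolist()
# Round: move to the nearest whole second first.
rounded = ((dti.round("s") - epoch) // one_second).tolist()

print(truncated)  # [1483228800, 1483228800]
print(rounded)    # [1483228800, 1483228801]
```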

@simonjayhawkins
Member

One question: what to do with NaTs?

In [5]: pd.DatetimeIndex(['2017', '2018', None]).values.astype('datetime64[s]').astype("i8")
Out[5]: array([          1483228800,           1514764800, -9223372036854775808])

Do we value having integer dtype more? I think so in this case.

Could an IntegerArray be returned in this case? It would need some casting, since currently

pd.array(pd.DatetimeIndex(['2017', '2018', None]).values.astype('datetime64[s]'), dtype='Int64')

raises

TypeError: datetime64[s] cannot be converted to an IntegerDtype
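
One possible form of that casting, as a sketch: compute the raw int64 values and the missing-value mask separately, then combine them with the `pd.arrays.IntegerArray(values, mask)` constructor. Full date strings replace the bare years to keep parsing unambiguous; that substitution is an assumption of this sketch, as is the idea that a `to_epoch` method would take this route.

```python
import pandas as pd

dti = pd.DatetimeIndex(["2017-01-01", "2018-01-01", None])

mask = dti.isna()  # True where the value is NaT
seconds = dti.values.astype("datetime64[s]").astype("i8")

# Pair the int64 values with the mask; masked slots display as <NA>.
result = pd.arrays.IntegerArray(seconds, mask)
print(result)
print(result.dtype)
```

The NaT sentinel is still present in the underlying buffer, but the mask hides it, so the result keeps an integer dtype while reporting the entry as missing.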

@jreback
Contributor Author

jreback commented Jul 12, 2019

Yes, there are various use cases where we could do things like this.


6 participants