pandas.tslib.normalize_date() can put Timestamp into inconsistent state #10663

Closed
zbak opened this issue Jul 23, 2015 · 5 comments
Labels
Bug Timezones Timezone data dtype

Comments

@zbak

zbak commented Jul 23, 2015

Code and Error description

If I normalize a date that falls on a daylight saving time transition day, normalize_date() can put the Timestamp object into an inconsistent state.

>>> import pandas as pd
>>> original_midnight = pd.Timestamp('20121104', tz='US/Eastern')
>>> original_midday = pd.Timestamp('20121104T120000', tz='US/Eastern')

>>> str(pd.tslib.normalize_date(original_midday))
'2012-11-04 00:00:00-05:00'

>>> str(original_midnight)
'2012-11-04 00:00:00-04:00'

>>> pd.tslib.normalize_date(original_midday) == original_midnight
False

>>> pd.tslib.normalize_date(original_midday).tzinfo
<DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>
>>> original_midnight.tzinfo
<DstTzInfo 'US/Eastern' EDT-1 day, 20:00:00 DST>

According to the implementation, it only replaces the time part with zeroes, so the result keeps the tzinfo (EST here) of the original timestamp.

I believe either the replace should be timezone-aware or the function should look like this:

if PyDateTime_Check(dt):
    return pd.Timestamp(dt.date(), tz=dt.tz)
elif PyDate_Check(dt):
    return datetime(dt.year, dt.month, dt.day)
else:
    raise TypeError('Unrecognized type: %s' % type(dt))
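
For illustration, the offset mismatch can be reproduced with the standard library alone. This is a minimal sketch, not the pandas implementation: the EDT and EST objects are stand-ins (an assumption) for the fixed-offset tzinfo instances that pytz attaches to US/Eastern datetimes, and zeroing the time fields with replace() is assumed to mirror what normalize_date() does.

```python
from datetime import datetime, timedelta, timezone

# Stand-ins (assumption) for the two fixed-offset tzinfo objects
# that pytz uses for US/Eastern around the transition.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# Noon on 2012-11-04 falls after the DST transition, so it carries EST.
midday = datetime(2012, 11, 4, 12, 0, tzinfo=EST)

# Zeroing the time fields leaves the EST tzinfo untouched.
zeroed = midday.replace(hour=0, minute=0, second=0, microsecond=0)

# The real local midnight falls before the transition, so it carries EDT.
true_midnight = datetime(2012, 11, 4, 0, 0, tzinfo=EDT)

print(zeroed.isoformat())         # 2012-11-04T00:00:00-05:00
print(true_midnight.isoformat())  # 2012-11-04T00:00:00-04:00
print(zeroed == true_midnight)    # False
print(zeroed - true_midnight)     # 1:00:00
```

Because replace() never revisits the tzinfo, the zeroed value lands one hour after the real local midnight.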

INSTALLED VERSIONS


commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.16.0-44-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.16.2
nose: 1.3.0
Cython: 0.22.1
numpy: 1.9.2
scipy: 0.15.1
statsmodels: 0.6.1
IPython: 3.2.0
sphinx: None
patsy: 0.2.1
dateutil: 2.4.2
pytz: 2015.4
bottleneck: 0.8.0
tables: 3.0.0
numexpr: 2.3
matplotlib: 1.4.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None

@jreback
Contributor

jreback commented Jul 23, 2015

This was discussed in #7825, but this is not technically a dupe (as it's slightly different). But I'll let some of the DST experts weigh in (can you believe we have this many people who 'discuss' timezone transitions!)

cc @rockg
cc @sinhrks
cc @adamgreenhall
cc @ischwabacher

jreback added Bug Timezones Timezone data dtype labels Jul 23, 2015
jreback added this to the Next Major Release milestone Jul 23, 2015
@jreback
Contributor

jreback commented Jul 23, 2015

also would appreciate someone going through the tracker and creating a master issue (just list the issues with checkboxes) for all of the timestamp normalization issues, and I'll tag it.

@rockg
Contributor

rockg commented Jul 24, 2015

@zbak Use the normalize() method on the Timestamp itself for what you are doing. It was added specifically to fix this.
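
A minimal sketch of that suggestion, reusing the dates from the original report. It assumes a pandas version in which Timestamp.normalize() is available and works in local wall-clock time; the variable names are only for illustration.

```python
import pandas as pd

midday = pd.Timestamp("2012-11-04 12:00", tz="US/Eastern")
midnight = pd.Timestamp("2012-11-04", tz="US/Eastern")

# normalize() resets the time to local midnight and keeps the timezone,
# so the result should carry the offset that applies at midnight.
normalized = midday.normalize()

print(normalized)
print(normalized == midnight)
```

If the method behaves as described, the normalized value compares equal to the midnight Timestamp built directly, which the normalize_date() call in the report did not.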

@zbak
Author

zbak commented Jul 24, 2015

Thank you very much

jreback closed this as completed Jul 24, 2015
@jreback
Contributor

jreback commented Jul 24, 2015

still would love someone to create a master issue!


3 participants