Rounding valid timestamps near daylight savings jumps should not throw NonExistentTimeError #23324

Closed
louispotok opened this issue Oct 24, 2018 · 3 comments · Fixed by #23406
Labels
Bug Datetime Datetime data dtype Timezones Timezone data dtype

@louispotok
Contributor

Code Sample, a copy-pastable example if possible

import pandas as pd

# valid time rounds incorrectly
pd.Timestamp('2018-03-11 01:59:00-0600', tz='America/Chicago').round(freq='5min')
# throws NonExistentTimeError: 2018-03-11 02:00:00
# should return pd.Timestamp('2018-03-11 03:00:00-0500', tz='America/Chicago')

Problem description

The timestamp is a valid time, so rounding it should yield another valid timestamp. Specifically, it should round to the nearest valid time on the 5min grid, which is 03:00, the moment right after the clock jumps forward.

Expected Output

pd.Timestamp('2018-03-11 03:00:00-0500', tz='America/Chicago')

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-1070-aws
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.22.0
pytest: None
pip: 18.1
setuptools: 39.2.0
Cython: None
numpy: 1.14.0
scipy: 1.1.0
pyarrow: 0.9.0
xarray: None
IPython: 6.5.0
sphinx: None
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.6
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: 0.4.0
matplotlib: 2.1.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: 1.2.10
pymysql: None
psycopg2: 2.7.5 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@mroeschke
Member

This error is raised when the rounded value is relocalized, since rounding happens in local time. After #22644 is in, we can introduce a nonexistent keyword to handle this behavior, like we did with ambiguous in #22647.
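A minimal sketch of the mechanism described in this comment, assuming the rounding is done on the naive wall-clock value and the result is then localized again. The variable names are illustrative and are not pandas internals. The broad `except Exception` is deliberate, since the concrete exception class depends on the pandas version.

```python
import pandas as pd

ts = pd.Timestamp('2018-03-11 01:59:00', tz='America/Chicago')

# Drop the timezone to get the naive wall-clock time (01:59).
wall = ts.tz_localize(None)

# Rounding the wall-clock time lands on 02:00, inside the DST gap.
rounded_wall = wall.round('5min')

# Re-attaching the timezone fails because 02:00 never occurred that day.
try:
    rounded_wall.tz_localize('America/Chicago')
    relocalize_failed = False
except Exception as exc:
    relocalize_failed = True
    error_name = type(exc).__name__
```

The failure comes from the last step only: the input and the naive rounded value are both well-formed, but the rounded wall-clock time has no counterpart in America/Chicago on that date.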

@mroeschke mroeschke added Bug Datetime Datetime data dtype Timezones Timezone data dtype labels Oct 25, 2018
@louispotok
Contributor Author

Thanks for the quick response! Agreed that would solve this, but from a design perspective I don't think I understand why that would be the right way to handle it.

What's the use case where you are trying to round a valid time and you would WANT to raise a NonExistentTimeError? Or even deeper, what would that mean? If you're working within valid times, rounding should always be an allowed operation (to another valid time), right?
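One way to picture the expectation in this comment is to round on the absolute timeline instead of the local wall clock. The sketch below is a workaround under that assumption, not what pandas does internally: it converts to UTC, rounds there, and converts back. It matches local rounding only for frequencies that divide the UTC offset evenly (such as 5min here); for something like a daily frequency the two approaches give different answers.

```python
import pandas as pd

ts = pd.Timestamp('2018-03-11 01:59:00', tz='America/Chicago')

# 01:59 CST is 07:59 UTC; UTC has no DST gap, so rounding there is always valid.
rounded_utc = ts.tz_convert('UTC').round('5min')

# Converting 08:00 UTC back to Chicago time gives 03:00 CDT, just after the jump.
rounded_local = rounded_utc.tz_convert('America/Chicago')
```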

@mroeschke
Member

Well, operationally, rounding '2018-03-11 01:59:00-0600' to the closest 5min would yield '2018-03-11 02:00:00', which does not exist, hence NonExistentTimeError could be a valid output.

It is inconvenient that this is the only output though, so giving the user the ability to 'raise', replace with 'NaT', or 'shift' to 3:00 (which is what you want) would be nice, instead of pandas dictating what the correct behavior should be.
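A sketch of what those three choices look like in use, assuming a pandas release where `Timestamp.round` accepts a `nonexistent` argument (0.24 or later, to my knowledge). In those releases the forward shift is spelled 'shift_forward' rather than the 'shift' used in this comment. The broad `except Exception` is used because the exception class raised for a nonexistent time differs between pandas versions.

```python
import pandas as pd

ts = pd.Timestamp('2018-03-11 01:59:00', tz='America/Chicago')

# Move the nonexistent 02:00 forward to the first valid instant, 03:00 CDT.
shifted = ts.round('5min', nonexistent='shift_forward')

# Replace the nonexistent result with a missing value instead.
as_nat = ts.round('5min', nonexistent='NaT')

# The default ('raise') keeps the behavior reported in this issue.
try:
    ts.round('5min', nonexistent='raise')
    raised = False
except Exception:
    raised = True
```

Keeping 'raise' as the default leaves existing behavior unchanged, while the other two values let the caller pick a policy explicitly.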
