-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API/ENH: tz_localize handling of nonexistent times: rename keyword + add shift option #22644
Conversation
Hello @mroeschke! Thanks for submitting the PR.
|
Codecov Report
@@ Coverage Diff @@
## master #22644 +/- ##
==========================================
+ Coverage 92.22% 92.22% +<.01%
==========================================
Files 169 169
Lines 50911 50922 +11
==========================================
+ Hits 46954 46965 +11
Misses 3957 3957
Continue to review full report at Codecov.
|
@jreback re: overlap between
So I would propose that eventually we can depreciate |
lgtm @jorisvandenbossche any more comments? |
@@ -565,6 +565,8 @@ class NaTType(_NaT): | |||
- 'raise' will raise an AmbiguousTimeError for an ambiguous time | |||
|
|||
nonexistent : 'shift', 'NaT', default 'raise' | |||
A nonexistent time doesn't not exist in a particular timezone | |||
where clocks moved forward due to DST. | |||
- 'shift' will shift the nonexistent time forward to the closest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, rst formatting nitpick: there needs to be a blank line between the first sentences, and the start of this list ... (getting rst right can be annoying ..)
Thanks @jorisvandenbossche. Added those blank lines for rendering. |
@pytest.mark.parametrize('tz', ['Europe/Warsaw', 'dateutil/Europe/Warsaw']) | ||
@pytest.mark.parametrize('method, exp', [ | ||
['shift', '2015-03-29 03:00:00'], | ||
['NaT', pd.NaT], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you have tests that exericse the assertion when you pass a nonexistent
keyword that is invalid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just added a test for an invalid nonexistent
keyword.
pandas/_libs/tslibs/timestamps.pyx
Outdated
@@ -978,14 +979,26 @@ class Timestamp(_Timestamp): | |||
- 'NaT' will return NaT for an ambiguous time | |||
- 'raise' will raise an AmbiguousTimeError for an ambiguous time | |||
|
|||
errors : 'raise', 'coerce', default 'raise' | |||
nonexistent : 'shift', 'NaT', default 'raise' | |||
A nonexistent time doesn't not exist in a particular timezone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nonexistent time doesn't not exist in a particular timezone | |
A nonexistent time does not exist in a particular timezone |
pandas/core/arrays/datetimes.py
Outdated
@@ -639,15 +639,27 @@ def tz_localize(self, tz, ambiguous='raise', errors='raise'): | |||
- 'raise' will raise an AmbiguousTimeError if there are ambiguous | |||
times | |||
|
|||
errors : {'raise', 'coerce'}, default 'raise' | |||
nonexistent : 'shift', 'NaT' default 'raise' | |||
A nonexistent time doesn't not exist in a particular timezone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nonexistent time doesn't not exist in a particular timezone | |
A nonexistent time does not exist in a particular timezone |
pandas/core/generic.py
Outdated
@@ -8659,6 +8659,17 @@ def tz_localize(self, tz, axis=0, level=None, copy=True, | |||
- 'NaT' will return NaT where there are ambiguous times | |||
- 'raise' will raise an AmbiguousTimeError if there are ambiguous | |||
times | |||
nonexistent : 'shift', 'NaT', default 'raise' | |||
A nonexistent time doesn't not exist in a particular timezone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nonexistent time doesn't not exist in a particular timezone | |
A nonexistent time does not exist in a particular timezone |
Thanks for catching that typo @jorisvandenbossche |
one more rebase and I think ok to go |
thanks @mroeschke |
…ndas * repo_org/master: (23 commits) DOC: Add docstring validations for "See Also" section (pandas-dev#23143) TST: Fix test assertion (pandas-dev#23357) BUG: Handle Period in combine (pandas-dev#23350) REF: SparseArray imports (pandas-dev#23329) CI: Migrate some CircleCI jobs to Azure (pandas-dev#22992) DOC: update the is_month_start/is_month_end docstring (pandas-dev#23051) Partialy fix issue pandas-dev#23334 - isort pandas/core/groupby directory (pandas-dev#23341) TST: Add base test for extensionarray setitem pandas-dev#23300 (pandas-dev#23304) API: Add sparse Acessor (pandas-dev#23183) PERF: speed up CategoricalIndex.get_loc (pandas-dev#23235) fix and test incorrect case in delta_to_nanoseconds (pandas-dev#23302) BUG: Handle Datetimelike data in DataFrame.combine (pandas-dev#23317) TST: re-enable gbq tests (pandas-dev#23303) Switched references of App veyor to azure pipelines in the contributing CI section (pandas-dev#23311) isort imports-io (pandas-dev#23332) DOC: Added a Multi Index example for the Series.sum method (pandas-dev#23279) REF: Make PeriodArray an ExtensionArray (pandas-dev#22862) DOC: Added Examples for Series max (pandas-dev#23298) API/ENH: tz_localize handling of nonexistent times: rename keyword + add shift option (pandas-dev#22644) BUG: Let MultiIndex.set_levels accept any iterable (pandas-dev#23273) (pandas-dev#23291) ...
…xamples * repo_org/master: (83 commits) DOC: Add docstring validations for "See Also" section (pandas-dev#23143) TST: Fix test assertion (pandas-dev#23357) BUG: Handle Period in combine (pandas-dev#23350) REF: SparseArray imports (pandas-dev#23329) CI: Migrate some CircleCI jobs to Azure (pandas-dev#22992) DOC: update the is_month_start/is_month_end docstring (pandas-dev#23051) Partialy fix issue pandas-dev#23334 - isort pandas/core/groupby directory (pandas-dev#23341) TST: Add base test for extensionarray setitem pandas-dev#23300 (pandas-dev#23304) API: Add sparse Acessor (pandas-dev#23183) PERF: speed up CategoricalIndex.get_loc (pandas-dev#23235) fix and test incorrect case in delta_to_nanoseconds (pandas-dev#23302) BUG: Handle Datetimelike data in DataFrame.combine (pandas-dev#23317) TST: re-enable gbq tests (pandas-dev#23303) Switched references of App veyor to azure pipelines in the contributing CI section (pandas-dev#23311) isort imports-io (pandas-dev#23332) DOC: Added a Multi Index example for the Series.sum method (pandas-dev#23279) REF: Make PeriodArray an ExtensionArray (pandas-dev#22862) DOC: Added Examples for Series max (pandas-dev#23298) API/ENH: tz_localize handling of nonexistent times: rename keyword + add shift option (pandas-dev#22644) BUG: Let MultiIndex.set_levels accept any iterable (pandas-dev#23273) (pandas-dev#23291) ...
git diff upstream/master -u -- "*.py" | flake8 --diff
Currently, users do not have any control over nonexistent datetime handling when
tz_localize
ing like they do ambiguous times. This adds a new keywordnonexistent
totz_localize
so that users now can:'raise'
: Raise an error (default)'NaT'
: Replace nonexistent times with'NaT'
'shift'
: Shift nonexistent times forward to the closest existing time