ENH/BUG: Period and PeriodIndex ops supports timedelta-like #7966

Merged
merged 1 commit into from
Aug 10, 2014
44 changes: 43 additions & 1 deletion doc/source/timeseries.rst
@@ -4,7 +4,7 @@
.. ipython:: python
:suppress:

from datetime import datetime
from datetime import datetime, timedelta
import numpy as np
np.random.seed(123456)
from pandas import *
@@ -1098,6 +1098,36 @@ frequency.

p - 3

If the ``Period`` freq is daily or higher (``D``, ``H``, ``T``, ``S``, ``L``, ``U``, ``N``), ``offsets`` and ``timedelta``-like objects can be added, provided the result can be represented with the same freq. Otherwise, a ``ValueError`` is raised.

.. ipython:: python

p = Period('2014-07-01 09:00', freq='H')
p + Hour(2)
p + timedelta(minutes=120)
p + np.timedelta64(7200, 's')

.. code-block:: python

In [1]: p + Minute(5)
Traceback
...
ValueError: Input has different freq from Period(freq=H)
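
The "same freq" condition for these fixed-width frequencies can be read as a divisibility check on nanoseconds, which is how the ``period.py`` change in this PR tests it. The following is a minimal standalone sketch of that idea, not the pandas implementation; the table ``NANOS_PER_UNIT`` and the helper ``periods_in`` are hypothetical names introduced only for illustration.

```python
# Nanoseconds per unit for the fixed-width aliases listed above
# (hand-written table, assumed here for illustration only).
NANOS_PER_UNIT = {
    "D": 86400 * 10**9,  # day
    "H": 3600 * 10**9,   # hour
    "T": 60 * 10**9,     # minute
    "S": 10**9,          # second
    "L": 10**6,          # millisecond
    "U": 10**3,          # microsecond
    "N": 1,              # nanosecond
}


def periods_in(delta_nanos, freq):
    """Return how many whole periods of `freq` fit in `delta_nanos`.

    Raises ValueError when the delta is not a whole multiple of the unit.
    """
    unit = NANOS_PER_UNIT[freq]
    if delta_nanos % unit != 0:
        raise ValueError(
            "Input has different freq from Period(freq={0})".format(freq))
    return delta_nanos // unit


two_hours = 7200 * 10**9
print(periods_in(two_hours, "H"))  # 2
```

Under this reading, 120 minutes and 7200 seconds both shift an hourly period by two, while 5 minutes cannot be expressed in whole hours and is rejected.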

If the ``Period`` has any other freq, only offsets matching that freq can be added. Otherwise, a ``ValueError`` is raised.

.. ipython:: python

p = Period('2014-07', freq='M')
p + MonthEnd(3)

.. code-block:: python

In [1]: p + MonthBegin(3)
Traceback
...
ValueError: Input has different freq from Period(freq=M)
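
For non-fixed-width frequencies, the rule above amounts to comparing the offset's frequency alias with the period's freq and, on a match, shifting by the offset's count. A rough standalone sketch follows; the lookup table ``OFFSET_BASE_ALIAS`` and the function ``steps_for_offset`` are hypothetical and stand in for the alias resolution that the PR delegates to ``pandas.tseries.frequencies``.

```python
# Simplified, hand-written table (an assumption for illustration):
# offset class name -> the frequency alias it corresponds to.
OFFSET_BASE_ALIAS = {
    "MonthEnd": "M",
    "MonthBegin": "MS",
    "YearEnd": "A",
    "YearBegin": "AS",
}


def steps_for_offset(offset_name, n, period_freq):
    """Return n if the offset matches the period's freq, else raise ValueError."""
    if OFFSET_BASE_ALIAS.get(offset_name) != period_freq:
        raise ValueError(
            "Input has different freq from Period(freq={0})".format(period_freq))
    return n


print(steps_for_offset("MonthEnd", 3, "M"))  # 3
```

With this table, ``MonthBegin`` resolves to ``MS`` rather than ``M``, which is why it is rejected for a monthly period in the example above.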

Taking the difference of ``Period`` instances with the same frequency will
return the number of frequency units between them:

@@ -1129,6 +1159,18 @@ objects:
ps = Series(randn(len(prng)), prng)
ps

``PeriodIndex`` supports addition and subtraction under the same rules as ``Period``.

.. ipython:: python

idx = period_range('2014-07-01 09:00', periods=5, freq='H')
idx
idx + Hour(2)

idx = period_range('2014-07', periods=5, freq='M')
idx
idx + MonthEnd(3)

PeriodIndex Partial String Indexing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

15 changes: 15 additions & 0 deletions doc/source/v0.15.0.txt
@@ -271,10 +271,21 @@ Enhancements



- ``Period`` and ``PeriodIndex`` support addition/subtraction with ``timedelta``-likes (:issue:`7966`)

If the ``Period`` freq is ``D``, ``H``, ``T``, ``S``, ``L``, ``U``, ``N``, ``timedelta``-likes can be added as long as the result can have the same freq. Otherwise, only offsets matching the freq can be added.

.. ipython:: python

idx = pd.period_range('2014-07-01 09:00', periods=5, freq='H')
idx
idx + pd.offsets.Hour(2)
idx + timedelta(minutes=120)
idx + np.timedelta64(7200, 's')

idx = pd.period_range('2014-07', periods=5, freq='M')
idx
idx + pd.offsets.MonthEnd(3)



@@ -414,6 +425,10 @@ Bug Fixes



- Bug in ``Period`` and ``PeriodIndex`` addition/subtraction with ``np.timedelta64`` that resulted in incorrect internal representations (:issue:`7740`)






126 changes: 119 additions & 7 deletions pandas/tests/test_base.py
@@ -915,6 +915,8 @@ def test_resolution(self):
self.assertEqual(idx.resolution, expected)

def test_add_iadd(self):
tm._skip_if_not_numpy17_friendly()

# union
rng1 = pd.period_range('1/1/2000', freq='D', periods=5)
other1 = pd.period_range('1/6/2000', freq='D', periods=5)
@@ -968,11 +970,64 @@ def test_add_iadd(self):
tm.assert_index_equal(rng, expected)

# offset
for delta in [pd.offsets.Hour(2), timedelta(hours=2)]:
rng = pd.period_range('2000-01-01', '2000-02-01')
with tm.assertRaisesRegexp(TypeError, 'unsupported operand type\(s\)'):
# DateOffset
rng = pd.period_range('2014', '2024', freq='A')
result = rng + pd.offsets.YearEnd(5)
expected = pd.period_range('2019', '2029', freq='A')
tm.assert_index_equal(result, expected)
rng += pd.offsets.YearEnd(5)
tm.assert_index_equal(rng, expected)

for o in [pd.offsets.YearBegin(2), pd.offsets.MonthBegin(1), pd.offsets.Minute(),
np.timedelta64(365, 'D'), timedelta(365)]:
with tm.assertRaisesRegexp(ValueError, 'Input has different freq from Period'):
rng + o

rng = pd.period_range('2014-01', '2016-12', freq='M')
result = rng + pd.offsets.MonthEnd(5)
expected = pd.period_range('2014-06', '2017-05', freq='M')
tm.assert_index_equal(result, expected)
rng += pd.offsets.MonthEnd(5)
tm.assert_index_equal(rng, expected)

for o in [pd.offsets.YearBegin(2), pd.offsets.MonthBegin(1), pd.offsets.Minute(),
np.timedelta64(365, 'D'), timedelta(365)]:
rng = pd.period_range('2014-01', '2016-12', freq='M')
with tm.assertRaisesRegexp(ValueError, 'Input has different freq from Period'):
rng + o

# Tick
offsets = [pd.offsets.Day(3), timedelta(days=3), np.timedelta64(3, 'D'),
pd.offsets.Hour(72), timedelta(minutes=60*24*3), np.timedelta64(72, 'h')]
for delta in offsets:
rng = pd.period_range('2014-05-01', '2014-05-15', freq='D')
result = rng + delta
expected = pd.period_range('2014-05-04', '2014-05-18', freq='D')
tm.assert_index_equal(result, expected)
rng += delta
tm.assert_index_equal(rng, expected)

for o in [pd.offsets.YearBegin(2), pd.offsets.MonthBegin(1), pd.offsets.Minute(),
np.timedelta64(4, 'h'), timedelta(hours=23)]:
rng = pd.period_range('2014-05-01', '2014-05-15', freq='D')
with tm.assertRaisesRegexp(ValueError, 'Input has different freq from Period'):
rng + o

offsets = [pd.offsets.Hour(2), timedelta(hours=2), np.timedelta64(2, 'h'),
pd.offsets.Minute(120), timedelta(minutes=120), np.timedelta64(120, 'm')]
for delta in offsets:
rng = pd.period_range('2014-01-01 10:00', '2014-01-05 10:00', freq='H')
result = rng + delta
expected = pd.period_range('2014-01-01 12:00', '2014-01-05 12:00', freq='H')
tm.assert_index_equal(result, expected)
rng += delta
tm.assert_index_equal(rng, expected)

for delta in [pd.offsets.YearBegin(2), timedelta(minutes=30), np.timedelta64(30, 's')]:
rng = pd.period_range('2014-01-01 10:00', '2014-01-05 10:00', freq='H')
with tm.assertRaisesRegexp(ValueError, 'Input has different freq from Period'):
result = rng + delta
with tm.assertRaisesRegexp(TypeError, 'unsupported operand type\(s\)'):
with tm.assertRaisesRegexp(ValueError, 'Input has different freq from Period'):
rng += delta

# int
@@ -984,6 +1039,8 @@ def test_add_iadd(self):
tm.assert_index_equal(rng, expected)

def test_sub_isub(self):
tm._skip_if_not_numpy17_friendly()

# diff
rng1 = pd.period_range('1/1/2000', freq='D', periods=5)
other1 = pd.period_range('1/6/2000', freq='D', periods=5)
@@ -1027,10 +1084,65 @@ def test_sub_isub(self):
tm.assert_index_equal(rng, expected)

# offset
for delta in [pd.offsets.Hour(2), timedelta(hours=2)]:
with tm.assertRaisesRegexp(TypeError, 'unsupported operand type\(s\)'):
# DateOffset
rng = pd.period_range('2014', '2024', freq='A')
result = rng - pd.offsets.YearEnd(5)
expected = pd.period_range('2009', '2019', freq='A')
tm.assert_index_equal(result, expected)
rng -= pd.offsets.YearEnd(5)
tm.assert_index_equal(rng, expected)

for o in [pd.offsets.YearBegin(2), pd.offsets.MonthBegin(1), pd.offsets.Minute(),
np.timedelta64(365, 'D'), timedelta(365)]:
rng = pd.period_range('2014', '2024', freq='A')
with tm.assertRaisesRegexp(ValueError, 'Input has different freq from Period'):
rng - o

rng = pd.period_range('2014-01', '2016-12', freq='M')
result = rng - pd.offsets.MonthEnd(5)
expected = pd.period_range('2013-08', '2016-07', freq='M')
tm.assert_index_equal(result, expected)
rng -= pd.offsets.MonthEnd(5)
tm.assert_index_equal(rng, expected)

for o in [pd.offsets.YearBegin(2), pd.offsets.MonthBegin(1), pd.offsets.Minute(),
np.timedelta64(365, 'D'), timedelta(365)]:
rng = pd.period_range('2014-01', '2016-12', freq='M')
with tm.assertRaisesRegexp(ValueError, 'Input has different freq from Period'):
rng - o

# Tick
offsets = [pd.offsets.Day(3), timedelta(days=3), np.timedelta64(3, 'D'),
pd.offsets.Hour(72), timedelta(minutes=60*24*3), np.timedelta64(72, 'h')]
for delta in offsets:
rng = pd.period_range('2014-05-01', '2014-05-15', freq='D')
result = rng - delta
expected = pd.period_range('2014-04-28', '2014-05-12', freq='D')
tm.assert_index_equal(result, expected)
rng -= delta
tm.assert_index_equal(rng, expected)

for o in [pd.offsets.YearBegin(2), pd.offsets.MonthBegin(1), pd.offsets.Minute(),
np.timedelta64(4, 'h'), timedelta(hours=23)]:
rng = pd.period_range('2014-05-01', '2014-05-15', freq='D')
with tm.assertRaisesRegexp(ValueError, 'Input has different freq from Period'):
rng - o

offsets = [pd.offsets.Hour(2), timedelta(hours=2), np.timedelta64(2, 'h'),
pd.offsets.Minute(120), timedelta(minutes=120), np.timedelta64(120, 'm')]
for delta in offsets:
rng = pd.period_range('2014-01-01 10:00', '2014-01-05 10:00', freq='H')
result = rng - delta
expected = pd.period_range('2014-01-01 08:00', '2014-01-05 08:00', freq='H')
tm.assert_index_equal(result, expected)
rng -= delta
tm.assert_index_equal(rng, expected)

for delta in [pd.offsets.YearBegin(2), timedelta(minutes=30), np.timedelta64(30, 's')]:
rng = pd.period_range('2014-01-01 10:00', '2014-01-05 10:00', freq='H')
with tm.assertRaisesRegexp(ValueError, 'Input has different freq from Period'):
result = rng - delta
with tm.assertRaisesRegexp(TypeError, 'unsupported operand type\(s\)'):
with tm.assertRaisesRegexp(ValueError, 'Input has different freq from Period'):
rng -= delta

# int
58 changes: 54 additions & 4 deletions pandas/tseries/period.py
@@ -1,7 +1,7 @@
# pylint: disable=E1101,E1103,W0232
import operator

from datetime import datetime, date
from datetime import datetime, date, timedelta
import numpy as np
from pandas.core.base import PandasObject

@@ -10,6 +10,7 @@
from pandas.tseries.index import DatetimeIndex, Int64Index, Index
from pandas.core.base import DatetimeIndexOpsMixin
from pandas.tseries.tools import parse_time_string
import pandas.tseries.offsets as offsets

import pandas.core.common as com
from pandas.core.common import (isnull, _INT64_DTYPE, _maybe_box,
@@ -169,8 +170,37 @@ def __ne__(self, other):
def __hash__(self):
return hash((self.ordinal, self.freq))

def _add_delta(self, other):
if isinstance(other, (timedelta, np.timedelta64, offsets.Tick)):
offset = frequencies.to_offset(self.freq)
if isinstance(offset, offsets.Tick):
nanos = tslib._delta_to_nanoseconds(other)
offset_nanos = tslib._delta_to_nanoseconds(offset)

if nanos % offset_nanos == 0:
if self.ordinal == tslib.iNaT:
ordinal = self.ordinal
else:
ordinal = self.ordinal + (nanos // offset_nanos)
return Period(ordinal=ordinal, freq=self.freq)
elif isinstance(other, offsets.DateOffset):
freqstr = frequencies.get_standard_freq(other)
base = frequencies.get_base_alias(freqstr)

if base == self.freq:
if self.ordinal == tslib.iNaT:
ordinal = self.ordinal
else:
ordinal = self.ordinal + other.n
return Period(ordinal=ordinal, freq=self.freq)

raise ValueError("Input has different freq from Period(freq={0})".format(self.freq))
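
Both branches of ``_add_delta`` above guard the ordinal arithmetic with a check against ``tslib.iNaT``, so that a missing period stays missing instead of being shifted into a bogus value. A minimal sketch of that guard in isolation, assuming ``INAT`` as a local stand-in for ``tslib.iNaT`` (the minimum int64 value) and ``shift_ordinal`` as a hypothetical helper name:

```python
INAT = -2**63  # local stand-in for tslib.iNaT (minimum int64 value)


def shift_ordinal(ordinal, steps):
    """Shift a period ordinal, leaving the missing-value sentinel untouched."""
    if ordinal == INAT:
        # Adding to the sentinel would turn "missing" into a real ordinal.
        return ordinal
    return ordinal + steps


print(shift_ordinal(100, 2))            # 102
print(shift_ordinal(INAT, 2) == INAT)   # True
```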

def __add__(self, other):
if com.is_integer(other):
if isinstance(other, (timedelta, np.timedelta64,
offsets.Tick, offsets.DateOffset)):
return self._add_delta(other)
elif com.is_integer(other):
if self.ordinal == tslib.iNaT:
ordinal = self.ordinal
else:
@@ -180,13 +210,17 @@ def __add__(self, other):
return NotImplemented
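
The ``__sub__`` change in this hunk does not duplicate the delta logic: it negates the operand and reuses the addition path. A small self-contained sketch of that pattern, using a toy class rather than ``Period``; ``ToyHourPeriod`` and its ``UNIT`` attribute are hypothetical and exist only for this illustration.

```python
from datetime import timedelta


class ToyHourPeriod:
    """Hypothetical stand-in for an hourly Period: just a count of hours."""

    UNIT = timedelta(hours=1)

    def __init__(self, ordinal):
        self.ordinal = ordinal

    def __add__(self, other):
        if isinstance(other, timedelta):
            steps, remainder = divmod(other, self.UNIT)
            if remainder:
                raise ValueError("Input has different freq from Period(freq=H)")
            return ToyHourPeriod(self.ordinal + steps)
        return NotImplemented

    def __sub__(self, other):
        if isinstance(other, timedelta):
            # Negate, then reuse the addition path, as the diff does.
            return self + (-other)
        return NotImplemented


p = ToyHourPeriod(10)
print((p - timedelta(minutes=120)).ordinal)  # 8
```

One consequence of this design is that the validation and error message live in a single place, so subtraction rejects exactly the same inputs as addition.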

def __sub__(self, other):
if com.is_integer(other):
if isinstance(other, (timedelta, np.timedelta64,
offsets.Tick, offsets.DateOffset)):
neg_other = -other
return self + neg_other
elif com.is_integer(other):
if self.ordinal == tslib.iNaT:
ordinal = self.ordinal
else:
ordinal = self.ordinal - other
return Period(ordinal=ordinal, freq=self.freq)
if isinstance(other, Period):
elif isinstance(other, Period):
if other.freq != self.freq:
raise ValueError("Cannot do arithmetic with "
"non-conforming periods")
@@ -862,6 +896,22 @@ def to_timestamp(self, freq=None, how='start'):
new_data = tslib.periodarr_to_dt64arr(new_data.values, base)
return DatetimeIndex(new_data, freq='infer', name=self.name)

def _add_delta(self, other):
if isinstance(other, (timedelta, np.timedelta64, offsets.Tick)):
Contributor:

hmm, does it make sense to push _add_delta (and DatetimeIndex._add_delta) to base in order to combine them (it's currently marked as not implemented in DatetimeIndexOpsMixin)?

Member Author:

I feel this gets more complicated, because each _add_delta handles its own internal representation (UTC time in DatetimeIndex and ordinal in PeriodIndex).
https://github.com/pydata/pandas/blob/master/pandas/tseries/index.py#L615

Contributor:

ok, not a big deal (I always like to have stuff like this centralized), but sometimes it makes things too complicated. If you think it can be integrated in the future, then ok.

offset = frequencies.to_offset(self.freq)
if isinstance(offset, offsets.Tick):
nanos = tslib._delta_to_nanoseconds(other)
offset_nanos = tslib._delta_to_nanoseconds(offset)
if nanos % offset_nanos == 0:
return self.shift(nanos // offset_nanos)
elif isinstance(other, offsets.DateOffset):
freqstr = frequencies.get_standard_freq(other)
base = frequencies.get_base_alias(freqstr)

if base == self.freq:
return self.shift(other.n)
raise ValueError("Input has different freq from PeriodIndex(freq={0})".format(self.freq))
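
The review discussion on this method turns on the two indexes storing different things: timestamps versus period ordinals. The sketch below illustrates, under simplified assumptions, why the same two-hour delta leads to different arithmetic in each case. Plain Python lists stand in for the index data, and both function names are hypothetical.

```python
HOUR_NANOS = 3600 * 10**9


def add_to_timestamps(stamps_ns, delta_ns):
    # Timestamp-style storage: add the raw nanoseconds to every element.
    return [s + delta_ns for s in stamps_ns]


def add_to_hour_ordinals(ordinals, delta_ns):
    # Period-style storage: the delta must first become a whole number of hours.
    if delta_ns % HOUR_NANOS:
        raise ValueError("Input has different freq from PeriodIndex(freq=H)")
    steps = delta_ns // HOUR_NANOS
    return [o + steps for o in ordinals]


two_hours = 2 * HOUR_NANOS
print(add_to_timestamps([0, HOUR_NANOS], two_hours))
print(add_to_hour_ordinals([0, 1], two_hours))  # [2, 3]
```

The timestamp version accepts any delta, whereas the ordinal version has to reject deltas that are not a whole multiple of the frequency unit, which is one reason a shared base implementation is not straightforward.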

def shift(self, n):
"""
Specialized shift which produces a PeriodIndex