Skip to content

REF: make DatetimeIndex._simple_new actually simple #32282

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 39 commits into from
Mar 14, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
6389872
REF: require DTA in DTI._simple_new
jbrockmendel Jan 17, 2020
2329008
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 17, 2020
91a0937
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 18, 2020
8924c55
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 18, 2020
b55e8b1
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 20, 2020
d04f3b5
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 21, 2020
3a2c228
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 21, 2020
b6ffc2a
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 23, 2020
5b0bac8
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 25, 2020
7405488
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 26, 2020
13eb9a7
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 26, 2020
aadedfd
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 29, 2020
9f220f5
stricter
jbrockmendel Jan 29, 2020
1451163
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 29, 2020
e8ad19a
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 1, 2020
e8483f8
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 4, 2020
f82f389
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 7, 2020
5f22827
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 9, 2020
9a5360a
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 9, 2020
1d9cea3
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 12, 2020
1acb5a3
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 22, 2020
d92ff5f
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 25, 2020
34b4dec
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 25, 2020
242695c
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 26, 2020
50d1c6c
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 26, 2020
8a90f45
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 26, 2020
0e7ffe1
blackify
jbrockmendel Feb 26, 2020
5cba722
remove attrs
jbrockmendel Feb 26, 2020
6efd065
CLN: DTI/TDI _simple_new
jbrockmendel Feb 26, 2020
893d0aa
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 26, 2020
d659a23
simplify apply_index treatment
jbrockmendel Feb 27, 2020
864b604
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 27, 2020
743a003
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 27, 2020
82c5645
remove outdated docstring
jbrockmendel Feb 27, 2020
e91814d
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Mar 11, 2020
10a2ff2
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Mar 12, 2020
c8c3ebb
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Mar 12, 2020
711987e
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Mar 13, 2020
e74ebc0
Comments, collect offsets wrapping in apply_index_wrap
jbrockmendel Mar 13, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 12 additions & 1 deletion pandas/_libs/tslibs/offsets.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -114,7 +114,18 @@ def apply_index_wraps(func):
# Note: normally we would use `@functools.wraps(func)`, but this does
# not play nicely with cython class methods
def wrapper(self, other):
result = func(self, other)

is_index = getattr(other, "_typ", "") == "datetimeindex"

# operate on DatetimeArray
arr = other._data if is_index else other

result = func(self, arr)

if is_index:
# Wrap DatetimeArray result back to DatetimeIndex
result = type(other)._simple_new(result, name=other.name)

if self.normalize:
result = result.to_period('D').to_timestamp()
return result
Expand Down
4 changes: 1 addition & 3 deletions pandas/core/indexes/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -3281,13 +3281,11 @@ def reindex(self, target, method=None, level=None, limit=None, tolerance=None):
target = _ensure_has_len(target) # target may be an iterator

if not isinstance(target, Index) and len(target) == 0:
attrs = self._get_attributes_dict()
attrs.pop("freq", None) # don't preserve freq
if isinstance(self, ABCRangeIndex):
values = range(0)
else:
values = self._data[:0] # appropriately-dtyped empty array
target = self._simple_new(values, **attrs)
target = self._simple_new(values, name=self.name)
else:
target = ensure_index(target)

Expand Down
22 changes: 9 additions & 13 deletions pandas/core/indexes/datetimelike.py
Original file line number Diff line number Diff line change
Expand Up @@ -622,21 +622,11 @@ def _shallow_copy(self, values=None, name: Label = lib.no_default):
if values is None:
values = self._data

if isinstance(values, type(self)):
values = values._data
if isinstance(values, np.ndarray):
# TODO: We would rather not get here
values = type(self._data)(values, dtype=self.dtype)

attributes = self._get_attributes_dict()

if self.freq is not None:
if isinstance(values, (DatetimeArray, TimedeltaArray)):
if values.freq is None:
del attributes["freq"]

attributes["name"] = name
result = self._simple_new(values, **attributes)
result = type(self)._simple_new(values, name=name)
result._cache = cache
return result

Expand Down Expand Up @@ -780,7 +770,10 @@ def _fast_union(self, other, sort=None):
loc = right.searchsorted(left_start, side="left")
right_chunk = right.values[:loc]
dates = concat_compat((left.values, right_chunk))
return self._shallow_copy(dates)
result = self._shallow_copy(dates)
result._set_freq("infer")
# TODO: can we infer that it has self.freq?
return result
else:
left, right = other, self

Expand All @@ -792,7 +785,10 @@ def _fast_union(self, other, sort=None):
loc = right.searchsorted(left_end, side="right")
right_chunk = right.values[loc:]
dates = concat_compat((left.values, right_chunk))
return self._shallow_copy(dates)
result = self._shallow_copy(dates)
result._set_freq("infer")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could change _set_freq to return self to make this idiom more clear (also _set_freq should be marked in the doc-string as a mutating function)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Holding off on this since _set_freq needs a bigger rethink, xref #31218.

All other comments addressed, i think

# TODO: can we infer that it has self.freq?
return result
else:
return left

Expand Down
49 changes: 20 additions & 29 deletions pandas/core/indexes/datetimes.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,17 +7,13 @@

from pandas._libs import NaT, Period, Timestamp, index as libindex, lib, tslib as libts
from pandas._libs.tslibs import fields, parsing, timezones
from pandas._typing import Label
from pandas.util._decorators import cache_readonly

from pandas.core.dtypes.common import _NS_DTYPE, is_float, is_integer, is_scalar
from pandas.core.dtypes.dtypes import DatetimeTZDtype
from pandas.core.dtypes.missing import is_valid_nat_for_dtype

from pandas.core.arrays.datetimes import (
DatetimeArray,
tz_to_dtype,
validate_tz_from_dtype,
)
from pandas.core.arrays.datetimes import DatetimeArray, tz_to_dtype
import pandas.core.common as com
from pandas.core.indexes.base import Index, InvalidIndexError, maybe_extract_name
from pandas.core.indexes.datetimelike import DatetimeTimedeltaMixin
Expand All @@ -36,7 +32,20 @@ def _new_DatetimeIndex(cls, d):
if "data" in d and not isinstance(d["data"], DatetimeIndex):
# Avoid need to verify integrity by calling simple_new directly
data = d.pop("data")
result = cls._simple_new(data, **d)
if not isinstance(data, DatetimeArray):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is all this about?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

backward compat for old pickles

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make this nicer. it is very hard to grok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls comment the cases

# For backward compat with older pickles, we may need to construct
# a DatetimeArray to adapt to the newer _simple_new signature
tz = d.pop("tz")
freq = d.pop("freq")
dta = DatetimeArray._simple_new(data, dtype=tz_to_dtype(tz), freq=freq)
else:
dta = data
for key in ["tz", "freq"]:
# These are already stored in our DatetimeArray; if they are
# also in the pickle and don't match, we have a problem.
if key in d:
assert d.pop(key) == getattr(dta, key)
result = cls._simple_new(dta, **d)
else:
with warnings.catch_warnings():
# TODO: If we knew what was going in to **d, we might be able to
Expand Down Expand Up @@ -244,34 +253,16 @@ def __new__(
return subarr

@classmethod
def _simple_new(cls, values, name=None, freq=None, tz=None, dtype=None):
"""
We require the we have a dtype compat for the values
if we are passed a non-dtype compat, then coerce using the constructor
"""
if isinstance(values, DatetimeArray):
if tz:
tz = validate_tz_from_dtype(dtype, tz)
dtype = DatetimeTZDtype(tz=tz)
elif dtype is None:
dtype = _NS_DTYPE

values = DatetimeArray(values, freq=freq, dtype=dtype)
tz = values.tz
freq = values.freq
values = values._data

dtype = tz_to_dtype(tz)
dtarr = DatetimeArray._simple_new(values, freq=freq, dtype=dtype)
assert isinstance(dtarr, DatetimeArray)
def _simple_new(cls, values: DatetimeArray, name: Label = None):
assert isinstance(values, DatetimeArray), type(values)

result = object.__new__(cls)
result._data = dtarr
result._data = values
result.name = name
result._cache = {}
result._no_setting_name = False
# For groupby perf. See note in indexes/base about _index_data
result._index_data = dtarr._data
result._index_data = values._data
result._reset_identity()
return result

Expand Down
9 changes: 3 additions & 6 deletions pandas/core/indexes/timedeltas.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
""" implement the TimedeltaIndex """

from pandas._libs import NaT, Timedelta, index as libindex
from pandas._typing import Label
from pandas.util._decorators import Appender

from pandas.core.dtypes.common import (
Expand Down Expand Up @@ -154,7 +155,7 @@ def __new__(
if isinstance(data, TimedeltaArray) and freq is None:
if copy:
data = data.copy()
return cls._simple_new(data, name=name, freq=freq)
return cls._simple_new(data, name=name)

if isinstance(data, TimedeltaIndex) and freq is None and name is None:
if copy:
Expand All @@ -170,12 +171,8 @@ def __new__(
return cls._simple_new(tdarr, name=name)

@classmethod
def _simple_new(cls, values, name=None, freq=None, dtype=_TD_DTYPE):
# `dtype` is passed by _shallow_copy in corner cases, should always
# be timedelta64[ns] if present
assert dtype == _TD_DTYPE, dtype
def _simple_new(cls, values: TimedeltaArray, name: Label = None):
assert isinstance(values, TimedeltaArray)
assert freq is None or values.freq == freq

result = object.__new__(cls)
result._data = values
Expand Down
4 changes: 2 additions & 2 deletions pandas/tests/arrays/test_datetimelike.py
Original file line number Diff line number Diff line change
Expand Up @@ -65,8 +65,8 @@ def test_compare_len1_raises(self):
# to the case where one has length-1, which numpy would broadcast
data = np.arange(10, dtype="i8") * 24 * 3600 * 10 ** 9

idx = self.array_cls._simple_new(data, freq="D")
arr = self.index_cls(idx)
arr = self.array_cls._simple_new(data, freq="D")
idx = self.index_cls(arr)

with pytest.raises(ValueError, match="Lengths must match"):
arr == arr[:1]
Expand Down
2 changes: 1 addition & 1 deletion pandas/tests/indexes/datetimes/test_ops.py
Original file line number Diff line number Diff line change
Expand Up @@ -363,7 +363,7 @@ def test_equals(self):
assert not idx.equals(pd.Series(idx2))

# same internal, different tz
idx3 = pd.DatetimeIndex._simple_new(idx.asi8, tz="US/Pacific")
idx3 = pd.DatetimeIndex(idx.asi8, tz="US/Pacific")
tm.assert_numpy_array_equal(idx.asi8, idx3.asi8)
assert not idx.equals(idx3)
assert not idx.equals(idx3.copy())
Expand Down
15 changes: 0 additions & 15 deletions pandas/tseries/offsets.py
Original file line number Diff line number Diff line change
Expand Up @@ -337,9 +337,6 @@ def apply_index(self, i):
# integer addition on PeriodIndex is deprecated,
# so we directly use _time_shift instead
asper = i.to_period("W")
if not isinstance(asper._data, np.ndarray):
# unwrap PeriodIndex --> PeriodArray
asper = asper._data
shifted = asper._time_shift(weeks)
i = shifted.to_timestamp() + i.to_perioddelta("W")

Expand Down Expand Up @@ -629,9 +626,6 @@ def apply_index(self, i):
# to_period rolls forward to next BDay; track and
# reduce n where it does when rolling forward
asper = i.to_period("B")
if not isinstance(asper._data, np.ndarray):
# unwrap PeriodIndex --> PeriodArray
asper = asper._data

if self.n > 0:
shifted = (i.to_perioddelta("B") - time).asi8 != 0
Expand Down Expand Up @@ -1384,9 +1378,6 @@ def apply_index(self, i):
# integer-array addition on PeriodIndex is deprecated,
# so we use _addsub_int_array directly
asper = i.to_period("M")
if not isinstance(asper._data, np.ndarray):
# unwrap PeriodIndex --> PeriodArray
asper = asper._data

shifted = asper._addsub_int_array(roll // 2, operator.add)
i = type(dti)(shifted.to_timestamp())
Expand Down Expand Up @@ -1582,9 +1573,6 @@ def apply_index(self, i):
# integer addition on PeriodIndex is deprecated,
# so we use _time_shift directly
asper = i.to_period("W")
if not isinstance(asper._data, np.ndarray):
# unwrap PeriodIndex --> PeriodArray
asper = asper._data

shifted = asper._time_shift(self.n)
return shifted.to_timestamp() + i.to_perioddelta("W")
Expand All @@ -1608,9 +1596,6 @@ def _end_apply_index(self, dtindex):

base, mult = libfrequencies.get_freq_code(self.freqstr)
base_period = dtindex.to_period(base)
if not isinstance(base_period._data, np.ndarray):
# unwrap PeriodIndex --> PeriodArray
base_period = base_period._data

if self.n > 0:
# when adding, dates on end roll to next
Expand Down