API: stop silent conversion of object-Index to DatetimeIndex #49169

Merged
merged 3 commits into from
Oct 18, 2022
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v2.0.0.rst
@@ -127,6 +127,8 @@ Other API changes
- Passing ``dtype`` of "timedelta64[s]", "timedelta64[ms]", or "timedelta64[us]" to :class:`TimedeltaIndex`, :class:`Series`, or :class:`DataFrame` constructors will now retain that dtype instead of casting to "timedelta64[ns]"; passing a dtype with lower resolution for :class:`Series` or :class:`DataFrame` will be cast to the lowest supported resolution "timedelta64[s]" (:issue:`49014`)
- Passing a ``np.datetime64`` object with non-nanosecond resolution to :class:`Timestamp` will retain the input resolution if it is "s", "ms", or "ns"; otherwise it will be cast to the closest supported resolution (:issue:`49008`)
- The ``other`` argument in :meth:`DataFrame.mask` and :meth:`Series.mask` now defaults to ``no_default`` instead of ``np.nan`` consistent with :meth:`DataFrame.where` and :meth:`Series.where`. Entries will be filled with the corresponding NULL value (``np.nan`` for numpy dtypes, ``pd.NA`` for extension dtypes). (:issue:`49111`)
- When creating a :class:`Series` with an object-dtype :class:`Index` of datetime objects, pandas no longer silently converts the index to a :class:`DatetimeIndex` (:issue:`39307`, :issue:`23598`)
-

.. ---------------------------------------------------------------------------
.. _whatsnew_200.deprecations:
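A minimal sketch of what the new whatsnew entry describes, assuming a pandas build that includes this change. The sample dates and the variable names (`idx`, `ser`, `converted`) are illustrative only and do not come from the PR.

```python
import datetime as dt

import pandas as pd

# Object-dtype Index holding plain datetime objects.
idx = pd.Index([dt.datetime(2022, 10, 17), dt.datetime(2022, 10, 18)], dtype=object)
ser = pd.Series([1.0, 2.0], index=idx)

# With this change the index is expected to stay object-dtype;
# earlier releases converted it to a DatetimeIndex at this point.
print(type(ser.index).__name__, ser.index.dtype)

# Opting in to datetime semantics is now an explicit step.
converted = ser.set_axis(pd.DatetimeIndex(ser.index))
print(type(converted.index).__name__, converted.index.dtype)
```

Code that relied on the implicit conversion (for example, for partial-string date slicing) would likely need the explicit `pd.DatetimeIndex(...)` step shown last.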
5 changes: 5 additions & 0 deletions pandas/core/generic.py
@@ -825,7 +825,12 @@ def _set_axis_nocheck(self, labels, axis: Axis, inplace: bool_t, copy: bool_t):
setattr(obj, obj._get_axis_name(axis), labels)
return obj

@final
def _set_axis(self, axis: AxisInt, labels: AnyArrayLike | list) -> None:
"""
This is called from the cython code when we set the `index` attribute
directly, e.g. `series.index = [1, 2, 3]`.
"""
labels = ensure_index(labels)
self._mgr.set_axis(axis, labels)
self._clear_item_cache()
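The docstring above refers to direct assignment of the `index` attribute. Seen from the public API, that path behaves roughly as in the sketch below. Only the observable behavior is shown; which internal method handles the assignment is taken from the docstring, not verified here, and `mismatch_error` is an illustrative name.

```python
import pandas as pd

ser = pd.Series([10, 20, 30])

# Assigning a plain list: the labels end up wrapped in an Index.
ser.index = [1, 2, 3]
print(type(ser.index).__name__, list(ser.index))

# A length mismatch is rejected rather than silently truncated.
try:
    ser.index = [1, 2]
except ValueError as exc:
    mismatch_error = exc
else:
    mismatch_error = None
print(mismatch_error)
```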
33 changes: 0 additions & 33 deletions pandas/core/series.py
@@ -29,7 +29,6 @@
lib,
properties,
reshape,
tslibs,
)
from pandas._libs.lib import no_default
from pandas._typing import (
@@ -132,13 +131,11 @@
)
from pandas.core.indexes.accessors import CombinedDatetimelikeProperties
from pandas.core.indexes.api import (
CategoricalIndex,
DatetimeIndex,
Float64Index,
Index,
MultiIndex,
PeriodIndex,
TimedeltaIndex,
default_index,
ensure_index,
)
@@ -570,36 +567,6 @@ def _constructor_expanddim(self) -> Callable[..., DataFrame]:
def _can_hold_na(self) -> bool:
return self._mgr._can_hold_na

def _set_axis(self, axis: AxisInt, labels: AnyArrayLike | list) -> None:
"""
Override generic, we want to set the _typ here.

This is called from the cython code when we set the `index` attribute
directly, e.g. `series.index = [1, 2, 3]`.
"""
labels = ensure_index(labels)

if labels._is_all_dates and not (
type(labels) is Index and not isinstance(labels.dtype, np.dtype)
):
# exclude e.g. timestamp[ns][pyarrow] dtype from this casting
deep_labels = labels
if isinstance(labels, CategoricalIndex):
deep_labels = labels.categories

if not isinstance(
deep_labels, (DatetimeIndex, PeriodIndex, TimedeltaIndex)
):
try:
labels = DatetimeIndex(labels)
except (tslibs.OutOfBoundsDatetime, ValueError):
# labels may exceed datetime bounds,
# or not be a DatetimeIndex
pass

# The ensure_index call above ensures we have an Index object
self._mgr.set_axis(axis, labels)

# ndarray compatibility
@property
def dtype(self) -> DtypeObj:
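For callers who want something close to the removed coercion, an opt-in helper could look like the sketch below. `maybe_datetime_index` is a hypothetical name, not part of pandas or of this PR, and the use of `infer_dtype` in place of the private `_is_all_dates` check is an assumption. The fallback mirrors the `except` clause in the deleted method.

```python
import datetime as dt

import pandas as pd
from pandas.api.types import infer_dtype


def maybe_datetime_index(labels):
    """Hypothetical helper: opt-in version of the coercion removed here."""
    index = labels if isinstance(labels, pd.Index) else pd.Index(labels)
    if isinstance(index, (pd.DatetimeIndex, pd.PeriodIndex, pd.TimedeltaIndex)):
        return index  # already datetime-like, nothing to do
    if index.dtype != object or infer_dtype(index, skipna=False) != "datetime":
        return index  # only attempt object-dtype indexes of datetime objects
    try:
        return pd.DatetimeIndex(index)
    except (pd.errors.OutOfBoundsDatetime, ValueError):
        # Values outside the representable range, or otherwise not
        # convertible: keep the labels unchanged.
        return index


obj_idx = pd.Index([dt.datetime(2022, 10, 17), dt.datetime(2022, 10, 18)], dtype=object)
print(type(maybe_datetime_index(obj_idx)).__name__)
```

Keeping the conversion in an explicit helper makes the cast visible at the call site, which seems to be the intent of this API change.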
2 changes: 1 addition & 1 deletion pandas/io/pytables.py
@@ -3006,7 +3006,7 @@ def read_index_node(
attrs = node._v_attrs
factory, kwargs = self._get_index_factory(attrs)

-if kind == "date":
+if kind == "date" or kind == "object":
index = factory(
_unconvert_index(
data, kind, encoding=self.encoding, errors=self.errors
4 changes: 3 additions & 1 deletion pandas/tests/series/test_constructors.py
@@ -1986,7 +1986,9 @@ def test_series_constructor_datetimelike_index_coercion(self):
ser = Series(np.random.randn(len(idx)), idx.astype(object))
with tm.assert_produces_warning(FutureWarning):
assert ser.index.is_all_dates
-assert isinstance(ser.index, DatetimeIndex)
+# as of 2.0, we no longer silently cast the object-dtype index
+# to DatetimeIndex GH#39307, GH#23598
+assert not isinstance(ser.index, DatetimeIndex)

def test_series_constructor_infer_multiindex(self):
index_lists = [["a", "a", "b", "b"], ["x", "y", "x", "y"]]