CLN: de-duplicate boxing in DTI.get_value #30819

Merged: 3 commits merged on Jan 9, 2020

jbrockmendel
Member

Also add a dedicated test for it to hit the currently uncovered case of an np.datetime64 key. That case works in master, but doesn't go through the expected code path.

else:
    key = Timestamp(key).tz_localize(self.tz)

if isinstance(key, (datetime, np.datetime64)):
    return self.get_value_maybe_box(series, key)
Member Author

the removed code here is duplicated in get_value_maybe_box
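
For illustration only, the boxing step referred to here can be sketched as a standalone function. `box_key` is a hypothetical name, not the pandas method. The sketch assumes the key is wrapped in a Timestamp and localized to the index's timezone when it is naive (as in the removed lines); converting an already tz-aware key is an additional assumption that the diff excerpt does not show.

```python
from datetime import datetime

import numpy as np
import pandas as pd


def box_key(key, tz):
    """Hypothetical helper: box a lookup key into a Timestamp in timezone `tz`."""
    ts = pd.Timestamp(key)
    if ts.tzinfo is None:
        # Naive key: interpret it as wall-clock time in the index's timezone.
        return ts.tz_localize(tz)
    # Aware key: assumed here to be converted to the index's timezone.
    return ts.tz_convert(tz)


tz = "US/Eastern"
from_datetime = box_key(datetime(1990, 1, 1, 4), tz)
from_dt64 = box_key(np.datetime64("1990-01-01T04:00"), tz)
from_string = box_key("1990-01-01 04:00", tz)
```

With a single helper like this, datetime, np.datetime64, and string keys all end up as the same tz-aware Timestamp, which is the kind of duplication the PR is trying to funnel through one code path.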

@jbrockmendel
Member Author

@jreback @jorisvandenbossche related to this PR, I'm trying to de-duplicate the localization going on in _get_value_maybe_box. Back in #17920 it was decided to localize naive-looking strings when doing lookups on a tzaware DTI, but I'm surprised that we are doing the same for datetime objects (and AFAICT also np.datetime64, but we don't test that case).

The relevant test is in tests.series.indexing.test_datetime, pulling out the pertinent part:

rng = date_range("1/1/1990", periods=N, freq="H", tz="US/Eastern")
ts = Series(np.random.randn(N), index=rng)

[...]
# repeat all the above with naive datetimes
result = ts[datetime(1990, 1, 1, 4)]
expected = ts[4]
assert result == expected

When I change the behavior of get_value to not cast tznaive datetime objects, this lookup raises (which I thought was the correct behavior, but apparently not?). Looking at the blame for this test, it has been around for a long time.

Changing this would be non-trivial because it turns out that __getitem__[scalar] goes through DTI.get_value, but slicing, __setitem__, etc. go through different paths.

I guess there's not really a question here, just opening it up for comments.

jreback added the Clean and Datetime (Datetime data dtype) labels Jan 9, 2020
jreback added this to the 1.0 milestone Jan 9, 2020
Contributor

jreback commented Jan 9, 2020

about your comment: #30819 (comment)

yeah, this is the implicit conversion of strings only to a local timezone when they (the strings) are naive and the indexing is on timezone-aware indexes. note that this does not apply to timestamps, which must be explicitly timezoned.

seems reasonable, but feel free to open an issue if more questions.

ideally we should try to limit the indexing paths, but there is a lot going on so not sure how easy this is.
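
A minimal sketch of the string case described above, assuming an hourly tz-aware index like the one in the quoted test (N = 10 and the series values are arbitrary choices for this sketch). It only covers a naive string and an explicitly tz-aware Timestamp; naive datetime and np.datetime64 keys are left out because their handling is exactly what this thread is questioning.

```python
import numpy as np
import pandas as pd

# Hourly, tz-aware index similar to the one in the quoted test.
N = 10
rng = pd.date_range(
    "1990-01-01", periods=N, freq=pd.Timedelta(hours=1), tz="US/Eastern"
)
ts = pd.Series(np.arange(N, dtype="float64"), index=rng)

# Naive string: implicitly interpreted in the index's timezone.
by_string = ts["1990-01-01 04:00"]

# Explicitly tz-aware Timestamp: no implicit conversion is needed.
by_aware = ts[pd.Timestamp("1990-01-01 04:00", tz="US/Eastern")]

# Positional lookup of the same element, for comparison.
expected = ts.iloc[4]
```

Both lookups should select the same element as the positional lookup, since the naive string is read as 04:00 US/Eastern.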

jreback merged commit 8029ba1 into pandas-dev:master Jan 9, 2020
Contributor

jreback commented Jan 9, 2020

thanks
