DateTimeIndex appending with loc inconsistency in handling numpy datetime64 #9516

Closed
dashesy opened this issue Feb 18, 2015 · 3 comments · Fixed by #9522
Labels: Bug, Indexing (related to indexing on series/frames, not to indexes themselves)

@dashesy
Contributor

dashesy commented Feb 18, 2015

This is in Pandas 0.15.2 (but I also tried it on '0.15.2-148-g484f668'):

import datetime
import numpy as np
import pandas as pd

df = pd.DataFrame()
df.loc[np.datetime64(datetime.datetime.now()),'one'] = 100
df.loc[np.datetime64(datetime.datetime.now()),'one'] = 100

This is the output:

                               one
2015-02-18 14:50:05.606510     100
1970-01-17 11:37:51.007748755  100

It seems the first use of loc correctly works around the limitations of datetime64, but I cannot explain the second one.

shoyer added the Bug and Indexing labels on Feb 19, 2015
@jnmclarty
Contributor

I had a quick peek. Inside _NDFrameIndexer._setitem_with_indexer(), this line here: self.obj._data = self.obj.reindex_axis(labels, i)._data is where the labels go from good to bad. The first reference to labels inside NDFrame.reindex_axis(...) is erroneous...

This code might help somebody more senior than I:

from pandas import DataFrame
from numpy import datetime64 as d64
from datetime import datetime as dt

df = DataFrame()

df.loc[d64(dt(2015,1,1,1,1,1)),'C1'] = 'GOOD'
df.loc[d64(dt(2015,1,1,1,1,1)),'C1'] = 'BAD'
df.loc[d64(dt(2015,1,1,1,1,2)),'C1'] = 'sec'
df.loc[d64(dt(2015,1,1,1,2,1)),'C1'] = 'min'
df.loc[d64(dt(2015,1,1,2,1,1)),'C1'] = 'hr'
df.loc[d64(dt(2015,1,2,1,1,1)),'C1'] = 'day'
df.loc[d64(dt(2015,2,1,1,1,1)),'C1'] = 'mth'
>>> df
                             C1
2015-01-01 01:01:01         BAD
1970-01-17 10:27:54.062000  sec
1970-01-17 10:27:54.121000  min
1970-01-17 10:27:57.661000   hr
1970-01-17 10:29:20.461000  day
1970-01-17 11:12:32.461000  mth

Notice that the date component is dropped, and that the month/day/hour/minute/second specified increments the hour/minute/... respectively. Also notice that "GOOD" is overwritten by "BAD".
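
The shifted labels above are consistent with one specific unit mix-up, though the thread itself does not confirm the cause. A np.datetime64 built from a datetime.datetime carries microsecond units, while pandas stores DatetimeIndex values as nanoseconds. If the microsecond count is read as a nanosecond count, the timestamps collapse into January 1970. The sketch below reproduces that arithmetic using only the standard library; misread_us_as_ns is a hypothetical helper written for this illustration, not a pandas or NumPy function.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def misread_us_as_ns(when):
    """Return the timestamp obtained when the microseconds-since-epoch
    integer of `when` is wrongly interpreted as nanoseconds-since-epoch.

    Hypothetical helper for illustration only; it is an assumption about
    what may have gone wrong, not code taken from pandas.
    """
    # Integer number of microseconds since the Unix epoch.
    micros = (when - EPOCH) // timedelta(microseconds=1)
    # Treating that integer as nanoseconds shrinks the offset by 1000x.
    return EPOCH + timedelta(microseconds=micros // 1000)


# The 'sec' and 'min' rows from the example above.
print(misread_us_as_ns(datetime(2015, 1, 1, 1, 1, 2)))
print(misread_us_as_ns(datetime(2015, 1, 1, 1, 2, 1)))
```

With these inputs the helper prints 1970-01-17 10:27:54.062000 and 1970-01-17 10:27:54.121000, which match the 'sec' and 'min' rows in the output above.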

@jreback
Contributor

jreback commented Feb 19, 2015

@jnmclarty this is now fixed in master

@jnmclarty
Contributor

@jreback Yah, saw that. Good stuff.


4 participants