Methods on a PeriodIndex that return an empty set don't return a PeriodIndex object #10596

Closed
max-sixty opened this issue Jul 15, 2015 · 3 comments
Labels
Compat pandas objects compatability with Numpy or Python functions Period Period data type
Milestone

Comments

@max-sixty
Contributor

In [27]: period_index=pd.PeriodIndex(start='2015-01-01',end='2015-03',freq='B')
period_index
Out[27]: <class 'pandas.tseries.period.PeriodIndex'>
[2015-01-01, ..., 2015-03-02]
Length: 43, Freq: B

In [28]: period_index.difference(period_index)
Out[28]: Index([], dtype='object')

I think this should return an empty PeriodIndex object, not an empty Index object.

This happens because, when difference yields an empty set, the object doesn't check its own type before creating the empty result: https://github.com/pydata/pandas/blob/v0.16.0/pandas/core/index.py#L1360. In general, a better construction for that line is type(self)([]).

I'm happy to make this PR, although I'm not sure whether I'm missing something on the intention. If I'm not, should this be executed by adding something in Index's difference method, or overriding that method in PeriodIndex?
A method that removed items from the index would avoid any subclass-specific code, but the drop method also has some odd behavior:

In [25]:
period_index.drop(period_index)
Out[25]:
Int64Index([], dtype='int64')

So if you're creating a new object, you'd need to check the freq of the PeriodIndex too, given an empty PeriodIndex constructor needs a freq. Something like type(self)([], freq=self.freq, name=self.name). Are there cases for other subclasses of Index?
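To make the point above concrete, here is a minimal sketch using toy classes. ToyIndex, ToyPeriodIndex, difference_buggy and _new_like_self are hypothetical names made up for illustration; this is not pandas' actual code, only the shape of the problem and of the type(self)(..., freq=self.freq, name=self.name) idea.

```python
class ToyIndex:
    """Toy stand-in for a base index class (hypothetical, not pandas)."""

    def __init__(self, values, name=None):
        self.values = list(values)
        self.name = name

    def _new_like_self(self, values):
        # Build the result from the caller's own type, carrying the name along.
        return type(self)(values, name=self.name)

    def difference_buggy(self, other):
        remaining = [v for v in self.values if v not in other.values]
        if not remaining:
            # Hard-coded base class: the subclass and its metadata are lost.
            return ToyIndex([])
        return self._new_like_self(remaining)

    def difference(self, other):
        remaining = [v for v in self.values if v not in other.values]
        # Same code path for empty and non-empty results.
        return self._new_like_self(remaining)


class ToyPeriodIndex(ToyIndex):
    """Toy subclass whose constructor also requires a freq."""

    def __init__(self, values, freq, name=None):
        super().__init__(values, name=name)
        self.freq = freq

    def _new_like_self(self, values):
        # The subclass needs freq as well, so it overrides the helper.
        return type(self)(values, freq=self.freq, name=self.name)


periods = ToyPeriodIndex(["2015-01", "2015-02"], freq="M", name="month")
buggy = periods.difference_buggy(periods)
fixed = periods.difference(periods)
print(type(buggy).__name__)                          # ToyIndex
print(type(fixed).__name__, fixed.freq, fixed.name)  # ToyPeriodIndex M month
```

In this sketch the subclass still has to override one small helper, which is the trade-off the question above is asking about.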

@jreback
Contributor

jreback commented Jul 15, 2015

PeriodIndex for sure has some of these types of issues (people have been slowly working their way through them). Almost all other construction is done centrally, but edge cases like this are still possible.

The constructors should almost always be

self._shallow_copy(....) (you can pass new values as the first arg; the metadata will be propagated)
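As a rough illustration of that pattern (not pandas' real _shallow_copy, whose internals are not shown in this thread), a centralized copy helper could look something like the following. The _metadata tuple and the Toy* classes are assumptions made up for this sketch.

```python
class ToyIndex:
    """Toy base class; _metadata lists the attributes to carry over (hypothetical)."""

    _metadata = ("name",)

    def __init__(self, values, name=None):
        self.values = list(values)
        self.name = name

    def _shallow_copy(self, values=None, **overrides):
        # New values are optional; metadata is copied unless explicitly overridden.
        if values is None:
            values = self.values
        kwargs = {attr: getattr(self, attr) for attr in self._metadata}
        kwargs.update(overrides)
        return type(self)(values, **kwargs)


class ToyPeriodIndex(ToyIndex):
    """Toy subclass that only declares its extra metadata; no method overrides."""

    _metadata = ("name", "freq")

    def __init__(self, values, freq=None, name=None):
        super().__init__(values, name=name)
        self.freq = freq


monthly = ToyPeriodIndex(["2015-01", "2015-02"], freq="M", name="month")
empty = monthly._shallow_copy([])               # new values, same metadata
renamed = monthly._shallow_copy(name="period")  # same values, one override
print(type(empty).__name__, empty.values, empty.freq, empty.name)
print(renamed.values, renamed.freq, renamed.name)
```

With construction routed through one helper like this, set operations on the base class would not need subclass-specific branches for the empty case.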

@jreback jreback added Period Period data type Compat pandas objects compatability with Numpy or Python functions labels Jul 15, 2015
@jreback jreback added this to the 0.17.0 milestone Jul 15, 2015
@sinhrks
Member

sinhrks commented Jul 17, 2015

sym_diff is also affected.

period_index.sym_diff(period_index)
Index([], dtype='object')

union and intersection should work for an empty PeriodIndex, but it would be nice to have explicit tests.

pidx
# PeriodIndex([], dtype='int64', freq='M')

pidx.union(pidx)
# PeriodIndex([], dtype='int64', freq='M')
pidx.intersection(pidx)
# PeriodIndex([], dtype='int64', freq='M')

@jreback
Contributor

jreback commented Jul 28, 2015

closed by #10599

@jreback jreback closed this as completed Jul 28, 2015