API: should setitem-with-expansion _ever_ raise? #37774

Closed
jbrockmendel opened this issue Nov 11, 2020 · 1 comment
Labels: API Design; Indexing (related to indexing on series/frames, not to indexes themselves); Needs Discussion (requires discussion from core team before further action)

jbrockmendel (Member) commented Nov 11, 2020

ATM we are pretty inconsistent about when setitem-with-expansion raises. This boils down to the idiosyncratic casting rules for Index.insert, as mentioned in today's dev call.

Examples:

import pandas as pd

ri = pd.Index(range(6))
ci = pd.CategoricalIndex(["a", "a", "b", "b", "c", "a"])
dti = pd.date_range("2016-01-01", periods=6)
mi = pd.MultiIndex.from_arrays([ri, dti])

ser1 = pd.Series(range(6), index=ci)
ser2 = pd.Series(range(6), index=dti)
ser3 = pd.Series(range(6), index=mi)

>>> ser1.loc["d"] = 10
ValueError: 'fill_value=d' is not present in this Categorical's categories

>>> ser2.loc[4] = 10
TypeError: value should be a 'Timestamp' or 'NaT'. Got 'int' instead.

>>> ser2.loc["foo"] = 10  # <-- casts to object

>>> ser3.loc[dti[0], dti[0]] = 10
TypeError: int() argument must be a string, a bytes-like object or a number, not 'Timestamp'

>>> ser3.loc[3, 4] = 10
TypeError: value should be a 'Timestamp' or 'NaT'. Got 'int' instead.

>>> ser3.loc[3, "a"] = 10  # <-- casts level to object
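Since the exact outcomes depend on the pandas version, here is a minimal sketch for re-running the cases above and recording whether each one raises. `try_setitem` is a hypothetical helper written for this illustration, not a pandas API; it works on a copy so the original Series are left untouched.

```python
import pandas as pd

ri = pd.Index(range(6))
ci = pd.CategoricalIndex(["a", "a", "b", "b", "c", "a"])
dti = pd.date_range("2016-01-01", periods=6)
mi = pd.MultiIndex.from_arrays([ri, dti])

ser1 = pd.Series(range(6), index=ci)
ser2 = pd.Series(range(6), index=dti)
ser3 = pd.Series(range(6), index=mi)


def try_setitem(ser, key, value=10):
    """Hypothetical helper: attempt ser.loc[key] = value on a copy.

    Returns the exception class name if the assignment raises,
    otherwise a short string describing the resulting index dtype.
    """
    ser = ser.copy()
    try:
        ser.loc[key] = value
    except Exception as err:  # broad on purpose: we only want to tabulate
        return type(err).__name__
    return f"ok (index dtype: {ser.index.dtype})"


# The same six cases as in the examples above.
cases = [
    ("ser1", ser1, "d"),
    ("ser2", ser2, 4),
    ("ser2", ser2, "foo"),
    ("ser3", ser3, (dti[0], dti[0])),
    ("ser3", ser3, (3, 4)),
    ("ser3", ser3, (3, "a")),
]

outcomes = [(name, key, try_setitem(ser, key)) for name, ser, key in cases]
for name, key, outcome in outcomes:
    print(f"{name}.loc[{key!r}] = 10 -> {outcome}")
```

The output is deliberately not listed here, since which cases raise is exactly what differs between versions.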

I see two options:

  1. Use consistent casting rules for the new item being inserted, i.e. allow it iff the dtype can be retained.
  2. Always cast, never raise.

Option 2 seems the more user-friendly, and is necessary for e.g. some of our crosstab tests which insert "All". AFAICT that is why DTI/TDI have a special case that casts for strings and raises for everything else.
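To make option 2 concrete, here is a rough sketch of "always cast, never raise" at the Index.insert level. `insert_always_cast` is a hypothetical name used only for illustration, and the object-dtype fallback is an assumption about what "cast" would mean here, not the actual pandas implementation.

```python
import pandas as pd


def insert_always_cast(index, loc, item):
    """Hypothetical sketch of option 2: never raise on insert.

    Try the index's own insert first, so the dtype is retained whenever
    the item fits. If that raises, fall back to an object-dtype Index,
    which can hold any item.
    """
    try:
        return index.insert(loc, item)
    except (TypeError, ValueError):
        # Assumption: object dtype is the fallback of last resort.
        return index.astype(object).insert(loc, item)


ci = pd.CategoricalIndex(["a", "a", "b", "b", "c", "a"])
dti = pd.date_range("2016-01-01", periods=6)

kept = insert_always_cast(ci, len(ci), "a")     # existing category, dtype retained
widened = insert_always_cast(ci, len(ci), "d")  # "d" is not a category, so the index is cast
mixed = insert_always_cast(dti, len(dti), 4)    # int into a DatetimeIndex, so the index is cast

print(kept.dtype, widened.dtype, mixed.dtype)
```

Option 1 would correspond to dropping the `except` branch and making every Index subclass raise consistently whenever the dtype cannot be retained.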

jbrockmendel added the Bug and Needs Triage labels Nov 11, 2020

simonjayhawkins (Member) commented

xref #25383 and #34011

jbrockmendel added the API Design and Indexing labels and removed the Bug and Needs Triage labels Jun 6, 2021
mroeschke added the Needs Discussion label Aug 14, 2021