ENH: Restore the functionality of .fillna #59831

Open
1 of 3 tasks
tomprimozic opened this issue Sep 18, 2024 · 3 comments
Labels
Dtype Conversions Unexpected or buggy dtype conversions Enhancement Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Needs Discussion Requires discussion from core team before further action Needs Info Clarification about behavior needed to assess issue

Comments

@tomprimozic

tomprimozic commented Sep 18, 2024

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

The currently very useful behaviour of .fillna is being deprecated.

a = pd.Series([True, False, None])
a
# 0     True
# 1    False
# 2     None
# dtype: object

Using a.fillna raises a warning:

a.fillna(True)
# /var/folders/66/bjmfs8315pjdbztwxmyvd7x80000gn/T/ipykernel_85835/3077193745.py:1: FutureWarning: Downcasting object dtype arrays on .fillna, .ffill, .bfill is deprecated and will change in a future version. Call result.infer_objects(copy=False) instead. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`
#   a.fillna(True)

Full message of the warning is:

FutureWarning: Downcasting object dtype arrays on .fillna, .ffill, .bfill is deprecated and will change in a future version. Call result.infer_objects(copy=False) instead. To opt-in to the future behavior, set pd.set_option('future.no_silent_downcasting', True)

The proposed solutions don't work:

a.fillna(True).infer_objects(copy=False)
# /var/folders/66/bjmfs8315pjdbztwxmyvd7x80000gn/T/ipykernel_85835/1224563968.py:1: FutureWarning: Downcasting object dtype arrays on .fillna, .ffill, .bfill is deprecated and will change in a future version. Call result.infer_objects(copy=False) instead. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`
#   a.fillna(True).infer_objects(copy=False)

Maybe I misunderstood the warning message and infer_objects should be called first?

a.infer_objects(copy=False).fillna(True)
# /var/folders/66/bjmfs8315pjdbztwxmyvd7x80000gn/T/ipykernel_85835/2319989247.py:1: FutureWarning: Downcasting object dtype arrays on .fillna, .ffill, .bfill is deprecated and will change in a future version. Call result.infer_objects(copy=False) instead. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`
#   a.infer_objects(copy=False).fillna(True)    # maybe I misunderstood the message?

Let's try to opt-in...

with pd.option_context("future.no_silent_downcasting", True):
  r = a.fillna(True)
r
# 0     True
# 1    False
# 2     True
# dtype: object

No, it's no longer a bool Series...

Some online resources suggest first casting to bool...

a.astype(bool)
# 0     True
# 1    False
# 2    False
# dtype: bool

Looks like this is a potential replacement for .fillna(False) but not for .fillna(True)...
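A minimal sketch of why astype(bool) is not a safe stand-in even for .fillna(False), assuming plain object-dtype Series: the cast goes by element truthiness, so None becomes False, while np.nan (a non-zero float) becomes True. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

with_none = pd.Series([True, False, None])   # object dtype
with_nan = pd.Series([True, False, np.nan])  # also object dtype

# astype(bool) casts each element by truthiness; it does no missing-value handling.
none_cast = with_none.astype(bool)  # None is falsy
nan_cast = with_nan.astype(bool)    # NaN is a non-zero float, hence truthy

print(none_cast.tolist())
print(nan_cast.tolist())
```

So the result depends on which missing-value marker happens to be in the data, which is probably not what anyone wants from a fill operation.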

Wait, there's a downcast parameter for .fillna!

a.fillna(True, downcast=True)
# /var/folders/66/bjmfs8315pjdbztwxmyvd7x80000gn/T/ipykernel_85835/4257582953.py:1: FutureWarning: The 'downcast' keyword in fillna is deprecated and will be removed in a future version. Use res.infer_objects(copy=False) to infer non-object dtype, or pd.to_numeric with the 'downcast' keyword to downcast numeric results.
#   a.fillna(True, downcast=True)
# /var/folders/66/bjmfs8315pjdbztwxmyvd7x80000gn/T/ipykernel_85835/4257582953.py:1: FutureWarning: Downcasting object dtype arrays on .fillna, .ffill, .bfill is deprecated and will change in a future version. Call result.infer_objects(copy=False) instead. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`
#   a.fillna(True, downcast=True)

Oh no, it's deprecated as well, now I got TWO warnings...

Feature Description

Restore the functionality of .fillna WITHOUT the Warning.

a = pd.Series([True, False, None])
a.fillna(True)
# 0     True
# 1    False
# 2     True
# dtype: bool

Alternative Solutions

Currently, the only "correct" option is to use nullable Boolean type.

a.astype('boolean').fillna(True).astype(bool)
# or, if we're happy keeping the type `boolean`...
a.astype('boolean').fillna(True)

This is overly verbose, but would be acceptable if boolean was inferred automatically for [True, False, None] (or [True, False, np.nan]), but currently it's not...

(The additional confusion is that integer nullable types are distinguished by uppercase (int64 -> Int64) but boolean nullable type isn't (bool -> boolean)... so it took me a very long time to even find this solution!)
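A short sketch of the nullable route described above, with the dtype at each step made explicit. It assumes nothing beyond the public pandas API; the variable names are illustrative.

```python
import pandas as pd

a = pd.Series([True, False, None])
print(a.dtype)  # default inference gives object, not a nullable dtype

nullable = a.astype("boolean")  # pandas nullable boolean (BooleanDtype)
filled = nullable.fillna(True)  # stays "boolean"; no downcasting is involved
plain = filled.astype(bool)     # fine here because no missing values remain

# The naming differs between nullable families: a capitalised name for
# integers ("Int64"), a different word for booleans ("boolean").
ints = pd.Series([1, None], dtype="Int64")
print(nullable.dtype, filled.dtype, plain.dtype, ints.dtype)
```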

Additional Context

No response

@tomprimozic tomprimozic added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member labels Sep 18, 2024
@rhshadrach
Member

Thanks for the report! You mention that infer_objects didn't work; I'm not sure what you meant by this. Is it that you still get the warning? You need to use infer_objects together with the future behavior, like so:

ser = pd.Series([True, False, None])
with pd.option_context("future.no_silent_downcasting", True):
    print(ser.fillna(True).infer_objects(copy=False))

@rhshadrach rhshadrach added Needs Discussion Requires discussion from core team before further action Needs Info Clarification about behavior needed to assess issue Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Dtype Conversions Unexpected or buggy dtype conversions and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Sep 18, 2024
@WillAyd
Member

WillAyd commented Sep 24, 2024

This is overly verbose, but would be acceptable if boolean was inferred automatically for [True, False, None] (or [True, False, np.nan]), but currently it's not...

This is another good area where PDEP-13 would probably help a lot #58455

If it helps, you can avoid the object-dtype based "boolean" array without an astype if you use the pd.BooleanDtype dtype argument as part of the constructor:

a = pd.Series([True, False, None], dtype=pd.BooleanDtype())

Yes, it is more verbose, but the NA handling is a lot more sane than that of the default types baked into pandas.
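As a rough illustration of that NA handling (a sketch, not an exhaustive description): with the nullable dtype, logical operators follow three-valued (Kleene) logic, so a missing value only propagates when the result genuinely depends on it, and fillna needs no downcasting at all.

```python
import pandas as pd

s = pd.Series([True, False, None], dtype=pd.BooleanDtype())

either = s | True        # NA | True is True: the result does not depend on NA
both = s & True          # NA & True stays NA: the result does depend on NA
filled = s.fillna(True)  # dtype remains "boolean"

print(either.tolist())
print(both.isna().tolist())
print(filled.tolist(), filled.dtype)
```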

You can find a larger discussion of your issue in #57734

@joaoe

joaoe commented Nov 26, 2024

Hi. I just hit this "bug" or feature. It's annoying.

Here's my fix

def bool_fillna_inplace(series: pd.Series) -> pd.Series:
    """
    Workaround for https://github.com/pandas-dev/pandas/issues/59831
    Replaces nan/None with False."""
    series[pd.isna(series)] = False
    return series

An alternative is return series & ~pd.isna(series), but that performs a bitwise AND, which would return False for even numbers if the series is not purely boolean with NaNs.
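A variation on the workaround above, sketched under the assumption that the Series contains only booleans and missing values. The helper name fill_bool_na is hypothetical; it takes the fill value as a parameter, works on a copy instead of mutating its argument, and casts to bool at the end.

```python
import pandas as pd


def fill_bool_na(series: pd.Series, value: bool) -> pd.Series:
    """Hypothetical helper: return a bool-dtype copy with missing entries set to `value`."""
    out = series.copy()       # leave the caller's Series untouched
    out[out.isna()] = value   # boolean-mask assignment; fillna is not involved
    return out.astype(bool)   # assumes only True/False remain at this point


a = pd.Series([True, False, None])
result = fill_bool_na(a, True)
print(result.tolist(), result.dtype)
```

Because it never calls fillna, this should sidestep the downcasting warning, at the cost of an extra copy.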


4 participants