- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 19.2k
          CLN: enforce deprecation of interpolate with object dtype
          #57820
        
          New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
  
    CLN: enforce deprecation of interpolate with object dtype
  
  #57820
              Conversation
| I enforced deprecation of  | 
        
          
                pandas/core/generic.py
              
                Outdated
          
        
      | if np.any(obj.dtypes == object): | ||
| # GH#53631 | ||
| if not (obj.ndim == 2 and np.all(obj.dtypes == object)): | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you combine these conditionals? np.any and np.all overlap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for the comment. I combined two if statement into one.
        
          
                pandas/core/generic.py
              
                Outdated
          
        
      | ) | ||
| # GH#53631 | ||
| if np.any(obj.dtypes == object) and ( | ||
| obj.ndim == 1 or not np.all(obj.dtypes == object) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like part of this is already handled below (if obj.ndim == 2 and np.all(obj.dtypes == object)).
In instead nearby that check could you add a if obj.ndim == 1 and obj.dtype == object:?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. To avoid duplicate checks I changed blocks of if statements as you suggested.
        
          
                pandas/core/generic.py
              
                Outdated
          
        
      | if np.all(obj.dtypes == object): | ||
| raise TypeError( | ||
| "Cannot interpolate with all object-dtype columns " | ||
| "in the DataFrame. Try setting at least one " | ||
| "column to a numeric dtype." | ||
| ) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can now simplify things and remove this check. The reason: we raise
TypeError(
    f"{type(self).__name__} cannot interpolate with object dtype."
)
in case np.any(obj.dtypes == object) below and there is no need to raise Type Error if np.all(obj.dtypes == object). Should I remove raising
if np.all(obj.dtypes == object):
    raise TypeError(
        "Cannot interpolate with all object-dtype columns "
        "in the DataFrame. Try setting at least one "
        "column to a numeric dtype."
    )
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! Yeah try that out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the messaging will need to probably change, this could be a Series
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, I don't understand why is it a Series? I thought it's a DataFrame, because we are in the block
if obj.ndim == 2:
I suggest to use the message  f"{type(self).__name__} cannot interpolate with object dtype.", the same one we use in the block below
elif np.any(obj.dtypes == object):
    raise TypeError(
        f"{type(self).__name__} cannot interpolate with object dtype."
    )
I think we can just remove the block with the old message:
if obj.ndim == 2:
    if np.all(obj.dtypes == object):
...
and change the old message in tests to the new one "DataFrame cannot interpolate with object dtype"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry I had misunderstood your initial suggestion.
You can also go a step further and replace All of these checks with
if np.any(obj.dtypes == object):
    raise TypeError(f"{type(self).__name__} cannot interpolate with object dtype.")Since Series.dtypes and DataFrame.dtypes can be checked at the same time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, my wording wasn't clear.
My idea was to remove the second part of the error message:
"Cannot interpolate with all object-dtype columns in the DataFrame. Try setting at least one column to a numeric dtype."
"Try setting at least one column to a numeric dtype" won't work from now on because of the enforcing in this PR. Then I thought why not to replace the first part: "Cannot interpolate with all object-dtype columns in the DataFrame." with a new msg: "... cannot interpolate with object dtype."
If we do this we can remove one check block.
I made changes. Could you please take a look at it?
| Thanks @natmokval | 
| FutureWarning, | ||
| stacklevel=find_stack_level(), | ||
| ) | ||
| raise TypeError( | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest doing this check inside Block.interpolate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a TODO(3.0) comment related to this inside Block.interpolate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you, I found the comment. In #58083 I moved raising TypeError from NDFrame to Block.
I think maybe it's better to change the error message from  "NumpyBlock cannot interpolate with object dtype." to "Can not interpolate with object dtype."?
…ev#57820) * remove interpolate with object dtype * enforce deprecation interpolate with object dtype, correct tests * fix ruff error * add a note to v3.0.0 * combine two conditions * change blocks of if statements to avoid duplicate checks * replace err msg containing 'Try setting at least one column to a numeric dtype' * simplify if condition
xref #53638
enforced deprecation of
interpolatewith object dtype