Series doesn't implement floor/ceil ops (+EA support needed) #26892
Comments
I'd prefer users just |
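Reasonable, but another case to consider for EA.
s = pd.Series(DecimalArray(to_decimal([1, 2, 3])))
s.apply(np.floor)
pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   3705         with np.errstate(all='ignore'):
   3706             if isinstance(f, np.ufunc):
-> 3707                 return f(self)
   3708
   3709         # row-wise access
AttributeError: 'decimal.Decimal' object has no attribute 'floor'
I think this calls __array__ on the array, which converts it to object, and then the np.floor ufunc sees an object and tries to call floor on it, which fails. And yet math.floor(decimal) returns an int (which might also be cast to decimal by the EA).
So I think right now, s.apply(np.floor) doesn't work for EA, and can't without changes to pandas. |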
What specifically do you mean by changes to pandas?
DecimalArray is a test-only extension array. It’s not part of the public API. So changes to it aren’t necessary for this issue.
More generally, a user could apply a floor function appropriate for their data.
On Jun 17, 2019, at 19:34, pilkibun wrote:
Reasonable, but another case to consider for EA.
s=pd.Series(DecimalArray(to_decimal([1,2,3])))
s.apply(np.floor)
pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
3705 with np.errstate(all='ignore'):
3706 if isinstance(f, np.ufunc):
-> 3707 return f(self)
3708
3709 # row-wise access
AttributeError: 'decimal.Decimal' object has no attribute 'floor'
I think this calls __array__ on the array, which converts it to object,
and then the np.floor ufunc sees an object and tries to call floor on it,
which fails. And yet math.floor(decimal) returns an int (which might
also be cast to decimal by the EA).
So I think right now, s.apply(np.floor) doesn't work for EA,
and can't without changes to pandas.
|
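Well, I was using Decimal as a concrete example. EAs of its "kind" aren't handled gracefully in this situation, in that s.apply(np.floor) works for numpy dtypes but not EA.
Yes, but in the case of something like pint, that means accessing an internal array two attribute levels deep, casting back to the proper EA, and wrapping in a Series ctor, or something about as cumbersome. I thought the purpose of EA was to have them as seamless as "native" pandas dtypes. |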
Ok, let’s try rephrasing. Assuming we don’t add a Series.floor, what exactly needs to change in pandas to support your use case?
On Jun 17, 2019, at 20:33, pilkibun wrote:
Well, I was using Decimal as a concrete example. EAs of its "kind" aren't handled gracefully in this situation,
in that s.apply(np.floor) works for numpy dtypes but not EA.
More generally, a user could apply a floor function appropriate for their data.
Yes, but in the case of something like pint, that means accessing an internal array
two attribute levels deep, casting back to the proper EA, and wrapping in a Series ctor,
or something about as cumbersome. I thought the purpose of EA was to have them as
seamless as "native" pandas dtypes.
|
One important point: pandas makes no assumptions about how the EA is stored, so statements like “works for numpy dtypes but not EA” probably aren’t correct. A specific EA may or may not work with NumPy’s API.
|
Whether or not it chooses to implement a numpy API is a separate issue. Whether pandas actually invokes that implementation when the series method is called is another. And I'm talking about In particular, all series methods/ops which work by calling I think |
Does apply call asarray? |
|
#23293 would seem to solve that. I hope to finish that up for the 0.25 release.
On Tue, Jun 18, 2019 at 7:31 PM pilkibun wrote:
Series.apply(ufunc) calls ufunc(self), which invokes Series.__array__ to
get something it can work on, which calls np.asarray(self.array, dtype).
The result is an object ndarray.
|
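The coercion path described in the quoted comment can be illustrated outside pandas. This is a rough sketch, not pandas code: ObjectBacked is a hypothetical container that defines only __array__, so a ufunc has to go through it to obtain an ndarray, and what comes back is object dtype.

```python
from decimal import Decimal

import numpy as np


class ObjectBacked:
    """Hypothetical container, standing in for a Series backed by an EA."""

    def __init__(self, values):
        self._values = list(values)
        self.array_calls = 0

    def __array__(self, dtype=None, copy=None):
        # NumPy calls this when it needs an ndarray.
        self.array_calls += 1
        return np.array(self._values, dtype=object if dtype is None else dtype)


box = ObjectBacked([Decimal("1.5"), Decimal("-2.5")])

# No __array_ufunc__ is defined, so the ufunc coerces the input via __array__
# and then runs its object loop (operator negation for np.negative).
negated = np.negative(box)
print(negated.dtype, box.array_calls)
```

np.negative happens to work here because its object loop uses Python's unary minus, which Decimal supports. The result is still a plain object ndarray, so the container type is lost along the way.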
As a short demo, consider the following incomplete
diff --git a/pandas/core/series.py b/pandas/core/series.py
index eaef1f525..2bc1dfe1a 100644
--- a/pandas/core/series.py
+++ b/pandas/core/series.py
@@ -701,6 +701,11 @@ class Series(base.IndexOpsMixin, generic.NDFrame):
     # ----------------------------------------------------------------------
     # NDArray Compat
 
+    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
+        arrays = tuple(x.array for x in inputs)
+        result = getattr(ufunc, method)(*arrays)
+        return self._constructor(result, index=self.index).__finalize__(self)
+
     def __array__(self, dtype=None):
         """
         Return the values as a NumPy array.
The basic idea is to extract the arrays from the series, apply the ufunc, and box the result in a Series. For Series[Sparse], we dispatch to SparseArray which implements
In [1]: import pandas as pd; import numpy as np
In [2]: s = pd.Series(pd.SparseArray([-1, 0, 1]))
In [3]: np.sin(s)
Out[3]:
0 -0.841471
1 0.000000
2 0.841471
dtype: Sparse[float64, 0.0]
In [4]: s.apply(np.sin)
Out[4]:
0 -0.841471
1 0.000000
2 0.841471
dtype: Sparse[float64, 0.0]
On master, that's dense. |
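The unwrap, apply, re-box idea can also be tried in isolation. The following is a sketch under stated assumptions rather than the pandas implementation: Labeled is a hypothetical stand-in for a Series (values plus an index), and unlike the diff above it passes non-Labeled inputs through unchanged and forwards keyword arguments.

```python
import numpy as np


class Labeled:
    """Hypothetical stand-in for a Series: values plus an index."""

    def __init__(self, values, index):
        self.values = np.asarray(values)
        self.index = list(index)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Unwrap Labeled inputs; leave scalars and plain arrays untouched.
        arrays = tuple(x.values if isinstance(x, Labeled) else x for x in inputs)
        # Apply the ufunc (or one of its methods) to the unwrapped data.
        result = getattr(ufunc, method)(*arrays, **kwargs)
        # Box the result again, reusing this object's index.
        return Labeled(result, self.index)


s = Labeled([-1.0, 0.0, 1.0], index=["a", "b", "c"])
out = np.sin(s)
shifted = np.add(s, 1)
print(type(out).__name__, out.values, shifted.values)
```

The sketch does not handle an out= argument that is itself a Labeled, index alignment between two Labeled inputs, or ufunc methods such as reduce whose result no longer matches the index.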
Yes, that's fine in that if the EA implements its own I've opened #26935 suggesting something better needs to take shape. |
Please don't dismiss the work we've put into it. |