Series doesn't implement floor/ceil ops (+EA support needed) #26892

ghost · 2019-06-16T21:19:40Z

In [19]: import pandas as pd
    ...: s=pd.Series([1,2])
    ...: s.floor() 
AttributeError: 'Series' object has no attribute 'floor'

TomAugspurger · 2019-06-17T20:25:13Z

I'd prefer users just .apply(np.floor). I don't think these are broadly useful enough to warrant a method, especially as they don't work for all dtypes.
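For reference, a minimal sketch of the suggested approach on a plain float Series. The sample values are made up for illustration, and the caveat above still applies: this is only expected to work for dtypes NumPy understands.

```python
import numpy as np
import pandas as pd

# Illustrative data; any NumPy-backed numeric Series behaves the same way.
s = pd.Series([1.2, 2.7])

# Element-wise floor via apply, as suggested in the comment above.
floored = s.apply(np.floor)

# Calling the ufunc directly on the Series also returns a Series.
ceiled = np.ceil(s)
```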

ghost · 2019-06-18T00:34:41Z

Reasonable, but another case to consider for EA.

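import numpy as np; import pandas as pd
from pandas.tests.extension.decimal import DecimalArray, to_decimal
s = pd.Series(DecimalArray(to_decimal([1, 2, 3])))
s.apply(np.floor)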

pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   3705         with np.errstate(all='ignore'):
   3706             if isinstance(f, np.ufunc):
-> 3707                 return f(self)
   3708 
   3709             # row-wise access

AttributeError: 'decimal.Decimal' object has no attribute 'floor'

I think this calls __array__ on the array, which converts it to object dtype;
the np.floor ufunc then sees object elements and tries to call floor on each one,
which fails. Yet math.floor(decimal) returns an int (which the EA might
also cast back to decimal).

So I think right now, s.apply(np.floor) doesn't work for EA,
and can't without changes to pandas.
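A rough sketch of the mechanism described above, using only NumPy and the standard library rather than the test-only DecimalArray. It relies on np.cos instead of np.floor, since how np.floor treats object arrays may differ between NumPy versions; the exception type raised is also version-dependent, so both likely candidates are caught.

```python
import math
from decimal import Decimal

import numpy as np

values = [Decimal("1.5"), Decimal("2.5")]

# Converting Decimals to an ndarray yields object dtype: the elements are
# opaque Python objects as far as NumPy is concerned.
arr = np.asarray(values)

# The Python-level floor works on each Decimal and returns an int.
py_floor = [math.floor(v) for v in values]

# A ufunc on an object array looks for a same-named method on each element.
# Decimal has no cos() method, so the call fails.
try:
    np.cos(arr)
    ufunc_failed = False
except (AttributeError, TypeError):
    ufunc_failed = True
```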

TomAugspurger · 2019-06-18T01:25:39Z

What specifically do you mean by changes to pandas? DecimalArray is a test-only extension array. It’s not part of the public API. So changes to it aren’t necessary for this issue. More generally, a user could apply a floor function appropriate for their data.
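As an illustration of "a floor function appropriate for their data", here is a sketch that floors Decimal values while keeping them as Decimal. It assumes a plain object-dtype Series of Decimals rather than DecimalArray, and decimal_floor is a hypothetical helper name, not a pandas API.

```python
from decimal import Decimal, ROUND_FLOOR

import pandas as pd


def decimal_floor(x):
    # Hypothetical helper: round toward negative infinity, staying in Decimal.
    return x.to_integral_value(rounding=ROUND_FLOOR)


# Object-dtype Series holding Decimal values (illustrative data).
s = pd.Series([Decimal("1.5"), Decimal("-2.5"), Decimal("3")])

# A plain Python function is applied element by element.
floored = s.apply(decimal_floor)
```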


ghost · 2019-06-18T01:33:29Z

Well, I was using Decimal as a concrete example. EAs of its "kind" aren't handled gracefully in this situation,
in that s.apply(np.floor) works for numpy dtypes but not for EAs.

> More generally, a user could apply a floor function appropriate for their data.

Yes, but in the case of something like pint, that means accessing an internal array
two attribute levels deep, casting back to the proper EA, and wrapping it in a Series constructor,
or something about as cumbersome. I thought the purpose of EAs was to make them as
seamless as "native" pandas dtypes.

TomAugspurger · 2019-06-18T01:43:46Z

Ok, let’s try rephrasing. Assuming we don’t add a Series.floor, what exactly needs to change in pandas to support your use case?


TomAugspurger · 2019-06-18T01:46:19Z

One important point: pandas makes no assumptions about how the EA is stored, so things like “works for numpy dtypes but not EA” probably isn’t correct. A specific EA may or may not work with NumPy’s API.


ghost · 2019-06-18T02:07:20Z

> A specific EA may or may not work with NumPy’s API.

Whether or not it chooses to implement a NumPy API is one issue. Whether pandas actually invokes that implementation when the Series method is called is another, and I'm talking about
the latter.

In particular, all Series methods/ops which work by calling np.asarray(self) or relying on the default ~~EA.__array__ (which calls np.asarray(self.array))~~ Series.__array__ fail this test. An EA may override __array__, but NumPy enforces an ndarray type on the result, so the dtype information is necessarily lost.

I think apply fails this test. And not only does np.floor(s) fail for an EA, but so do np.cos(s) and s.apply(np.cos). The other trig functions behave the same way.
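The dtype loss through np.asarray can be seen with any extension dtype. A small sketch, using the built-in categorical dtype as a stand-in for a third-party EA (that substitution is an assumption made only for illustration):

```python
import numpy as np
import pandas as pd

# A Series backed by an extension dtype (categorical, for illustration).
s = pd.Series(["a", "b", "a"], dtype="category")

# np.asarray goes through __array__, which must hand back a plain ndarray.
arr = np.asarray(s)

# The values survive, but the extension dtype does not.
print(s.dtype)    # the Series still reports its categorical dtype
print(type(arr))  # a plain numpy.ndarray with a NumPy dtype
```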

TomAugspurger · 2019-06-18T02:22:40Z

Does apply call asarray?

ghost · 2019-06-19T00:31:29Z

Series.apply(ufunc) calls ufunc(self), which invokes Series.__array__ to get something it can work on, which in turn calls np.asarray(self.array, dtype). The result is an object ndarray.

TomAugspurger · 2019-06-19T02:26:33Z

#23293 would seem to solve that. I hope to finish that up for the 0.25 release.


TomAugspurger · 2019-06-19T02:36:41Z

As a short demo, consider the following incomplete Series.__array_ufunc__.

diff --git a/pandas/core/series.py b/pandas/core/series.py
index eaef1f525..2bc1dfe1a 100644
--- a/pandas/core/series.py
+++ b/pandas/core/series.py
@@ -701,6 +701,11 @@ class Series(base.IndexOpsMixin, generic.NDFrame):
     # ----------------------------------------------------------------------
     # NDArray Compat
 
+    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
+        arrays = tuple(x.array for x in inputs)
+        result = getattr(ufunc, method)(*arrays)
+        return self._constructor(result, index=self.index).__finalize__(self)
+
     def __array__(self, dtype=None):
         """
         Return the values as a NumPy array.

The basic idea is to extract the arrays from the series, apply the ufunc, and box the result in a Series. For Series[Sparse], we dispatch to SparseArray, which implements __array_ufunc__:

In [1]: import pandas as pd; import numpy as np

In [2]: s = pd.Series(pd.SparseArray([-1, 0, 1]))

In [3]: np.sin(s)
Out[3]:
0   -0.841471
1    0.000000
2    0.841471
dtype: Sparse[float64, 0.0]

In [4]: s.apply(np.sin)
Out[4]:
0   -0.841471
1    0.000000
2    0.841471
dtype: Sparse[float64, 0.0]

On master, that's dense.
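The extract/apply/box idea from the diff can be tried outside pandas with a toy container. The sketch below is not the pandas implementation: LabeledValues is a hypothetical stand-in for Series, and it only handles direct ufunc calls (not reduce, accumulate, or the out= keyword).

```python
import numpy as np


class LabeledValues:
    """Hypothetical stand-in for a Series: an array plus labels."""

    def __init__(self, values, labels):
        self.values = np.asarray(values)
        self.labels = list(labels)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Only plain calls such as np.sin(x) are supported in this sketch.
        if method != "__call__":
            return NotImplemented
        # 1. Extract the underlying arrays from any LabeledValues inputs.
        arrays = tuple(
            x.values if isinstance(x, LabeledValues) else x for x in inputs
        )
        # 2. Apply the ufunc to the unwrapped arrays.
        result = getattr(ufunc, method)(*arrays, **kwargs)
        # 3. Box the result again, carrying the labels along.
        return LabeledValues(result, self.labels)


lv = LabeledValues([-1.0, 0.0, 1.0], ["a", "b", "c"])
out = np.sin(lv)          # dispatches to LabeledValues.__array_ufunc__
shifted = np.add(lv, 1)   # mixed inputs: one container, one scalar
```

Because NumPy hands control to __array_ufunc__ before converting anything to an ndarray, the container decides what type comes back, which is what lets an extension array keep its own dtype.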

ghost · 2019-06-19T02:45:20Z

Yes, that's fine in that if the EA implements its own __array_ufunc__ it can take over. But I'm very unhappy with the patchwork Frankenstein way in which different classes of methods rely on different mechanisms for EA authors to implement them, while others (cumprod, for example) simply don't allow it. See how horrible it looks when documented: #26918.

I've opened #26935 suggesting something better needs to take shape.

TomAugspurger · 2019-06-19T02:47:31Z

> See how horrible it looks when documented

Please don't dismiss the work we've put into it.

ghost changed the title ~~Numeric Series support common floor/ceil operations~~ Series doesn't implement floor/ceil ops Jun 17, 2019

TomAugspurger added the API Design label Jun 17, 2019

ghost mentioned this issue Jun 18, 2019

Tracking issue for EA Series Operations Support #26913

Closed

ghost changed the title ~~Series doesn't implement floor/ceil ops~~ Series doesn't implement floor/ceil ops (+EA support needed) Jun 18, 2019

ghost mentioned this issue Jun 18, 2019

WIP [DOC/EA]: developer docs for implementing Series.round/sum/etc in EA #26918

Closed

ghost closed this as completed Jul 8, 2019
