ENH/PERF: enable column-wise reductions for EA-backed columns #32867
Conversation
Come to think of it, the place where this dispatch belongs may be in the relevant nanops functions
Personally, I prefer the nanops to be about ops on numpy arrays, and not deal with extension arrays
pandas/core/frame.py (Outdated)

```
@@ -7898,6 +7915,19 @@ def _get_data(axis_matters):
                raise NotImplementedError(msg)
            return data

        def blk_func(values):
            if isinstance(values, ExtensionArray):
```
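The hunk is cut off above. As a rough sketch of the dispatch it introduces (the reduction name and skipna arguments are assumed here to be passed in from the surrounding machinery, which is not shown):

```python
import numpy as np
from pandas.api.extensions import ExtensionArray

def blk_func(values, name="sum", skipna=True):
    # Sketch only: EA-backed block values are reduced through the array's own
    # _reduce; plain numpy values would go through the usual nanops path
    # (simplified here to a bare numpy call).
    if isinstance(values, ExtensionArray):
        return values._reduce(name, skipna=skipna)
    return np.nansum(values) if skipna else values.sum()
```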
with this inside blk_func, shouldn't the block-wise operation have the same performance bump as the column-wise?
> with this inside blk_func, shouldn't the block-wise operation have the same performance bump as the column-wise?
It didn't actually change anything performance-wise (it's the same function being called as before).
The reason that both paths have different performance is that re-assembling the results into a Series is more expensive for the block-wise path than for the column-wise one.
(It's possible that the block-wise way could be optimized to get rid of this difference, though. The main thing is that the block results are not in the original column order.)
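To make that ordering issue concrete, a small illustration (the `_mgr` and block attributes are pandas internals and may change between versions; they are only used here to peek at the block layout):

```python
import pandas as pd

# Columns are grouped into blocks by dtype, so block-wise results come back
# grouped (here: the float columns a and c together, then the int column b)
# rather than in the original column order a, b, c.
df = pd.DataFrame({"a": [1.0, 2.0], "b": [1, 2], "c": [3.0, 4.0]})
for blk in df._mgr.blocks:
    print(blk.dtype, list(df.columns[blk.mgr_locs.as_array]))
```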
> (It's possible that the block-wise way could be optimized to get rid of this difference, though. The main thing is that the block results are not in the original column order.)
This would be really nice.
So the bigger question is: how do we get to use this by default (at least for EAs)? Some ideas:
I've got a branch that identifies the cases where frame_apply is used and does that before the .values call, trying to 1) avoid a .values call and 2) de-nest _reduce. I'll move that branch up the priority list.
That depends on how public those functions are. ATM their docstrings have examples with Series; not sure if we have other docs or tests with those.
Note that my suggestion was to eliminate this part entirely (by using block-wise for all). So let's ensure we don't do duplicate / conflicting work. Does the branch already have something? (can you maybe push it to your fork?)
nanops are not public (regardless of their docstrings). What I mainly meant is that (in my head) they are meant to work on numpy arrays (whether extracted from a Series first or not).
https://github.com/jbrockmendel/pandas/tree/cln-reduce
This would require making
Thanks!
Yes, but as long as we keep the
Hmm, block-wise and column-wise with ignore_failures won't necessarily be equivalent for object dtype.
Ah, that's a good point. So for ObjectBlock, we would still need to do it column-wise if we want to get rid of the fallback in general. Will take a further look one of the coming days to see if that looks feasible.
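A toy illustration of the difference (hypothetical data, not from this PR): with object dtype, a column-wise loop can drop just the failing column, whereas a consolidated object block succeeds or fails as a whole.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "a": [1, 2, 3],                       # object ints, sum works
        "b": [{"x": 1}, {"x": 2}, {"x": 3}],  # dicts, sum raises TypeError
    },
    dtype=object,
)

# Column-wise with failures ignored: only 'b' is dropped.
results = {}
for col in df.columns:
    try:
        results[col] = df[col].sum()
    except TypeError:
        pass
print(pd.Series(results))  # a    6

# Block-wise, 'a' and 'b' live in the same object block, so ignoring a
# failure there would drop the whole block, including 'a'.
```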
Co-authored-by: Simon Hawkins <simonjayhawkins@gmail.com>
Do you have something more concrete as feedback? I don't find this particularly complex. It adds some more code, for sure, to ensure we perform the reductions properly column-wise with the correct operation (which fixes bugs). Note that there was still code that could be removed (so I was actually replacing something, not only adding). I removed that now to make this clearer (there was a comment about it).
A suggestion for the short term (i.e. to address #35112) is to change the inline-defined … in a dedicated PR. I think that would address a subset of what this PR is doing and is orthogonal to the rest of it.
this needs to wait for 1.2
moving to 1.2
I am fine with that, if we then include #35254 instead
@jorisvandenbossche I think #36076 has a bearing on this, thoughts?
But in current behavior, a similar thing happens when the dtype is object:

```python
>>> ddf = pd.DataFrame([[1, 2, 3]], columns=['a', 'b', 'c'], dtype=object)
>>> ddf
   a  b  c
0  1  2  3
>>> ddf.sum()
a    1.0
b    2.0
c    3.0
dtype: float64
>>> ddf.dtypes
a    object
b    object
c    object
dtype: object
>>> ddf['a'].sum()
1
>>> type(_)
<class 'int'>
```
@Dr-Irv ah, thanks, that's something we should then also test. And another thing to look into ;)
This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.
@jorisvandenbossche I think this is closable, as it really isn't actionable until we get the #36076-and-similar inconsistencies fixed, and at that point we'll be able to go all block-wise (which for ArrayManager will be column-wise anyway).
looks ok, but not for 1.2
@jorisvandenbossche does this still have a perf impact, given that we don't go through .values anymore for axis=0? Or is the perf impact all in re-assembling block-wise results? We now always use EA._reduce, I think, so that part of the motivation should no longer be relevant. As @jreback referred to in his previous comment, we've gone to a lot of trouble to simplify DataFrame._reduce, and I'm wary of re-complexifying it.
@jorisvandenbossche pls update or close
is this still relevant?
@jorisvandenbossche closing as stale. reopen when ready.
Currently, for reductions on a DataFrame, we convert the full DataFrame to a single "interleaved" array and then perform the operation. That's the default, but when `numeric_only=True` is specified, it is done block-wise.
Enabling column-wise reductions (or block-wise for EAs):

For illustration purposes, I added a `column_wise` keyword in this PR (not meant to keep this, just for testing), so we can compare a few cases.

So I experimented with two approaches:

1. Doing the reduction column-wise, calling `_reduce` of the underlying EA (this path gets taken by using the temporary keyword `column_wise=True`)
2. Fixing the existing block-wise path (the one taken with `numeric_only=True`) by changing that to also use `_reduce` of the EA (currently this was failing by calling nanops functions on the EA)

The first gives better performance (it is simpler in implementation by not involving the blocks), but requires some more new code (it makes less use of the existing machinery).
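As a rough sketch of what the first approach boils down to at the frame level (the helper below is illustrative only, not the PR's actual code):

```python
import pandas as pd
from pandas.api.extensions import ExtensionDtype

def reduce_columnwise(df: pd.DataFrame, name: str = "sum", skipna: bool = True) -> pd.Series:
    # Reduce each column separately: EA-backed columns use their own _reduce,
    # numpy-backed columns go through the regular Series reduction.
    results = {}
    for col in df.columns:
        if isinstance(df[col].dtype, ExtensionDtype):
            results[col] = df[col].array._reduce(name, skipna=skipna)
        else:
            results[col] = getattr(df[col], name)(skipna=skipna)
    return pd.Series(results)

print(reduce_columnwise(pd.DataFrame({"a": pd.array([1, 2, None], dtype="Int64"),
                                      "b": [1.5, 2.5, 3.5]})))
```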
Ideally, for EA columns, we should always use their own reduction implementation (thus call `EA._reduce`), I think. So for both approaches, the question will be how to trigger this behaviour.

Closes #32651, closes #34520