Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EA: support basic 2D operations #27142
EA: support basic 2D operations #27142
Changes from 29 commits
d673008
b2a837b
c2fd1b1
bed5563
29d9dc2
f3ce13c
4fd24c1
c0505ee
c81daeb
4d77dbe
d43ef30
91c979b
bc220c3
421b5a3
7c6df89
203504c
bc12f01
2f18dae
eb0645d
2540933
5b60fb5
96f3ae2
92a0a56
81191bf
91639dd
34cc9e9
444f9f7
7c15b74
768d75d
3b7b2b2
177bfb0
f5cba22
174a1da
1d78fbe
41b49d9
fc331b8
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this change in 0.25? (IIRC yes?), if not can you break it out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we only recently deprecated the old behavior. will double-check and see whats appropriate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the reasoning behind making this a decorator as opposed to just having base class method? it seems much simpler and more obvious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main pushback on this idea was the onus put on EA authors who only support 1D. ATM this decorator mainly just renames
__len__
tosize
so that it can re-defineshape
in a 2D-compatible way. So asking authors to use the decorator instead of doing that re-definition themselves is kind of a wash.But the step after this is to patch
__getitem__
,__setitem__
,take
,__iter__
, all of which are easier to do with the decorator than by asking authors to do it themselves.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the plan still to have an "opt me out of this, I natively support 2D arrays"? If we go down the route of putting every Block.values inside an EA (with PandasArray for the current NumPy-backed Blocks), then we'll want that, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see you're applying this to subclasses, rather than ExtensionArray itself. Motivation for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
less magic than a metaclass (and my first attempt at using a metaclass failed)
Defined
EA._allows_2d = False
. Authors set that to True if they implement this themselves. This decorator should be updated to check that and be a no-op in that case.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right, because simply decorating
would work for subclasses right? It has to be a metaclass. I'll poke around at the meta class approach to see what's going on. May be too magical.
But if we do go with a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this at a glance. Who does it have to be in
[0, -1]
? (I get that we're only allowingN x 1
and1 X N
arrays, but I don't get whykey[0]
is the one being checked).Just from looking at it, I would expect this to break this
__getitem__
on something likeThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I wrote this with only (1, N) in mind
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does just
return orig_iter(self)
work?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC from one of the proofs-of-concept this actually mattered, but I dont remember which direction it mattered in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this case be written as something like
? I worry a bit about the cost of calling
__getitem__
on each iteration.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely dont want to call orig_iter here since we want to yield ndim-1 arrays
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we are still doing this approach, but if we are
I would break out each of the implementations into a well named function that takes cls. then
impelemented_2d
is pretty straightforward to read w/o having to understand the details, you can immediately see what is being changed and the details exist in the functions.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
ATM the relevant discussion about how to proceed is in Tom's PR here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this come up in practice? I'm having a hard time judging whether NotImplementedError or False is the right behavior in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think NotImplementedError is more accurate, but for the current implementation this should never be reached
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't suppose NumPy has a function we can borrow here? I'm not aware of one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
numpy/numpy#13768 is the closest thing I found, but I agree it seems like the kind of thing that should exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a somewhat dumb approach:
Tested as:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can / should we make any requirements on this being no-copy?
i.e.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
definitely needs to be no-copy; i'll add that to the docstring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a test for the no-copy bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to document that only
(1, N)
and(N, 1)
is allowed here.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good thinking. ATM I wrote most of this with only (1, N) in mind