-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-1487: [C++] Implement casts from List<A> to List<B>, where a cast function is defined from any A to B #1383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a couple issues with this:
a) We should not perform any additional copying or allocation beyond what is required to perform the cast on the child, i.e. we should simply reuse the buffers from the input array without copying
b) It would be better to resolve the cast kernel for the child data when the kernel is resolved in GetCastFunction. Instead of defining a functor for list, instead create an alternative implementation of UnaryKernel that has the child cast kernel as a member variable. This is what I meant in https://issues.apache.org/jira/browse/ARROW-1487 by "If a cast rule for the child type is not implemented, the failure should occur during kernel selection rather than at kernel evaluation time."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In here, per below, we will want to verify that the casted result retains zero-copy references to the buffers from the parent array.
There is another nuance to be aware of, which is that the offset member may be non-zero, so in that case simply reusing the buffers is not the appropriate action
|
@wesm Thanks for your review! Now revised the implementation (but the test codes is not yet). Is this the right direction? |
96871a2 to
f31fcde
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sort of thinking of having a ListCastKernel : public UnaryKernel implementation. Would you mind if I took a pass at working on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wesm IMO, we don't have to create the special case like ListCastKernel. I think simply checking child Field is sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can take a look at what I just did -- for the moment casting sliced ListArray raises an exception, we'll need to do some more work to make that work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can generalize to support struct casts and have instead a NestedCastKernel when the time comes. I think we should defer that a bit until we've dealt with the sliced list case. The problem with the current CastKernel is that it's tied up in the business of memory allocation for primitive arrays. We'll keep needing to do some refactoring around this as we implement more kernels which are capable of writing into pre-allocated memory
Change-Id: I6202c1de8663c0bec9408e158eb19aa29636426f
Change-Id: I7ed4b2d7ff7ab800ec2dc8d5df794dc2c6b1bdd7
Change-Id: I7167f22768b935202aabc57a9635b0ede13e2f03
295c9f0 to
53a21a5
Compare
|
macOS build stalled out due to ARROW-1880 |
|
Thnaks @wesm! I understand why normal |
This closes ARROW-1487.