-
Notifications
You must be signed in to change notification settings - Fork 5.4k
Add ARRAYS_OVERLAP_COUNT implementation, documentation and tests #24926
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
@SqlType("array(E)") Block leftArray, | ||
@SqlType("array(E)") Block rightArray) | ||
{ | ||
return ArrayIntersectFunction.intersect(type, elementIsDistinctFrom, leftArray, rightArray).getPositionCount(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it will be good to avoid calling intersect as it will create a new block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! (docs)
Pull branch, local doc build, looks good. Thanks!
You could include a release note for this, and link to the documentation for the new function as shown in the following example:
|
@steveburnett looks like another intermittent test failure, could you re-run this one as well please? thanks! |
44f79f3
to
c9b1439
Compare
...-main-base/src/main/java/com/facebook/presto/operator/scalar/ArraysOverlapCountFunction.java
Show resolved
Hide resolved
} | ||
|
||
@Test | ||
public void testIndeterminateRows() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all other tests besides this one pass, would it be okay if we landed this function with the caveat that indeterminate rows aren't supported? I'm pretty sure they'll be a very small subset of inputs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have been trying to standardize on IS DISTINCT FROM like semantics for NULLs. @rschlussel - can you confirm?
c9b1439
to
9711d4c
Compare
...-main-base/src/main/java/com/facebook/presto/operator/scalar/ArraysOverlapCountFunction.java
Show resolved
Hide resolved
50e23aa
to
2a5830c
Compare
2a5830c
to
93e0c37
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you revert the previous change? was there an issue?
yeah reverted it - it made the duplicates test fail, reverting it fixed it (besides the test rows and arrays with null with them, which are still failing) |
Hmm - that's odd! That logic is supposed to be sound! |
What needs to be addressed in the PR to move it towards merging? Thanks! |
I think we're waiting for @rschlussel to confirm whether I can go ahead with the current implementation |
Description
Requested in #24911
I went with the simplest and least error-prone approach and simply make the function invoke the intersect function and return the amount of items in the returned array.
Motivation and Context
See #24911
Impact
See #24911
Test Plan
Wrote a lot of tests, the vast majority of which I adapted from TestArrayIntersectFunction
Should I also benchmark it? Don't know how exactly that's done here, would appreciate some pointers, thanks!
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.
== RELEASE NOTES ==
General Changes
arrays_overlap_count(x, y) -> bigint()
function to return the amount of overlapping elements between arrayx
andy
without duplicates.