Accommodate altered semantics of cudf::lists::contains()
#4361
cf03bf9 to aa97aa9
build
This PR is ready for review, although it's still in draft. It can't be merged until the dependencies in CUDF have been merged.
Signed-off-by: MithunR <mythrocks@gmail.com>
aa97aa9 to 2b18f80
(Modified for changed CUDF function name.)
…-contains-semantics
Re-tested after changes to dependencies in |
Just a few minor nits
sql-plugin/src/main/scala/org/apache/spark/sql/rapids/complexTypeExtractors.scala
rapidsai/cudf/pull/9510 was just committed. I've pulled this PR out of draft.
build
Even though rapidsai/cudf/pull/9510 has not gone all the way through CI yet, I think we can still test/commit this, because the extra processing should be a no-op if listContains is still behaving the old way.
Oops, I was wrong. It needs a new API that was missing before.
build
Merging this to head off impending CI breakage.
Thank you for the review, @revans2. @tgravescs, thank you for confirming that the CI has passed with the new |
Depends on rapidsai/cudf/pull/9510.
Depends on rapidsai/cudf/pull/9901.
spark-rapids relies on libcudf's cudf::lists::contains() for its implementation of array_contains().

rapidsai/cudf/pull/9510 changes the semantics of lists::contains() with regard to rows containing nulls. Specifically, if a list row contains at least one null element, and is found not to contain the search key, libcudf will now return false instead of null.

SparkSQL expects to return null in those cases.

This commit accommodates the change in libcudf's semantics, to keep its own existing behaviour.
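
The adjustment described above can be illustrated with a small, self-contained model. This is only a sketch in plain Scala, not the code in this PR: it uses no cudf or Spark APIs, and every name in it (`ArrayContainsSemantics`, `newCudfContains`, `hasNullElement`, `toSparkResult`, `arrayContains`) is hypothetical. It assumes integer elements and a non-null search key, with `Option` standing in for nullable values.

```scala
// Hypothetical plain-Scala model of the null handling; no cudf or Spark APIs are used.
object ArrayContainsSemantics {
  // None models a null list row; a None element models a null element in the row.
  type Row = Option[Seq[Option[Int]]]

  // Models the new libcudf result: a key that is absent yields false,
  // even when the row contains null elements. A null row still yields null.
  def newCudfContains(row: Row, key: Int): Option[Boolean] =
    row.map(_.contains(Some(key)))

  // True when a non-null row holds at least one null element.
  def hasNullElement(row: Row): Boolean =
    row.exists(_.contains(None))

  // Restores the result SparkSQL expects: "not found" in a row with a null
  // element becomes null. Every other result passes through unchanged.
  def toSparkResult(cudfResult: Option[Boolean], rowHasNull: Boolean): Option[Boolean] =
    cudfResult match {
      case Some(false) if rowHasNull => None
      case other                     => other
    }

  // array_contains() as SparkSQL expects it, built on the new libcudf semantics.
  def arrayContains(row: Row, key: Int): Option[Boolean] =
    toSparkResult(newCudfContains(row, key), hasNullElement(row))
}
```

In this model, `toSparkResult` leaves a null input untouched, so the extra step changes nothing if the underlying result already follows the old semantics. That matches the no-op reasoning given in the conversation.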