-
Notifications
You must be signed in to change notification settings - Fork 235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a config to switch back to old impl for getJsonObject #10654
Add a config to switch back to old impl for getJsonObject #10654
Conversation
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuGetJsonObject.scala
Outdated
Show resolved
Hide resolved
…ject.scala Co-authored-by: Gera Shegalov <gshegalov@nvidia.com>
Is this something we can do in 24.04? |
Retargeted to 24.04. |
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comments are mostly nits. I am okay with this going in as is, but it would be nice to fix them.
@@ -902,6 +902,15 @@ val GPU_COREDUMP_PIPE_PATTERN = conf("spark.rapids.gpu.coreDump.pipePattern") | |||
.booleanConf | |||
.createWithDefault(true) | |||
|
|||
val ENABLE_GETJSONOBJECT_LEGACY = conf("spark.rapids.sql.getjsonobject.legacy.enabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Generally we use camelCase in the names of operations like this.
spark.rapids.sql.getJsonObject.legacy.enabled
But this is very minor, especially with it being an internal config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
'spark.rapids.sql.getjsonobject.legacy.enabled': 'true'}) | ||
|
||
# Verify that the legacy mode is used and xfail the test which currently passes | ||
@pytest.mark.xfail(reason="https://github.com/NVIDIA/spark-rapids/issues/10218") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we marking a test as xfail using an issue that has been closed? Can we please have an issue that is filed to remove the legacy support instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I filed #10680 as follow-up and remove this xfail.
conf={'spark.sql.parser.escapedStringLiterals': 'true', | ||
'spark.rapids.sql.getjsonobject.legacy.enabled': 'true'}) | ||
|
||
# Verify that the legacy mode is used and xfail the test which currently passes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how this test does verify that legacy mode is used. I see that it sets it, but there is no verification especially with an xfail, where if we did verify it, then it would not matter because the test is expected to fail very generically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it failed does not mean it was using legacy mode. Updated this test to check if getJsonObject does the normalization to see if it returns the strings as-is.
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
@@ -99,7 +103,8 @@ def test_get_json_object_spark_unit_tests(query): | |||
['{"big": "' + ('x' * 3000) + '"}']] | |||
assert_gpu_and_cpu_are_equal_collect( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we not verifying that this falls back to the CPU as expected instead? If we have to divide the test up so we know exactly which once fall back and which ones do not I am fine with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch thanks, none of them are fallback now, I forgot to delete it.
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
build |
This PR adds a config to switch back to the old impl for getJsonObject.
#10581 used the new jni kernel for getJsonObject. It works correctly, but has worse performance than the old impl from cuDF, so in some cases users may want to switch back to the old impl to avoid performance loss.