-
Notifications
You must be signed in to change notification settings - Fork 173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: Make ANSI fallback more granular #509
Conversation
@kazuyukitanimura I have more work to do in this PR but wanted to make you aware of it since it is related to Spark 4 support. |
I am looking into the test failures:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Some minor comments.
* Expression evaluation modes. | ||
* - LEGACY: the default evaluation mode, which is compliant to Hive SQL. | ||
* - ANSI: a evaluation mode which is compliant to ANSI SQL standard. | ||
* - TRY: a evaluation mode for `try_*` functions. It is identical to ANSI evaluation mode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is conf.ansiEnabled
ignored for try_*
? Maybe the comment can clarify.
evalMode match { | ||
case CometEvalMode.LEGACY => ExprOuterClass.EvalMode.LEGACY | ||
case CometEvalMode.TRY => ExprOuterClass.EvalMode.TRY | ||
case CometEvalMode.ANSI => ExprOuterClass.EvalMode.ANSI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a 'catch all' case for when someone tries to change CometEvalMode and things don't work as planned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I have added this.
case c @ Cast(child, dt, timeZoneId, _) => | ||
handleCast(child, inputs, dt, timeZoneId, evalMode(c)) | ||
|
||
case expr: Add if evalMode(expr) == CometEvalMode.ANSI && !cometAnsiEnabled => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we move these to be next to the corresponding supported implementations so new developers can discover this case easily?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code has been refactored significantly since this comment. There is now a single match arm for ANSI fallback and there is an isUnsupportedAnsiExpr
method which determines if the expression is one where we do not yet have ANSI support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks neat!
withInfo(expr, ansiNotSupported) | ||
None | ||
|
||
case expr: Multiply if evalMode(expr) == CometEvalMode.ANSI && !cometAnsiEnabled => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can add a simple method containing all these checks together, instead of scattering them here.
if (isANSIEnabled(conf)) { | ||
if (COMET_ANSI_MODE_ENABLED.get()) { | ||
logWarning("Using Comet's experimental support for ANSI mode.") | ||
} else { | ||
logInfo("Comet extension disabled for ANSI mode") | ||
return plan | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Spark 3.x, I feel we still need this protection in case users enable ANSI mode as well as native execution.
If I understand correctly we have not set up CI yet with ANSI enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This current check will make all plans fall back to Spark if ANSI is enabled, so no native plans will run, unless COMET_ANSI_MODE_ENABLED
is enabled.
The changes in this PR mean that we still have the same check but only if the plan actually contains any expressions that would be affected by enabling ANSI support and we still fall back to Spark by default unless COMET_ANSI_MODE_ENABLED
is enabled.
The main risk with this PR is if we don't have the complete list of expressions that support ANSI mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly we have not set up CI yet with ANSI enabled.
Well, we have the Spark 4 tests where ANSI is enabled and that is catching issues for sure.
I noticed that we were explicitly disabling ANSI mode in CometTestBase and I have removed that so we use the default ANSI mode for whatever Spark version we are running against. This should mean that all of our unit tests will now run with ANSI enabled when running against Spark 4+. Let's see how many issues this finds 😰
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly we have not set up CI yet with ANSI enabled.
It would be good to add a CI pipeline for Spark 3.4 with ANSI enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am close to finish my PR to enable the spark tests with Spark 4.0 I.e. ANSI enabled, and I am disabling expressions that failed those tests with ANSI enabled.
I would propose to hold this part of change for now. Perhaps after we fix all Spark 4.0 issues, we can backport the learning to Spark 3.4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I have moved this to draft for now.
This is a pretty interesting test failure:
I am assuming that this was due to us not supporting the ANSI overflow checks in SUM or AVG. We now fall back for those if ANSI is enabled. |
@@ -69,7 +69,6 @@ abstract class CometTestBase | |||
conf.set("spark.hadoop.fs.file.impl", classOf[DebugFilesystem].getName) | |||
conf.set("spark.ui.enabled", "false") | |||
conf.set(SQLConf.SHUFFLE_PARTITIONS, 10) // reduce parallelism in tests | |||
conf.set(SQLConf.ANSI_ENABLED.key, "false") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We no longer explicitly disable ANSI in our test suites.
case c @ Cast(child, dt, timeZoneId, _) => | ||
handleCast(child, inputs, dt, timeZoneId, evalMode(c)) | ||
|
||
case expr if isUnsupportedAnsiExpr(expr) && !CometConf.COMET_ANSI_MODE_ENABLED.get() => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel !CometConf.COMET_ANSI_MODE_ENABLED.get()
condition needs to be removed.
When COMET_ANSI_MODE_ENABLED=true
, we still need to fallback to Spark if the expression is not ANSI compatible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, let's remove this. It was intended as as short term workaround but I agree that we don't really need it now that we have this fallback logic.
@@ -69,7 +69,7 @@ abstract class CometTestBase | |||
conf.set("spark.hadoop.fs.file.impl", classOf[DebugFilesystem].getName) | |||
conf.set("spark.ui.enabled", "false") | |||
conf.set(SQLConf.SHUFFLE_PARTITIONS, 10) // reduce parallelism in tests | |||
conf.set(SQLConf.ANSI_ENABLED.key, "false") | |||
conf.set(CometConf.COMET_ANSI_MODE_ENABLED.key, "true") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about leaving COMET_ANSI_MODE_ENABLED
as default value as well? I.e. false for 3.x, true for 4.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point
Which issue does this PR close?
Part of #313
Rationale for this change
Rather than fall back to Spark for all plans when ANSI mode is enabled, it would be better to only fall back to Spark if the plan involves any expressions that have ANSI-specific behavior that we do not yet support.
What changes are included in this PR?
CometEvalMode
enum to use instead of string literalsRound
andCast
Round
in ANSI modeHow are these changes tested?
TBD