[MINOR][SQL] The analyzer rules are fired twice for cases when AnalysisException is raised from analyzer. #17214
…eption is raised from analyzer rules
cc @gatorsmile @cloud-fan Please let me know your thoughts.
    protected def planner = sparkSession.sessionState.planner

    def assertAnalyzed(): Unit = {
      try sparkSession.sessionState.analyzer.checkAnalysis(analyzed) catch {
isn't analyzed a lazy val?
@cloud-fan Yes, Wenchen. The first time we invoke the analyzer is on this line. If an exception is thrown, we go to the catch block:

    case e: AnalysisException =>
      val ae = new AnalysisException(e.message, e.line, e.startPosition, Some(analyzed))

and then call the analyzer a second time while evaluating the last argument, Some(analyzed)?
Isn't a lazy val evaluated only once?
@cloud-fan That's what I thought too, Wenchen. But it seems that if an exception is thrown before the assignment to the lazy val completes, the lazy val is treated as if the first evaluation never happened?
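For reference, a minimal standalone sketch of this behavior. It is not Spark code: the object name, the counter, and the exception type are made up for illustration. In Scala, a lazy val whose initializer throws stays uninitialized, so the next access runs the initializer again:

```scala
object LazyValRetryDemo {
  // Hypothetical counter standing in for "the analyzer rules were fired".
  var evaluations = 0

  // The initializer throws on every attempt, like an analyzer rule
  // raising AnalysisException while the plan is being analyzed.
  lazy val analyzed: String = {
    evaluations += 1
    throw new IllegalStateException("analysis failed")
  }

  def main(args: Array[String]): Unit = {
    // First access: the initializer runs and throws, so nothing is cached.
    try analyzed catch { case _: IllegalStateException => () }
    // Second access: the lazy val is still uninitialized, so the initializer runs again.
    try analyzed catch { case _: IllegalStateException => () }
    println(s"initializer ran $evaluations times") // prints "initializer ran 2 times"
  }
}
```

This mirrors the pattern in assertAnalyzed: one access inside the try and one inside the catch handler give two runs of the initializer when it fails.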
So we just need a line before this one to materialize analyzed.
@cloud-fan Did you mean like this?

    def assertAnalyzed(): Unit = {
      try {
        analyzed
        sparkSession.sessionState.analyzer.checkAnalysis(analyzed)
      } catch {
        case e: AnalysisException =>
          val ae = new AnalysisException(e.message, e.line, e.startPosition, Some(analyzed))
          ae.setStackTrace(e.getStackTrace)
          throw ae
      }
    }

If so, it still causes two invocations, as before.
Put analyzed outside the try/catch.
Test build #74219 has finished for PR 17214 at commit
    def assertAnalyzed(): Unit = {
      try sparkSession.sessionState.analyzer.checkAnalysis(analyzed) catch {
      var analyzedPlan: Option[LogicalPlan] = None
You could make this a local lazy val. That should be a bit more concise. For example:

    def assertAnalyzed(): Unit = {
      lazy val analyzedPlan = analyzed
      try {
        sparkSession.sessionState.analyzer.checkAnalysis(analyzedPlan)
      } catch {
        case e: AnalysisException =>
          val ae = new AnalysisException(e.message, e.line, e.startPosition, Option(analyzedPlan))
          ae.setStackTrace(e.getStackTrace)
          throw ae
      }
    }
Thank you @cloud-fan @hvanhovell. I will make the change.
@hvanhovell I tried the suggested code snippet. Because analyzedPlan is declared lazy, its first evaluation still happens inside the try block, so it has the same problem. Shall we just move analyzed outside the try block, as Wenchen suggests, or change lazy val analyzedPlan = analyzed to val analyzedPlan = analyzed if you think that reads better?

    def assertAnalyzed(): Unit = {
      analyzed
      try {
        sparkSession.sessionState.analyzer.checkAnalysis(analyzed)
      } catch {
        case e: AnalysisException =>
          val ae = new AnalysisException(e.message, e.line, e.startPosition, Option(analyzed))
          ae.setStackTrace(e.getStackTrace)
          throw ae
      }
    }
Ahh, we want it materialized outside the block. Yeah, in that case wenchen's suggestion is the way to go
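To make the difference concrete, here is a small self-contained model of the two shapes discussed in this thread. It is only a sketch under stated assumptions: PlanError, Execution, analyzerRuns and both method names are hypothetical stand-ins, not Spark's AnalysisException or QueryExecution.

```scala
object MaterializeOutsideTryDemo {
  // Hypothetical stand-in for AnalysisException.
  final class PlanError(msg: String) extends Exception(msg)

  // Hypothetical stand-in for QueryExecution; analyzerRuns counts initializer runs.
  final class Execution {
    var analyzerRuns = 0

    // Models an analyzer rule that throws while the plan is being analyzed.
    lazy val analyzed: String = {
      analyzerRuns += 1
      throw new PlanError("rule failed")
    }

    // Placeholder check; in this model only the lazy val initializer fails.
    private def checkAnalysis(plan: String): Unit = ()

    // Original shape: the first access to `analyzed` happens inside the try,
    // so the handler's own access re-runs the failed initializer.
    def assertAnalyzedInsideTry(): Unit =
      try checkAnalysis(analyzed) catch {
        case e: PlanError => throw new PlanError(s"${e.getMessage}; plan: $analyzed")
      }

    // Fixed shape: force `analyzed` before the try, so a failure in the
    // initializer propagates directly and never reaches the handler.
    def assertAnalyzedOutsideTry(): Unit = {
      analyzed
      try checkAnalysis(analyzed) catch {
        case e: PlanError => throw new PlanError(s"${e.getMessage}; plan: $analyzed")
      }
    }
  }

  // Runs `body` and swallows the PlanError that this model always raises.
  private def ignoreFailure(body: => Unit): Unit =
    try body catch { case _: PlanError => () }

  def main(args: Array[String]): Unit = {
    val before = new Execution
    ignoreFailure(before.assertAnalyzedInsideTry())
    val after = new Execution
    ignoreFailure(after.assertAnalyzedOutsideTry())
    // prints "inside try: 2, outside try: 1"
    println(s"inside try: ${before.analyzerRuns}, outside try: ${after.analyzerRuns}")
  }
}
```

When the initializer succeeds, both shapes evaluate it exactly once, so forcing it before the try only changes behavior on the failure path.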
@hvanhovell Thank you.
LGTM
Test build #74228 has finished for PR 17214 at commit
Thanks! Merging to master/2.1
…isException is raised from analyzer.

## What changes were proposed in this pull request?

In general we have a checkAnalysis phase which validates the logical plan and throws AnalysisException on semantic errors. However we also can throw AnalysisException from a few analyzer rules like ResolveSubquery.

I found that we fire up the analyzer rules twice for the queries that throw AnalysisException from one of the analyzer rules. This is a very minor fix. We don't have to strictly fix it. I just got confused seeing the rule getting fired two times when i was not expecting it.

## How was this patch tested?

Tested manually.

Author: Dilip Biswal <dbiswal@us.ibm.com>

Closes #17214 from dilipbiswal/analyis_twice.

(cherry picked from commit d809cee)
Signed-off-by: Xiao Li <gatorsmile@gmail.com>
Thank you @gatorsmile @cloud-fan @hvanhovell
What changes were proposed in this pull request?
In general, we have a checkAnalysis phase that validates the logical plan and throws AnalysisException on semantic errors. However, we can also throw AnalysisException from a few analyzer rules, such as ResolveSubquery.
I found that the analyzer rules are fired twice for queries that throw AnalysisException from one of the analyzer rules. This is a very minor fix, and we don't strictly have to make it. I was just confused to see a rule fire twice when I was not expecting it.
How was this patch tested?
Tested manually.