Skip to content

Conversation

@dilipbiswal
Copy link
Contributor

@dilipbiswal dilipbiswal commented Mar 8, 2017

What changes were proposed in this pull request?

In general we have a checkAnalysis phase which validates the logical plan and throws AnalysisException on semantic errors. However we also can throw AnalysisException from a few analyzer rules like ResolveSubquery.

I found that we fire up the analyzer rules twice for the queries that throw AnalysisException from one of the analyzer rules. This is a very minor fix. We don't have to strictly fix it. I just got confused seeing the rule getting fired two times when i was not expecting it.

How was this patch tested?

Tested manually.

@dilipbiswal
Copy link
Contributor Author

cc @gatorsmile @cloud-fan Please let me know your thoughts.

@dilipbiswal dilipbiswal changed the title [SQL][MINOR] The analyzer rules are fired twice for cases when AnalysisException is raised from analyzer. [MINOR][SQL] The analyzer rules are fired twice for cases when AnalysisException is raised from analyzer. Mar 8, 2017
protected def planner = sparkSession.sessionState.planner

def assertAnalyzed(): Unit = {
try sparkSession.sessionState.analyzer.checkAnalysis(analyzed) catch {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't analyzed a lazy val?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan Yeah wenchen. so first time we invoke analyzer is on this line. And upon exception we go to catch block -

     case e: AnalysisException =>
       val ae = new AnalysisException(e.message, e.line, e.startPosition, Some(analyzed))

and call analyzer the second time while trying to evaluate the last parameter - Some(analyzed) ?

Copy link
Contributor

@cloud-fan cloud-fan Mar 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't lazy val only be evaluated once?

Copy link
Contributor Author

@dilipbiswal dilipbiswal Mar 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan Actually thats what i also thought Wenchen. But it seems like if an exception occurred before the assignment to lazy val happens , then it treats it like the first evaluation never happened ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we just need a line before this line, to materialize analyzed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan did you mean like this ?

def assertAnalyzed(): Unit = {
    try {
      analyzed
      sparkSession.sessionState.analyzer.checkAnalysis(analyzed)
    } catch {
      case e: AnalysisException =>
        val ae = new AnalysisException(e.message, e.line, e.startPosition, Some(analyzed))
        ae.setStackTrace(e.getStackTrace)
        throw ae
    }
  }

If so, it also causes two invocation like before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put analyzed out of the try catch

@SparkQA
Copy link

SparkQA commented Mar 8, 2017

Test build #74219 has finished for PR 17214 at commit 86582ef.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


def assertAnalyzed(): Unit = {
try sparkSession.sessionState.analyzer.checkAnalysis(analyzed) catch {
var analyzedPlan: Option[LogicalPlan] = None
Copy link
Contributor

@hvanhovell hvanhovell Mar 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could make this a local lazy val. That should be a bit more concise. For example:

def assertAnalyzed(): Unit = {
  lazy val analyzedPlan = analyzed
  try {
    sparkSession.sessionState.analyzer.checkAnalysis(analyzedPlan)
  } catch {
    case e: AnalysisException =>
      val ae = new AnalysisException(e.message, e.line, e.startPosition, Option(analyzedPlan))
      ae.setStackTrace(e.getStackTrace)
      throw ae
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks you @cloud-fan @hvanhovell. I will make the change.

Copy link
Contributor Author

@dilipbiswal dilipbiswal Mar 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hvanhovell I tried the suggested code snippet. Actually since AnalyzedPlan is declared lazy the first evaluation happens inside the try block and hence it has the same problem. So lets just move analyzed outside the try block as wenchen suggests or change lazy val analyzedPlan = analyzed to val analyzedPlan = analyzed if you think it reads better ?

def assertAnalyzed(): Unit = {
    analyzed
    try {
      sparkSession.sessionState.analyzer.checkAnalysis(analyzed)
    } catch {
      case e: AnalysisException =>
        val ae = new AnalysisException(e.message, e.line, e.startPosition, Option(analyzed))
        ae.setStackTrace(e.getStackTrace)
        throw ae
    }
  }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, we want it materialized outside the block. Yeah, in that case wenchen's suggestion is the way to go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hvanhovell Thank you.

@gatorsmile
Copy link
Member

LGTM

@SparkQA
Copy link

SparkQA commented Mar 9, 2017

Test build #74228 has finished for PR 17214 at commit 0dee6dd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

Thanks! Merging to master/2.1

asfgit pushed a commit that referenced this pull request Mar 9, 2017
…isException is raised from analyzer.

## What changes were proposed in this pull request?
In general we have a checkAnalysis phase which validates the logical plan and throws AnalysisException on semantic errors. However we also can throw AnalysisException from a few analyzer rules like ResolveSubquery.

I found that we fire up the analyzer rules twice for the queries that throw AnalysisException from one of the analyzer rules. This is a very minor fix. We don't have to strictly fix it. I just got confused seeing the rule getting fired two times when i was not expecting it.

## How was this patch tested?

Tested manually.

Author: Dilip Biswal <dbiswal@us.ibm.com>

Closes #17214 from dilipbiswal/analyis_twice.

(cherry picked from commit d809cee)
Signed-off-by: Xiao Li <gatorsmile@gmail.com>
@asfgit asfgit closed this in d809cee Mar 9, 2017
@dilipbiswal
Copy link
Contributor Author

Thank you @gatorsmile @cloud-fan @hvanhovell

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants