Skip to content

Conversation

@maropu
Copy link
Member

@maropu maropu commented Nov 20, 2019

What changes were proposed in this pull request?

This is to refactor Predicate code; it mainly removed newPredicate from SparkPlan.
Modifications are listed below;

  • Move Predicate from o.a.s.sqlcatalyst.expressions.codegen.GeneratePredicate.scala to o.a.s.sqlcatalyst.expressions.predicates.scala
  • To resolve the name conflict, rename o.a.s.sqlcatalyst.expressions.codegen.Predicate to o.a.s.sqlcatalyst.expressions.BasePredicate
  • Extend CodeGeneratorWithInterpretedFallback for BasePredicate

This comes from the @cloud-fan suggestion: #26420 (comment)

Why are the changes needed?

For better code/test coverage.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing tests.

@maropu
Copy link
Member Author

maropu commented Nov 20, 2019

@cloud-fan @viirya How about this refactoring?

@SparkQA
Copy link

SparkQA commented Nov 20, 2019

Test build #114125 has finished for PR 26604 at commit fcc9d22.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class SpecificPredicate extends $
  • abstract class BasePredicate

@maropu
Copy link
Member Author

maropu commented Nov 20, 2019

IIUC newMutableProjection, newOrdering and newNaturalAscendingOrdering can be removed from SparkPlan, too.

@dongjoon-hyun
Copy link
Member

Hi, @maropu . So, are you going to make a few PR for them one by one?
cc @cloud-fan .


/**
* Returns an MutableProjection for given sequence of Expressions, which will be bound to
* Returns a MutableProjection for given sequence of Expressions, which will be bound to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not touch this file in this PR~

Copy link
Member Author

@maropu maropu Nov 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You meant a separate PR for this typo?

/**
* Interface for generated/interpreted predicate
*/
abstract class BasePredicate {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems reasonable because we renamed Predicate => BasePredicate before.



/**
* Interface for generated/interpreted predicate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not wrong and this comes from the old comment. Shall we use A base class instead of Interface?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

str
}
logWarning(s"Codegen disabled for this expression:\n $logMessage")
InterpretedPredicate.create(expression, inputSchema)
Copy link
Member

@dongjoon-hyun dongjoon-hyun Nov 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't follow the context in the previous PR. Is this genInterpretedPredicate function unused?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, CodeGeneratorWithInterpretedFallback#createInterpretedObject does the same thing with genInterpretedPredicate :

@maropu
Copy link
Member Author

maropu commented Nov 20, 2019

Thanks for the comment, @dongjoon-hyun. I just wanted to check if this refactoring is right first, and all the tests passed for this fix because this change might increase the test coverage for interpreted predicates (e.g., in SQLQueryTestSuite). So, I'm ok that all the refactoring is done in this pr. Should I do so?


val boundPredicate =
InterpretedPredicate.create(predicates.reduce(And).transform {
Predicate.create(predicates.reduce(And).transform {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to use the interpreted version here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes it's better to use the interpreted version if the input data is known to be small. Codegen can be slower as compiling takes time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok. I'll revert this.

This reverts commit 73588fc.
case Filter(condition, LocalRelation(output, data, isStreaming))
if !hasUnevaluableExpr(condition) =>
val predicate = InterpretedPredicate.create(condition, output)
val predicate = Predicate.create(condition, output)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to optimize local relation so perf doesn't matter too much. The change should be fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

val predicate = partitionPruningPredicates.reduce(expressions.And)

val boundPredicate = InterpretedPredicate.create(predicate.transform {
val boundPredicate = Predicate.create(predicate.transform {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be safe, I think we should keep using the interpreted version, in case there are only a few partitions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

def create(expr: Expression): BasePredicate = {
create(expr)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add a method def createInterpreted... for places that want to use interpreted predicates.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea. ok.


val boundPredicate =
InterpretedPredicate.create(predicates.reduce(And).transform {
Predicate.createInterpretedPredicate(predicates.reduce(And).transform {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: double predicate looks weird, how about just Predicate.createInterpreted?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

expressions.GreaterThan(attribute, literal)
}.reduceOption(expressions.And).getOrElse(Literal(true))
InterpretedPredicate.create(filterCondition, inputAttributes)
Predicate.createInterpretedPredicate(filterCondition, inputAttributes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a testing source, we don't care it's codegen or not. Should be fine to call Predicate.create

@SparkQA
Copy link

SparkQA commented Nov 20, 2019

Test build #114127 has finished for PR 26604 at commit 4b1e311.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class SpecificPredicate extends $
  • abstract class BasePredicate

InterpretedPredicate(in)
}

def createInterpreted(e: Expression, inputSchema: Seq[Attribute]): InterpretedPredicate =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method seems not used.

@SparkQA
Copy link

SparkQA commented Nov 20, 2019

Test build #114140 has finished for PR 26604 at commit 74183e6.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 20, 2019

Test build #114135 has finished for PR 26604 at commit 0896afe.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 20, 2019

Test build #114141 has finished for PR 26604 at commit 164507c.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 20, 2019

Test build #114129 has finished for PR 26604 at commit 73588fc.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 20, 2019

Test build #114132 has finished for PR 26604 at commit bed96f4.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member Author

maropu commented Nov 20, 2019

retest this please

@SparkQA
Copy link

SparkQA commented Nov 20, 2019

Test build #114142 has finished for PR 26604 at commit 164507c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 0032d85 Nov 20, 2019
@cloud-fan
Copy link
Contributor

@maropu can you create JIRA tickets to do the same thing for newMutableProjection, newOrdering and newNaturalAscendingOrdering?

@maropu
Copy link
Member Author

maropu commented Nov 20, 2019

Yea, sure. Better to fix them in a single next PR?

@cloud-fan
Copy link
Contributor

sounds good

@maropu
Copy link
Member Author

maropu commented Nov 20, 2019

ok

@beliefer
Copy link
Contributor

@maropu Thanks for your change . This PR makes my work easier.

changpingc added a commit to changpingc/GeoSpark that referenced this pull request Jul 9, 2020
newPredicate has been removed in Spark 3.0
apache/spark#26604
changpingc added a commit to changpingc/GeoSpark that referenced this pull request Aug 12, 2020
newPredicate has been removed in Spark 3.0
apache/spark#26604
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants