Closed
@@ -93,7 +93,7 @@ abstract class Optimizer(sessionCatalog: SessionCatalog)
RewriteCorrelatedScalarSubquery,
EliminateSerialization,
RemoveRedundantAliases,
RemoveRedundantProject,
RemoveNoopOperators,
Contributor
nit: RemoveUselessOperators?

Member
I think Noop is fine too. It's no-op, right :-)?

Contributor
Yes, it is fine too. I preferred Useless because these operators actually do something at runtime, so they introduce useless overhead. Anyway, not a big deal.

SimplifyExtractValueOps,
CombineConcats) ++
extendedOperatorOptimizationRules
@@ -177,7 +177,7 @@ abstract class Optimizer(sessionCatalog: SessionCatalog)
RewritePredicateSubquery,
ColumnPruning,
CollapseProject,
RemoveRedundantProject) :+
RemoveNoopOperators) :+
Batch("UpdateAttributeReferences", Once,
UpdateNullabilityInAttributeReferences)
}
@@ -403,11 +403,15 @@ object RemoveRedundantAliases extends Rule[LogicalPlan] {
}

/**
* Remove projections from the query plan that do not make any modifications.
* Remove no-op operators from the query plan that do not make any modifications.
*/
object RemoveRedundantProject extends Rule[LogicalPlan] {
object RemoveNoopOperators extends Rule[LogicalPlan] {
def apply(plan: LogicalPlan): LogicalPlan = plan transform {
case p @ Project(_, child) if p.output == child.output => child
// Eliminate no-op Projects
case p @ Project(_, child) if child.sameOutput(p) => child

// Eliminate no-op Window
case w: Window if w.windowExpressions.isEmpty => w.child
}
}
Member
@cloud-fan, is this too small to move out into a separate file during this refactoring?

Contributor Author
It's small and it has been here for a while. We can move many of the rules here to separate files in another PR.
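
For readers without the Catalyst sources at hand, the behavior of the renamed rule can be illustrated with a minimal sketch. It does not use Spark: `Plan`, `Relation`, `Project`, `Window`, and `RemoveNoopOperatorsSketch` are hypothetical toy stand-ins, plain string equality stands in for the `sameOutput` check, and the traversal is a simple bottom-up recursion rather than Catalyst's `transform`.

```scala
// Toy stand-ins for Catalyst plan nodes. These are NOT Spark's classes.
sealed trait Plan { def output: Seq[String] }
case class Relation(output: Seq[String]) extends Plan
case class Project(projectList: Seq[String], child: Plan) extends Plan {
  def output: Seq[String] = projectList
}
case class Window(windowExpressions: Seq[String], child: Plan) extends Plan {
  def output: Seq[String] = child.output ++ windowExpressions
}

object RemoveNoopOperatorsSketch {
  // Rewrite the child first, then drop the current node if it changes nothing.
  def apply(plan: Plan): Plan = plan match {
    case Project(list, child) =>
      val newChild = apply(child)
      // A Project whose output equals its child's output is a no-op.
      if (newChild.output == list) newChild else Project(list, newChild)
    case Window(exprs, child) =>
      val newChild = apply(child)
      // A Window with no window expressions computes nothing new.
      if (exprs.isEmpty) newChild else Window(exprs, newChild)
    case leaf => leaf
  }
}
```

In this sketch, a `Project` that selects every column of a `Window` with no expressions collapses all the way down to the underlying `Relation`, while a `Project` that narrows the columns is kept.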


@@ -602,17 +606,12 @@ object ColumnPruning extends Rule[LogicalPlan] {
p.copy(child = w.copy(
windowExpressions = w.windowExpressions.filter(p.references.contains)))

// Eliminate no-op Window
case w: Window if w.windowExpressions.isEmpty => w.child

// Eliminate no-op Projects
case p @ Project(_, child) if child.sameOutput(p) => child

// Can't prune the columns on LeafNode
case p @ Project(_, _: LeafNode) => p

// for all other logical plans that inherit the output from their children
case p @ Project(_, child) =>
// Project over project is handled by the first case, skip it here.
case p @ Project(_, child) if !child.isInstanceOf[Project] =>
Contributor
In case the child is a Project, shall we update it anyway with c.output.filter(allReferences.contains)? I mean, could we instead update the prunedChild method to check whether c is a Project and behave accordingly?

Contributor Author
We already handled project over project at L542

Contributor
yes, I see, makes sense, thanks.

Member
This requires a code comment. I believe others will ask the same question when they read this code again.
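
The point of this thread, that an earlier case already covers Project over Project so the generic case can skip it, can be sketched on a toy plan. Everything below is hypothetical and not Spark code: `Plan`, `Relation`, `Project`, `Filter`, and `pruneOnce` are illustrative names, the body of the first case is an assumption about what the earlier case does (it is not shown in this diff), and the rule only inspects the root node.

```scala
// Toy stand-ins for Catalyst plan nodes. These are NOT Spark's classes.
sealed trait Plan { def output: Seq[String] }
case class Relation(output: Seq[String]) extends Plan
case class Project(projectList: Seq[String], child: Plan) extends Plan {
  def output: Seq[String] = projectList
}
// `condition` is simply the name of the one column the filter reads.
case class Filter(condition: String, child: Plan) extends Plan {
  def output: Seq[String] = child.output
}

object ColumnPruningGuardSketch {
  def pruneOnce(plan: Plan): Plan = plan match {
    // First case (assumed shape): Project over Project.
    // Narrow the inner Project to the columns the outer one uses.
    case Project(outer, Project(inner, grandChild))
        if !inner.forall(c => outer.contains(c)) =>
      Project(outer, Project(inner.filter(c => outer.contains(c)), grandChild))

    // Generic case: the guard skips Project children, because the first
    // case is the only place that handles Project over Project.
    case p @ Project(list, child) if !child.isInstanceOf[Project] =>
      child match {
        case Filter(cond, grandChild) =>
          val required = (list :+ cond).distinct
          if (grandChild.output.forall(c => required.contains(c))) p
          else Project(list, Filter(cond,
            Project(grandChild.output.filter(c => required.contains(c)), grandChild)))
        case _ => p
      }

    case other => other
  }
}
```

With the guard in place, a Project over an already minimal Project falls through to the last case and is left untouched, instead of being pruned a second time by the generic case.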

val required = child.references ++ p.references
if (!child.inputSet.subsetOf(required)) {
val newChildren = child.children.map(c => prunedChild(c, required))
@@ -34,6 +34,7 @@ class ColumnPruningSuite extends PlanTest {
val batches = Batch("Column pruning", FixedPoint(100),
PushDownPredicate,
ColumnPruning,
RemoveNoopOperators,
Member
Is this added to remove the top Project('b :: Nil ...)? Without it, could this test stay unchanged?

Contributor Author
Without this, a lot more tests would need to be updated...

Member
ah, I see. :)

CollapseProject) :: Nil
}

@@ -340,10 +341,8 @@ class ColumnPruningSuite extends PlanTest {
test("Column pruning on Union") {
val input1 = LocalRelation('a.int, 'b.string, 'c.double)
val input2 = LocalRelation('c.int, 'd.string, 'e.double)
val query = Project('b :: Nil,
Union(input1 :: input2 :: Nil)).analyze
val expected = Project('b :: Nil,
Union(Project('b :: Nil, input1) :: Project('d :: Nil, input2) :: Nil)).analyze
val query = Project('b :: Nil, Union(input1 :: input2 :: Nil)).analyze
val expected = Union(Project('b :: Nil, input1) :: Project('d :: Nil, input2) :: Nil).analyze
comparePlans(Optimize.execute(query), expected)
}
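
Why the expected plan loses its top Project once `RemoveNoopOperators` joins the batch can be sketched without Spark. The types and rules below are hypothetical toy stand-ins: both rules only inspect the root node, the union is positional and takes its column names from its first child, and `fixedPoint` is a simplified loop, not Catalyst's `RuleExecutor`.

```scala
// Toy stand-ins for Catalyst plan nodes. These are NOT Spark's classes.
sealed trait Plan { def output: Seq[String] }
case class Relation(output: Seq[String]) extends Plan
case class Project(projectList: Seq[String], child: Plan) extends Plan {
  def output: Seq[String] = projectList
}
case class Union(children: Seq[Plan]) extends Plan {
  // A union takes its column names from its first child.
  def output: Seq[String] = children.head.output
}

object UnionPruningSketch {
  type Rule = Plan => Plan

  // Toy column pruning: push a positional projection into each Union child.
  // Assumes every projected name exists in the union's output.
  val pruneUnion: Rule = {
    case Project(list, u @ Union(children))
        if children.exists(c => c.output.size > list.size) =>
      val indices = list.map(name => u.output.indexOf(name))
      Project(list, Union(children.map(c => Project(indices.map(i => c.output(i)), c))))
    case other => other
  }

  // Toy no-op removal: drop a Project that keeps its child's output as-is.
  val removeNoop: Rule = {
    case Project(list, child) if child.output == list => child
    case other => other
  }

  // Apply the rules in order until the plan stops changing.
  def fixedPoint(rules: Seq[Rule], maxIterations: Int)(plan: Plan): Plan = {
    var current = plan
    var iteration = 0
    var changed = true
    while (changed && iteration < maxIterations) {
      val next = rules.foldLeft(current)((p, rule) => rule(p))
      changed = next != current
      current = next
      iteration += 1
    }
    current
  }
}
```

Running only `pruneUnion` leaves a top-level Project over the pruned Union, which matches the old expectation in this test. Adding `removeNoop` to the same batch removes that Project, because after pruning it has exactly the same output as the Union beneath it.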

@@ -27,8 +27,9 @@ class CombiningLimitsSuite extends PlanTest {

object Optimize extends RuleExecutor[LogicalPlan] {
val batches =
Batch("Filter Pushdown", FixedPoint(100),
ColumnPruning) ::
Batch("Column Pruning", FixedPoint(100),
ColumnPruning,
RemoveNoopOperators) ::
Member
nit: It does not look precise to have RemoveNoopOperators in the Column Pruning batch, but it is fine as this is just a test.

Batch("Combine Limit", FixedPoint(10),
CombineLimits) ::
Batch("Constant Folding", FixedPoint(10),
@@ -39,6 +39,7 @@ class JoinOptimizationSuite extends PlanTest {
ReorderJoin,
PushPredicateThroughJoin,
ColumnPruning,
RemoveNoopOperators,
CollapseProject) :: Nil

}
@@ -33,7 +33,7 @@ class RemoveRedundantAliasAndProjectSuite extends PlanTest with PredicateHelper
FixedPoint(50),
PushProjectionThroughUnion,
RemoveRedundantAliases,
RemoveRedundantProject) :: Nil
RemoveNoopOperators) :: Nil
}

test("all expressions in project list are aliased child output") {
@@ -34,7 +34,7 @@ class RewriteSubquerySuite extends PlanTest {
RewritePredicateSubquery,
ColumnPruning,
CollapseProject,
RemoveRedundantProject) :: Nil
RemoveNoopOperators) :: Nil
}

test("Column pruning after rewriting predicate subquery") {
@@ -27,7 +27,7 @@ import org.apache.spark.sql.catalyst.rules.RuleExecutor
class TransposeWindowSuite extends PlanTest {
object Optimize extends RuleExecutor[LogicalPlan] {
val batches =
Batch("CollapseProject", FixedPoint(100), CollapseProject, RemoveRedundantProject) ::
Batch("CollapseProject", FixedPoint(100), CollapseProject, RemoveNoopOperators) ::
Batch("FlipWindow", Once, CollapseWindow, TransposeWindow) :: Nil
}
