Integrate Analyzer within LogicalPlan building stage #14618

@jayzhan211

Is your feature request related to a problem or challenge?

The steps in the logical layer are SQL → LogicalPlan → Analyzer → Optimizer.

These five rules are currently in the Analyzer:

            Arc::new(InlineTableScan::new()),
            // Every rule that will generate [Expr::Wildcard] should be placed in front of [ExpandWildcardRule].
            Arc::new(ExpandWildcardRule::new()),
            // [Expr::Wildcard] should be expanded before [TypeCoercion]
            Arc::new(ResolveGroupingFunction::new()),
            Arc::new(TypeCoercion::new()),
            Arc::new(CountWildcardRule::new()),
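
The ordering constraint noted in the comments above can be illustrated with a toy model. This is only a sketch: `Expr`, `Plan`, `Rule`, and `run_rules` are hypothetical stand-ins written for this example, not DataFusion's actual types, and the `TypeCoercion` here merely rejects an unexpanded wildcard rather than doing real coercion.

```rust
// Toy model only: hypothetical stand-ins, not DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Wildcard,
    Column(String),
}

#[derive(Debug, Clone, PartialEq)]
struct Plan {
    input_columns: Vec<String>,
    projection: Vec<Expr>,
}

trait Rule {
    fn name(&self) -> &str;
    fn analyze(&self, plan: Plan) -> Result<Plan, String>;
}

/// Replaces every `Expr::Wildcard` with one column per input column.
struct ExpandWildcard;

impl Rule for ExpandWildcard {
    fn name(&self) -> &str {
        "expand_wildcard"
    }

    fn analyze(&self, plan: Plan) -> Result<Plan, String> {
        let Plan { input_columns, projection } = plan;
        let mut expanded = Vec::new();
        for expr in projection {
            match expr {
                Expr::Wildcard => {
                    expanded.extend(input_columns.iter().cloned().map(Expr::Column))
                }
                other => expanded.push(other),
            }
        }
        Ok(Plan { input_columns, projection: expanded })
    }
}

/// Stand-in for type coercion: it fails on a wildcard, which has no single type.
struct TypeCoercion;

impl Rule for TypeCoercion {
    fn name(&self) -> &str {
        "type_coercion"
    }

    fn analyze(&self, plan: Plan) -> Result<Plan, String> {
        if plan.projection.contains(&Expr::Wildcard) {
            return Err("cannot coerce an unexpanded wildcard".to_string());
        }
        Ok(plan)
    }
}

/// Applies the rules in order, stopping at the first failure.
fn run_rules(rules: &[Box<dyn Rule>], mut plan: Plan) -> Result<Plan, String> {
    for rule in rules {
        plan = rule
            .analyze(plan)
            .map_err(|e| format!("{}: {}", rule.name(), e))?;
    }
    Ok(plan)
}
```

In this model, running `ExpandWildcard` before `TypeCoercion` succeeds, while the reverse order returns an error. That implicit dependency between rules is what makes the relocation order matter.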

The role of the Analyzer is unclear to me. Having two types of "optimization" after the plan is completed doesn’t seem necessary. Instead, we should have one optimization step during plan construction and another after the plan is finalized. I believe these rules can be placed either in the SQL → LogicalPlan building stage or in the optimizer.

The doc comment on the Analyzer:

/// [`AnalyzerRule`]s transform [`LogicalPlan`]s in some way to make
/// the plan valid prior to the rest of the DataFusion optimization process.
///
/// `AnalyzerRule`s are different than an [`OptimizerRule`](crate::OptimizerRule)s
/// which must preserve the semantics of the `LogicalPlan`, while computing
/// results in a more optimal way.
///
/// For example, an `AnalyzerRule` may resolve [`Expr`](datafusion_expr::Expr)s into more specific
/// forms such as a subquery reference, or do type coercion to ensure the types
/// of operands are correct.

If a rule MUST be executed for plan validity (e.g. TypeCoercion), it should be applied during the plan creation stage, not after the plan is completed. However, if the rule is OPTIONAL for plan completion, it should be applied in the optimizer.
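
As a rough sketch of what "applied during the plan creation stage" could look like, the toy builder below coerces operand types at the moment a predicate is added, so the plan never exists in an uncoerced state. All names here (`DataType`, `Expr`, `EqPredicate`, `PlanBuilder`, `filter_eq`, `coerce_to`) are hypothetical and written for this example. They are not DataFusion's builder API, and the "widest type wins" rule is an assumption made for brevity.

```rust
// Toy model only: hypothetical types, not DataFusion's builder API.
// Variant order defines the widening order used by `max` below.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum DataType {
    Int32,
    Int64,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column { name: String, data_type: DataType },
    Cast { expr: Box<Expr>, to: DataType },
}

impl Expr {
    fn data_type(&self) -> DataType {
        match self {
            Expr::Column { data_type, .. } => *data_type,
            Expr::Cast { to, .. } => *to,
        }
    }
}

/// Wraps `expr` in a cast only when its type differs from `target`.
fn coerce_to(expr: Expr, target: DataType) -> Expr {
    if expr.data_type() == target {
        expr
    } else {
        Expr::Cast { expr: Box::new(expr), to: target }
    }
}

#[derive(Debug, Clone, PartialEq)]
struct EqPredicate {
    left: Expr,
    right: Expr,
}

#[derive(Debug, Default)]
struct PlanBuilder {
    filters: Vec<EqPredicate>,
}

impl PlanBuilder {
    /// Adds `left = right`, coercing both sides to the wider type up front
    /// (assumed coercion policy for this sketch).
    fn filter_eq(mut self, left: Expr, right: Expr) -> Self {
        let common = left.data_type().max(right.data_type());
        self.filters.push(EqPredicate {
            left: coerce_to(left, common),
            right: coerce_to(right, common),
        });
        self
    }
}
```

With this shape, comparing an `Int32` column to an `Int64` column yields a predicate whose left side is already wrapped in a cast to `Int64`, and no separate pass over the finished plan is needed.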

I propose removing the concept of the Analyzer and integrating it into the SQL → LogicalPlan stage. Specifically, TypeCoercion should be applied before the plan is finalized (#14380).

Before moving TypeCoercion into the builder, ExpandWildcardRule needs to be relocated first. The remaining three rules can be moved either into the builder or the optimizer.

Describe the solution you'd like

Requirement

Rules in the Analyzer are optional: users can choose whether to apply them and can add custom rules. This flexibility should be preserved, so each rule must remain optional and customizable after it is moved out of the Analyzer.
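
One possible way to keep that flexibility is for the builder to own a configurable list of build-time rules. The sketch below is purely illustrative: `BuildRule`, `PlanBuilder`, `with_default_rules`, `without_rule`, `with_rule`, and `LowercaseNames` are hypothetical names invented for this example and do not correspond to an existing or proposed DataFusion API.

```rust
// Toy model only: hypothetical names, not an existing DataFusion API.
#[derive(Debug, Clone, PartialEq)]
struct Plan {
    projection: Vec<String>,
}

trait BuildRule {
    fn name(&self) -> &str;
    fn apply(&self, plan: Plan) -> Plan;
}

/// Default rule: replaces "*" with the known column names.
struct ExpandWildcard {
    columns: Vec<String>,
}

impl BuildRule for ExpandWildcard {
    fn name(&self) -> &str {
        "expand_wildcard"
    }

    fn apply(&self, plan: Plan) -> Plan {
        let mut projection = Vec::new();
        for item in plan.projection {
            if item == "*" {
                projection.extend(self.columns.iter().cloned());
            } else {
                projection.push(item);
            }
        }
        Plan { projection }
    }
}

/// Example of a user-supplied rule.
struct LowercaseNames;

impl BuildRule for LowercaseNames {
    fn name(&self) -> &str {
        "lowercase_names"
    }

    fn apply(&self, plan: Plan) -> Plan {
        Plan {
            projection: plan.projection.iter().map(|c| c.to_lowercase()).collect(),
        }
    }
}

struct PlanBuilder {
    rules: Vec<Box<dyn BuildRule>>,
}

impl PlanBuilder {
    fn with_default_rules(columns: Vec<String>) -> Self {
        Self { rules: vec![Box::new(ExpandWildcard { columns })] }
    }

    /// Opts out of a rule by name.
    fn without_rule(mut self, name: &str) -> Self {
        self.rules.retain(|rule| rule.name() != name);
        self
    }

    /// Appends a custom rule after the existing ones.
    fn with_rule(mut self, rule: Box<dyn BuildRule>) -> Self {
        self.rules.push(rule);
        self
    }

    /// Builds a projection plan, applying each configured rule in order.
    fn project(&self, projection: Vec<String>) -> Plan {
        self.rules
            .iter()
            .fold(Plan { projection }, |plan, rule| rule.apply(plan))
    }
}
```

Here a user could call `without_rule("expand_wildcard")` to opt out of a default rule or `with_rule(Box::new(LowercaseNames))` to add their own, which mirrors the opt-in and opt-out behavior the Analyzer offers today.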

Tasks

Describe alternatives you've considered

No response

Additional context

No response

Labels: enhancement (New feature or request)
