Move conversion of FIRST/LAST Aggregate function to independent physical optimizer rule #9972
Comments
@alamb The design differs from the scalar function, so I'm unsure if it makes sense.
I don't understand why it wouldn't work if we kept Since I may be missing something though
To support
// fun: AggregateUDF,
fn reverse_expr(&self) -> Option<Arc<dyn AggregateExpr>> {
    self.fun.reverse_expr()
}
Therefore, we need to have To be able to return I agree that Possible solution
I agree
I wonder if we could have the function return
trait AggregateUDFImpl {
    ...
    // fun: AggregateUDF,
    fn reverse_expr(&self) -> Option<AggregateUDF>;
}
And then we can reconstruct the AggregateExpr from there? My thinking is that once we have ported all aggregates to AggregateUDF, this would be the only way to implement aggregates.
Not AggregateUDF because it does not have ordering info. But I think we can convert to How about we introduce a UDF version for
Does it need the ordering info? Or could we keep the code that makes the new Doesn't reverse simply need to know what function to use when the sort order is "reversed"?
impl AggregateUDFImpl {
    ...
    /// Returns the "reverse" of this aggregate. Reverse means the function to use when the input ordering is
    /// reversed. For example, `first_value` can be reversed to `last_value` if the ordering is reversed.
    /// if the function is its own reverse,
    fn reverse(&self) -> Option<Arc<Aggregate>>;
}
This seems to mirror how the current AggregateExpr works:
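To make the "reverse" idea above concrete, here is a minimal sketch. It does not use DataFusion's types: `ToyAggregate`, `ToyFirst`, `ToyLast`, and `evaluate_on_reversed` are hypothetical names invented for illustration, and the aggregates work on plain `i64` slices that are assumed to be already sorted.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an aggregate function; not DataFusion's trait.
trait ToyAggregate {
    fn name(&self) -> &str;

    /// Evaluate over values that are already sorted in the required order.
    fn evaluate(&self, sorted: &[i64]) -> Option<i64>;

    /// The aggregate to use when the input arrives in the opposite order,
    /// or `None` when no such aggregate is known.
    fn reverse(&self) -> Option<Arc<dyn ToyAggregate>>;
}

struct ToyFirst;
struct ToyLast;

impl ToyAggregate for ToyFirst {
    fn name(&self) -> &str {
        "first_value"
    }
    fn evaluate(&self, sorted: &[i64]) -> Option<i64> {
        sorted.first().copied()
    }
    fn reverse(&self) -> Option<Arc<dyn ToyAggregate>> {
        Some(Arc::new(ToyLast))
    }
}

impl ToyAggregate for ToyLast {
    fn name(&self) -> &str {
        "last_value"
    }
    fn evaluate(&self, sorted: &[i64]) -> Option<i64> {
        sorted.last().copied()
    }
    fn reverse(&self) -> Option<Arc<dyn ToyAggregate>> {
        Some(Arc::new(ToyFirst))
    }
}

/// Evaluate `agg` on input sorted the opposite way by swapping in its reverse.
fn evaluate_on_reversed(agg: &dyn ToyAggregate, reversed: &[i64]) -> Option<i64> {
    agg.reverse()?.evaluate(reversed)
}
```

With ascending input `[1, 2, 3]`, `ToyFirst.evaluate` yields `Some(1)`; given the same data in descending order, `evaluate_on_reversed(&ToyFirst, &[3, 2, 1])` swaps in `ToyLast` and yields the same value without re-sorting.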
We create Last with reversed ordering.
// impl AggregateExpr for FirstValuePhysicalExpr
fn reverse_expr(&self) -> Option<Arc<dyn AggregateExpr>> {
    Some(Arc::new(self.clone().convert_to_last()))
}

pub fn convert_to_last(self) -> LastValuePhysicalExpr {
    let name = if self.name.starts_with("FIRST") {
        format!("LAST{}", &self.name[5..])
    } else {
        format!("LAST_VALUE({})", self.expr)
    };
    let FirstValuePhysicalExpr {
        expr,
        input_data_type,
        ordering_req,
        order_by_data_types,
        ..
    } = self;
    LastValuePhysicalExpr::new(
        expr,
        name,
        input_data_type,
        reverse_order_bys(&ordering_req),
        order_by_data_types,
    )
}
I think we need to move
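The conversion quoted above can be tried in isolation with a self-contained sketch. Everything here is a simplified assumption: `ToySortKey`, `ToyFirstValue`, `ToyLastValue`, and `reverse_ordering` are hypothetical stand-ins for the physical expression types and for `reverse_order_bys`, and expressions are plain strings. The sketch assumes that reversing an ordering flips both the direction and the null placement.

```rust
/// Hypothetical sort key; stands in for a physical sort expression.
#[derive(Debug, Clone, PartialEq)]
struct ToySortKey {
    column: String,
    descending: bool,
    nulls_first: bool,
}

/// Hypothetical, simplified FIRST_VALUE expression.
#[derive(Debug, Clone, PartialEq)]
struct ToyFirstValue {
    name: String,
    expr: String,
    ordering: Vec<ToySortKey>,
}

/// Hypothetical, simplified LAST_VALUE expression.
#[derive(Debug, Clone, PartialEq)]
struct ToyLastValue {
    name: String,
    expr: String,
    ordering: Vec<ToySortKey>,
}

/// Flip every key (assumption: both direction and null placement are flipped).
fn reverse_ordering(keys: &[ToySortKey]) -> Vec<ToySortKey> {
    keys.iter()
        .map(|k| ToySortKey {
            column: k.column.clone(),
            descending: !k.descending,
            nulls_first: !k.nulls_first,
        })
        .collect()
}

impl ToyFirstValue {
    /// Build the equivalent LAST_VALUE over the reversed ordering.
    fn convert_to_last(self) -> ToyLastValue {
        // Rewrite a "FIRST..." display name to "LAST..."; otherwise fall back
        // to a generated name, as in the snippet quoted above.
        let name = match self.name.strip_prefix("FIRST") {
            Some(rest) => format!("LAST{rest}"),
            None => format!("LAST_VALUE({})", self.expr),
        };
        ToyLastValue {
            name,
            expr: self.expr,
            ordering: reverse_ordering(&self.ordering),
        }
    }
}
```

Using `strip_prefix` instead of the `starts_with` check plus `[5..]` slice keeps the prefix length out of the code; the behaviour is otherwise meant to match the quoted function.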
I don't know
Seems reasonable to me, but I didn't look carefully at the code. Thank you for pursuing this @jayzhan211
AggregateUDFImpl
to functions-aggregate
get_aggregate_exprs_requirement
to PhysicalOptimizerRule
I planned to move out the code that computes The problem is that I need to ensure the rule
let adjusted = if top_down_join_key_reordering {
    // Run a top-down process to adjust input key ordering recursively
    let plan_requirements = PlanWithKeyRequirements::new_default(plan);
    let adjusted = plan_requirements
        .transform_down(&adjust_input_keys_ordering)
        .data()?;
    adjusted.plan
} else {
    // Run a bottom-up process
    plan.transform_up(&|plan| {
        Ok(Transformed::yes(reorder_join_keys_to_inputs(plan)?))
    })
    .data()?
};
// Expect simply_ordering here before `ensure_distribution` is called.
let distribution_context = DistributionContext::new_default(adjusted);
// Distribution enforcement needs to be applied bottom-up.
let distribution_context = distribution_context
    .transform_up(&|distribution_context| {
        ensure_distribution(distribution_context, config)
    })
    .data()?;
Ok(distribution_context.plan)
I need to split this rule into two, so I can insert the The other approach is instead of having Since the UPDATE:
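The snippet above mixes a top-down pass with bottom-up passes, and the order in which nodes are visited is what makes the placement of a new step matter. The following sketch only illustrates that visiting order on a toy plan tree; `ToyPlan`, `node`, `visit_top_down`, and `visit_bottom_up` are hypothetical names and are not DataFusion's `transform_down` / `transform_up` API.

```rust
/// Hypothetical plan node; real execution plans carry much more state.
struct ToyPlan {
    name: String,
    children: Vec<ToyPlan>,
}

/// Convenience constructor for the toy tree.
fn node(name: &str, children: Vec<ToyPlan>) -> ToyPlan {
    ToyPlan {
        name: name.to_string(),
        children,
    }
}

/// Top-down: the parent is visited before any of its children.
fn visit_top_down<F: FnMut(&str)>(plan: &ToyPlan, f: &mut F) {
    f(&plan.name);
    for child in &plan.children {
        visit_top_down(child, f);
    }
}

/// Bottom-up: all children are visited before their parent.
fn visit_bottom_up<F: FnMut(&str)>(plan: &ToyPlan, f: &mut F) {
    for child in &plan.children {
        visit_bottom_up(child, f);
    }
    f(&plan.name);
}

/// Collect node names in top-down order.
fn order_top_down(plan: &ToyPlan) -> Vec<String> {
    let mut seen = Vec::new();
    visit_top_down(plan, &mut |n: &str| seen.push(n.to_string()));
    seen
}

/// Collect node names in bottom-up order.
fn order_bottom_up(plan: &ToyPlan) -> Vec<String> {
    let mut seen = Vec::new();
    visit_bottom_up(plan, &mut |n: &str| seen.push(n.to_string()));
    seen
}
```

For a tree `Aggregate -> Join -> [ScanA, ScanB]`, the top-down order starts at `Aggregate`, while the bottom-up order ends there, so a step that must see the aggregate's inputs already adjusted has to run after the bottom-up pass has reached it.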
Maybe @mustafasrepo has some ideas to share about this too, as he has worked extensively with that code and authored a significant amount of it.
Sorry for the late reply. I was on vacation, so I couldn't look at this earlier.
As an example use case: Consider the query
where
to align the ordering requirement with the existing ordering. Returning
where
I see. It makes sense to me.
Is your feature request related to a problem or challenge?
Part of #8708
The First / Last aggregate functions have the method `fn reverse_expr(&self) -> Option<Arc<dyn AggregateExpr>>`, which returns another new `AggregateExpr`. We can't support this method since `AggregateExpr` is in `functions-aggregate` (I plan to move it here in #9960; it doesn't work either if we keep it in `physical-expr-common`). I propose that we move `AggregateUDFImpl` to `functions-aggregate`.
The overall idea is that we move the aggregate function structs and traits, both logical and physical, to `functions-aggregate`, and keep the other common structs and traits in `datafusion-expr` and `datafusion-physical-expr-common` respectively.
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
No response