Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make ArragAgg (not ordered or distinct) into a UDAF #11045

Closed
wants to merge 6 commits into from

Conversation

eejbyfeldt
Copy link
Contributor

Which issue does this PR close?

Part of #8708

Closes #10999.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Existing tests.

Are there any user-facing changes?

This adds UDAF for ARRAY_AGG when it not disitinct or ordered. The distinct and ordered will be handled in future PRs.

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions physical-expr Physical Expressions core Core DataFusion crate labels Jun 21, 2024
@@ -37,7 +37,7 @@ async fn csv_query_array_agg_distinct() -> Result<()> {
Schema::new(vec![Field::new_list(
"ARRAY_AGG(DISTINCT aggregate_test_100.c2)",
Field::new("item", DataType::UInt32, true),
false
true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think nullable should be fixed together otherwise the test seems like regression to me 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that we need further changes to resolve this one. See my comment here: #11063

let simplify = |aggregate_function: AggregateFunction, _: &dyn SimplifyInfo| {
if aggregate_function.order_by.is_some() || aggregate_function.distinct {
Ok(Expr::AggregateFunction(AggregateFunction {
func_def: AggregateFunctionDefinition::BuiltIn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This's cool 😎

This follows how it done for input_type and only provide a single value.
But might need to be changed into a Vec in the future.

This is need when we are moving `arrag_agg` to udaf where one of the
states nullability will depend on the nullability of the input.
@alamb alamb marked this pull request as draft June 22, 2024 11:37
@alamb
Copy link
Contributor

alamb commented Jun 22, 2024

Thanks @eejbyfeldt and @jayzhan211 -- this PR seems to be failing CI so I am marking it as a draft until that gets sorted out. I am working on reducing the review backlog

@eejbyfeldt eejbyfeldt closed this Jul 1, 2024
@eejbyfeldt eejbyfeldt deleted the i10999-only-array-agg branch July 1, 2024 20:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate logical-expr Logical plan and expressions physical-expr Physical Expressions sql SQL Planner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Convert ArrayAgg to UDAF
3 participants