Skip to content

Conversation

@shruti2522
Copy link
Contributor

@shruti2522 shruti2522 commented Mar 19, 2025

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Yes

Are there any user-facing changes?

Yes

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels Mar 19, 2025
@shruti2522 shruti2522 changed the title Improve performance of min/max aggregates for Durations via GroupsAccumulator GroupsAccumulator for Duration Mar 19, 2025
@shruti2522 shruti2522 changed the title GroupsAccumulator for Duration Implement GroupsAccumulator for Duration Mar 19, 2025
@shruti2522 shruti2522 force-pushed the groups_accumulator-duration branch 4 times, most recently from 876425e to d853365 Compare March 20, 2025 11:34
@shruti2522 shruti2522 changed the title Implement GroupsAccumulator for Duration Implement GroupsAccumulator for min/max Duration Mar 20, 2025
@shruti2522 shruti2522 force-pushed the groups_accumulator-duration branch from 8ef8afc to 406aa8b Compare March 20, 2025 12:52
@shruti2522 shruti2522 marked this pull request as ready for review March 20, 2025 12:53
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @shruti2522 -- this looks great -- I think it just needs one more change

Timestamp(Nanosecond, _) => {
primitive_max_accumulator!(data_type, i64, TimestampNanosecondType)
}
Duration(Second) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to add Duration to the list of types supported in groups_accumulator_supported as well.

| Binary
| LargeBinary
| BinaryView
| Duration(_)
Copy link
Contributor Author

@shruti2522 shruti2522 Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to add Duration to the list of types supported in groups_accumulator_supported as well.

Hi @alamb, I have added it here and similarly for Min

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry -- I didn't see that for some reason. Thank you 🙏

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @shruti2522 -- looks great 🙏

@alamb alamb added the performance Make DataFusion faster label Mar 21, 2025
@alamb alamb merged commit cafe0e7 into apache:main Mar 21, 2025
27 checks passed
emilk pushed a commit to rerun-io/arrow-datafusion that referenced this pull request Apr 1, 2025
emilk pushed a commit to rerun-io/arrow-datafusion that referenced this pull request Apr 1, 2025
emilk pushed a commit to rerun-io/arrow-datafusion that referenced this pull request Apr 1, 2025
avantgardnerio pushed a commit to coralogix/arrow-datafusion that referenced this pull request Oct 1, 2025
avantgardnerio pushed a commit to coralogix/arrow-datafusion that referenced this pull request Oct 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

functions Changes to functions implementation performance Make DataFusion faster sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve performance of min/max aggregates for Durations via GroupsAccumulator

2 participants