Skip to content

Conversation

findepi
Copy link
Member

@findepi findepi commented Aug 11, 2025

Follows similar change for WindowUDFImpl, i.e. the 8494a39 commit.

Previously, the AggregateUDFImpl trait contained equals and hash_value methods with contracts following the Eq and Hash traits. However, the existence of default implementations of these methods made it error-prone, with many functions (scalar, aggregate, window) missing to customize the equals even though they ought to. There is no fix to this that's not an API breaking change, so a breaking change is warranted.

Removing the default implementations would be enough of a solution, but at the cost of a lot of boilerplate needed in implementations.

Instead, this removes the methods from the trait, and reuses DynEq, DynHash traits used previously only for physical expressions. This allows for functions to provide their implementations using no more than #[derive(PartialEq, Eq, Hash)] in a typical case.

@github-actions github-actions bot added documentation Improvements or additions to documentation logical-expr Logical plan and expressions optimizer Optimizer rules core Core DataFusion crate proto Related to proto crate functions Changes to functions implementation ffi Changes to the ffi crate labels Aug 11, 2025
@findepi findepi force-pushed the findepi/udaf-perfect-eq branch from fb676be to 8bf3c5d Compare August 11, 2025 19:51
Follows similar change for `WindowUDFImpl`, i.e. the
8494a39 commit.

Previously, the `AggregateUDFImpl` trait contained `equals` and
`hash_value` methods with contracts following the `Eq` and `Hash`
traits.  However, the existence of default implementations of these
methods made it error-prone, with many functions (scalar, aggregate,
window) missing to customize the equals even though they ought to.
There is no fix to this that's not an API breaking change, so a breaking
change is warranted.

Removing the default implementations would be enough of a solution, but
at the cost of a lot of boilerplate needed in implementations.

Instead, this removes the methods from the trait, and reuses `DynEq`,
`DynHash` traits used previously only for physical expressions. This
allows for functions to provide their implementations using no more than
`#[derive(PartialEq, Eq, Hash)]` in a typical case.
@findepi findepi force-pushed the findepi/udaf-perfect-eq branch from 8bf3c5d to 0671ef6 Compare August 11, 2025 19:59
@github-actions github-actions bot added the physical-expr Changes to the physical-expr crates label Aug 11, 2025
Copy link
Member

@timsaucer timsaucer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since all of the previous work has already merged, this PR is very simple and easy to follow. Thank you for all the work on this!

@alamb
Copy link
Contributor

alamb commented Aug 12, 2025

Looks good to me too -- thanks @findepi and @timsaucer

@findepi findepi merged commit b8bf7c5 into apache:main Aug 12, 2025
28 checks passed
@findepi findepi deleted the findepi/udaf-perfect-eq branch August 12, 2025 20:27
findepi added a commit to findepi/datafusion that referenced this pull request Aug 13, 2025
…e#17130)

Follows similar change for `WindowUDFImpl` and `AggregateUDFImpl`, i.e.
the 8494a39 and
b8bf7c5 commits.

Previously, the `ScalarUDFImpl` trait contained `equals` and
`hash_value` methods with contracts following the `Eq` and `Hash`
traits.  However, the existence of default implementations of these
methods made it error-prone, with many functions (scalar, aggregate,
window) missing to customize the equals even though they ought to.
There is no fix to this that's not an API breaking change, so a breaking
change is warranted.

Removing the default implementations would be enough of a solution, but
at the cost of a lot of boilerplate needed in implementations.

Instead, this removes the methods from the trait, and reuses `DynEq`,
`DynHash` traits used previously only for physical expressions. This
allows for functions to provide their implementations using no more than
`#[derive(PartialEq, Eq, Hash)]` in a typical case.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate documentation Improvements or additions to documentation ffi Changes to the ffi crate functions Changes to functions implementation logical-expr Logical plan and expressions optimizer Optimizer rules physical-expr Changes to the physical-expr crates proto Related to proto crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Replace AggregateUDFImpl::{equals,hash_value} with UdfHash, UdfEq traits Implement PartialEq, Hash for all UDAFs (AggregateUDFImpl)

3 participants