Rename input_type --> input_types on AggregateFunctionExpr / AccumulatorArgs / StateFieldsArgs #11666
Thanks @lewiszlw -- even though this is an API change I agree it makes the code less confusing
This confused me too the first time I looked at the code for user-defined aggregates. I assumed it was written that way because, for all existing user-defined aggregates, knowing just the type of the first argument is enough.
@lewiszlw It will be very helpful if you could please also update this example.
I checked your branch and this file doesn't exist. So you will need to merge from main to get it. |
Thanks for pointing that out. I'll update the PR in a few days.
I happened to have this PR opened locally (I get anxious with PRs that are open too long 😅 ) so I took the liberty of updating the docs as well in 1a3c5ca while I was merging up from main |
I think we don't even need a vector of input types, since we only ever read the first one. 🤔
Wouldn't we need a vector of input types if the aggregate had more than one argument? For example |
The second argument has a fixed type (f64), so we don't yet have an actual function that expects multiple input types that differ. Therefore, I suggest we remove |
That certainly sounds cleaner. I feel like we have churned this API a bunch recently, so maybe we can take a step back and ensure that whatever we come up with supports the usecases we know of now (and in the future) so we don't keep changing it 🤔 |
I plan to change I think it would be nice to have something like |
The end state in my mind now:

pub struct AccumulatorArgs<'a> {
    /// Keep. This is the return type; the name might be quite confusing.
    pub data_type: &'a DataType,
    /// We might only need one of `schema` or `dfschema`. We will likely keep `dfschema`, since we can get `schema` from it.
    pub schema: &'a Schema,
    pub dfschema: &'a DFSchema,
    /// Keep
    pub ignore_nulls: bool,
    /// Convert to physical sort exprs instead
    pub sort_exprs: &'a [Expr],
    /// Keep
    pub is_reversed: bool,
    /// We might be able to get the name from the expressions
    pub name: &'a str,
    /// Keep
    pub is_distinct: bool,
    /// Get the type from the schema and expressions
    pub input_type: &'a DataType,
    /// Convert to physical expressions
    pub input_exprs: &'a [Expr],
}
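To illustrate the "get the type from schema and expressions" idea in the struct comments above, here is a minimal sketch that derives every argument type on demand instead of storing it. All types here (`DataType`, `Schema`, `Expr`) are simplified, hypothetical stand-ins and not the real arrow / DataFusion types, and the helper `input_types` is an assumed name used only for illustration.

```rust
/// Simplified stand-in for arrow's `DataType` (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int64,
    Float64,
    Utf8,
}

/// Simplified stand-in for a schema: an ordered list of (column name, type).
pub struct Schema {
    fields: Vec<(String, DataType)>,
}

impl Schema {
    pub fn new(fields: Vec<(&str, DataType)>) -> Self {
        Schema {
            fields: fields
                .into_iter()
                .map(|(name, dt)| (name.to_string(), dt))
                .collect(),
        }
    }

    /// Look up the type of a column by name.
    pub fn data_type_of(&self, name: &str) -> Option<&DataType> {
        self.fields
            .iter()
            .find(|(field_name, _)| field_name.as_str() == name)
            .map(|(_, dt)| dt)
    }
}

/// Simplified stand-in for an expression (hypothetical).
pub enum Expr {
    Column(String),
    LiteralF64(f64),
}

impl Expr {
    /// Resolve this expression's type against a schema.
    pub fn data_type(&self, schema: &Schema) -> Option<DataType> {
        match self {
            Expr::Column(name) => schema.data_type_of(name).cloned(),
            Expr::LiteralF64(_) => Some(DataType::Float64),
        }
    }
}

/// Derive the type of every argument from the schema and expressions.
/// Returns `None` if any expression cannot be resolved.
pub fn input_types(exprs: &[Expr], schema: &Schema) -> Option<Vec<DataType>> {
    exprs.iter().map(|e| e.data_type(schema)).collect()
}
```

With this shape, an aggregate taking a column plus an f64 literal resolves to one type per argument, and callers that only need the first type can take the first element of the result.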
I agree
I think Further evidence that we don't need dfschema is that you can get a
Agree
Makes sense. Thanks @jayzhan211
So what should we do with this PR? Merge it and then revamp in a follow on PR? |
Sure, but it is better to remove |
@xinlifoobar I think it is not clear whether we should keep
Which issue does this PR close?
Closes #.
Rationale for this change
When reading this code, it confused me that AggregateFunctionExpr / AccumulatorArgs / StateFieldsArgs contain only one input type.
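A minimal sketch of what the rename amounts to, assuming a simplified stand-in `DataType` and a hypothetical `AccumulatorArgsSketch` struct (the real structs carry more fields): the single `input_type` reference becomes a slice of `input_types`, one entry per argument. The helper `first_input_type` is also an assumption, added to show that code needing only the first argument's type keeps working.

```rust
/// Simplified stand-in for arrow's `DataType` (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int64,
    Float64,
}

/// Hypothetical, reduced version of the args struct after the rename.
/// Before the rename the field would have been `input_type: &'a DataType`.
pub struct AccumulatorArgsSketch<'a> {
    /// One entry per aggregate argument, in argument order.
    pub input_types: &'a [DataType],
}

/// Existing aggregates that only care about the first argument can still
/// read it; `None` means the aggregate was called with no arguments.
pub fn first_input_type<'a>(args: &AccumulatorArgsSketch<'a>) -> Option<&'a DataType> {
    args.input_types.first()
}
```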
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?