feature: Add a WindowUDFImpl::simplify() API #9906
Hi @jayzhan211, maybe you can review this PR 😄
@guojidan I think we need a simple example. Something like simplifying
Should I create a separate new UDWF example file?
We have example of You need to write an example in
I agree with @jayzhan211 that an example would be great ❤️
Thank you for the contribution @guojidan
@@ -75,6 +78,30 @@ impl WindowUDFImpl for SmoothItUdf {
        Ok(DataType::Float64)
    }

    /// rewrite
    fn simplify(
This might be confusing as now this example UDWF will never call the actual implementation.
Perhaps we can make a second UDWF and use it as an example of wrapping an existing UDWF 🤔
datafusion/expr/src/udwf.rs
Outdated
fn simplify(
    &self,
    args: Vec<Expr>,
    _partition_by: &[Expr],
Ah, this is interesting. I was trying to figure out how we would update this signature to avoid a copy (for example, a rewrite would have to copy the `order_by` exprs) 🤔
I can't think of anything at the moment.
It looks quite hard to come up with a better backward-compatible solution.
It would be great if we did not need to decompose and then recompose the function again, something like:
Expr::AggregateFunction(AggregateFunction {
    func_def: AggregateFunctionDefinition::UDF(ref udaf),
    ref args,
    ref distinct,
    ref filter,
    ref order_by,
    ref null_treatment,
}) => {
    match udaf.simplify(args, distinct, &filter, &order_by, &null_treatment)? {
        ExprSimplifyResult::Simplified(simplified) => Transformed::yes(simplified),
        ExprSimplifyResult::Original(_) => Transformed::no(expr),
    }
}
A clone of `args` could be avoided if `simplify` returned an empty vector ... but that breaks the semantics of the API.
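One way to picture the copy-avoidance idea discussed here is to pass the arguments by value and hand them back untouched when no rewrite applies. The sketch below is only an illustration under that assumption: `Expr`, `SimplifyResult`, `simplify_call`, and `rewrite` are toy stand-ins invented for this example, not the DataFusion types or the signature this PR settled on.

```rust
// Toy stand-ins for illustration only; these are NOT the DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Call { name: String, args: Vec<Expr> },
}

/// Modelled on the shape of `ExprSimplifyResult` in the snippet above:
/// `Original` gives the untouched arguments back to the caller.
#[derive(Debug, PartialEq)]
enum SimplifyResult {
    Simplified(Expr),
    Original(Vec<Expr>),
}

/// Hypothetical rule: `identity(x)` simplifies to `x`; anything else is
/// left alone. `args` is taken by value, so neither branch clones it.
fn simplify_call(name: &str, mut args: Vec<Expr>) -> SimplifyResult {
    if name == "identity" && args.len() == 1 {
        SimplifyResult::Simplified(args.remove(0))
    } else {
        SimplifyResult::Original(args)
    }
}

/// Caller side: decompose the call, try the rule, and recompose the
/// original expression only when the rule declined to rewrite it.
fn rewrite(expr: Expr) -> Expr {
    match expr {
        Expr::Call { name, args } => match simplify_call(&name, args) {
            SimplifyResult::Simplified(e) => e,
            SimplifyResult::Original(args) => Expr::Call { name, args },
        },
        other => other,
    }
}
```

The cost of this shape is the decompose-then-recompose step on the caller side, which is the inconvenience the comment above is pointing at.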
datafusion/expr/src/udwf.rs
Outdated
    args: Vec<Expr>,
    _partition_by: &[Expr],
    _order_by: &[Expr],
    _window_frame: &WindowFrame,
It's probably better to use a builder pattern if the mandatory and optional input params are different.
I don't quite understand what to do. Can you explain further? 😄
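One possible reading of the builder suggestion is sketched below: mandatory inputs go into the constructor, and optional ones are set through chained methods with defaults. Everything here (`Expr`, `WindowSpec`, `WindowSpecBuilder`, and the choice of which fields count as optional) is a hypothetical illustration, not code from this PR or from DataFusion.

```rust
// Hypothetical types for illustration; none of these come from DataFusion.
#[derive(Debug, Clone, PartialEq)]
struct Expr(String);

#[derive(Debug, Clone, PartialEq, Default)]
struct WindowSpec {
    args: Vec<Expr>,         // mandatory
    partition_by: Vec<Expr>, // optional, defaults to empty
    order_by: Vec<Expr>,     // optional, defaults to empty
    frame: Option<String>,   // optional, None stands for "use the default frame"
}

struct WindowSpecBuilder {
    spec: WindowSpec,
}

impl WindowSpecBuilder {
    /// Mandatory inputs are required up front.
    fn new(args: Vec<Expr>) -> Self {
        Self { spec: WindowSpec { args, ..Default::default() } }
    }

    /// Optional inputs are set only when the caller has them.
    fn partition_by(mut self, exprs: Vec<Expr>) -> Self {
        self.spec.partition_by = exprs;
        self
    }

    fn order_by(mut self, exprs: Vec<Expr>) -> Self {
        self.spec.order_by = exprs;
        self
    }

    fn frame(mut self, frame: &str) -> Self {
        self.spec.frame = Some(frame.to_string());
        self
    }

    fn build(self) -> WindowSpec {
        self.spec
    }
}
```

A caller that only has arguments and an ordering would write `WindowSpecBuilder::new(args).order_by(order).build()`, and adding another optional field later would not change existing call sites.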
Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look.
Signed-off-by: guojidan <1948535941@qq.com>
I guess at some point we'd need to change |
/// this function will simplify `SimplifySmoothItUdf` to `SmoothItUdf`.
fn simplify(&self) -> Option<WindowFunctionSimplification> {
    // Ok(ExprSimplifyResult::Simplified(Expr::WindowFunction(
Reminder to clean this up.
👍
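The revised signature in the diff above returns an optional rewrite rule rather than performing the rewrite directly. A minimal toy model of that shape is sketched here, assuming the rule is a boxed closure. `WindowCall`, `Simplification`, `WindowUdf`, and `optimize` are hypothetical names for this illustration, and the real `WindowFunctionSimplification` type may take different parameters.

```rust
// Toy model for illustration; the real DataFusion types differ.
#[derive(Debug, Clone, PartialEq)]
struct WindowCall {
    func_name: String,
    args: Vec<String>,
}

/// Hypothetical alias, loosely modelled on the `WindowFunctionSimplification`
/// name that appears in the diff.
type Simplification = Box<dyn Fn(WindowCall) -> WindowCall>;

trait WindowUdf {
    fn name(&self) -> &str;

    /// `None` (the default) means "no rewrite rule", so existing
    /// implementations keep compiling and callers can skip them cheaply.
    fn simplify(&self) -> Option<Simplification> {
        None
    }
}

struct SmoothIt;
struct SimplifySmoothIt;

impl WindowUdf for SmoothIt {
    fn name(&self) -> &str {
        "smooth_it"
    }
}

impl WindowUdf for SimplifySmoothIt {
    fn name(&self) -> &str {
        "simplify_smooth_it"
    }

    /// Rewrites a call to this function into a call to `smooth_it`,
    /// keeping the arguments as they are.
    fn simplify(&self) -> Option<Simplification> {
        Some(Box::new(|call: WindowCall| WindowCall {
            func_name: "smooth_it".to_string(),
            ..call
        }))
    }
}

/// Caller side: apply the rule if the function provides one.
fn optimize(udf: &dyn WindowUdf, call: WindowCall) -> WindowCall {
    match udf.simplify() {
        Some(rule) => rule(call),
        None => call,
    }
}
```

Because the rule receives the whole call by value, it can move the arguments into the rewritten expression without cloning them, which relates to the copy concern raised earlier in the thread.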
Thanks @guojidan and @jayzhan211 for the review
I think there are some cleanups (like https://github.com/apache/datafusion/pull/9906/files#r1618798950 from @jayzhan211) that we should do, but we can do them as follow-on PRs too.
Thank you
* feature: Add a WindowUDFImpl::simplify() API
  Signed-off-by: guojidan <1948535941@qq.com>
* fix doc
  Signed-off-by: guojidan <1948535941@qq.com>
* fix fmt
  Signed-off-by: guojidan <1948535941@qq.com>
---------
Signed-off-by: guojidan <1948535941@qq.com>
Which issue does this PR close?
Closes #9527 .
Rationale for this change
Allow users to define simplification rules.
What changes are included in this PR?
Add the `WindowUDFImpl::simplify()` API.
Are these changes tested?
Are there any user-facing changes?