Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Epic] Unify WindowFunction Interface (remove built in list of BuiltInWindowFunction s) #8709

Open
11 of 12 tasks
alamb opened this issue Jan 1, 2024 · 5 comments
Open
11 of 12 tasks
Labels
enhancement New feature or request

Comments

@alamb
Copy link
Contributor

alamb commented Jan 1, 2024

Is your feature request related to a problem or challenge?

For many of the same reasons as listed on #8045, having two types of aggregate functions ("built in" --BuiltInWindowFunction and WindowUDF is problematic for two reasons:

  1. There are some features that may not be available to User Defined Window Functions (such as reversing FIRST_VALUE and LAST_VALUE)
  2. Users can not easily choose which window functions to include (which will likely be especially problematic as we work to add more functions)

Describe the solution you'd like

I propose moving DataFusion to only use WindowURFs and remove BuiltInWindowFunction for the same reasons as #8045

We will keep the existing WindowUDF interface as much as possible, while also potentially providing an easier way to define them.

Describe alternatives you've considered

Additional context

Proposed implementation steps:

@comphead
Copy link
Contributor

I'd like to try a small POC and migrate ROW_NUMBER to WindowUDF trait

@alamb
Copy link
Contributor Author

alamb commented Mar 21, 2024

I'd like to try a small POC and migrate ROW_NUMBER to WindowUDF trait

That would be awesome. Thank you

I recommend trying to put it in its own crate if possible (datfusion-window-functions perhaps?) but that doesn't have to be part of the POC

@timsaucer
Copy link
Contributor

I'm going to move back to working on datafusion-python for a bit, but would like to work on this afterwards.

Also I think we have some functions that are implemented in aggregate that should be moved to window. Namely first_value and last_value but it's worth taking a look through all the functions to see where they are defined.

Notes (for myself) on how postgresql divides them:

@jcsherin
Copy link
Contributor

jcsherin commented Oct 1, 2024

I filed a few more good first issues because of interest from new contributors,

@alamb
Copy link
Contributor Author

alamb commented Oct 1, 2024

Thank you @jcsherin -- I added it tothe list above

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants