Implement array aggregate functions #7214

Open
20 tasks
izveigor opened this issue Aug 7, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

@izveigor
Contributor

izveigor commented Aug 7, 2023

Is your feature request related to a problem or challenge?

Arrow DataFusion has many aggregate functions for scalars and columns. We can already compute an aggregate over an array by using the unnest function, but in my opinion it would be better to follow DuckDB and implement list_ aggregate functions that operate on arrays directly.
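To make the difference concrete, here is a minimal pure-Python sketch of the two approaches. It only emulates the semantics and is not DataFusion or DuckDB code: the table `rows`, the helper names, and the choice to return `None` for an empty array are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical table: each row has an id and an array column.
rows = [(1, [1, 2, 3]), (2, [10, 20]), (3, [])]


def sum_via_unnest(rows):
    """Emulate unnest followed by GROUP BY id with SUM."""
    # unnest: one output row per array element; in this emulation an
    # empty array contributes no rows at all.
    flat = [(row_id, value) for row_id, values in rows for value in values]
    totals = defaultdict(int)
    for row_id, value in flat:
        totals[row_id] += value
    return dict(totals)


def array_sum(values):
    """Aggregate a single array value directly.

    Assumption: NULL (None) elements are skipped and an array with no
    non-NULL elements yields None, mirroring SQL SUM over zero rows.
    """
    non_null = [v for v in values if v is not None]
    return sum(non_null) if non_null else None


def sum_via_array_sum(rows):
    """Apply array_sum row by row, keeping one result per input row."""
    return {row_id: array_sum(values) for row_id, values in rows}
```

In this sketch the unnest route needs a flatten step plus a regroup and loses the row whose array is empty, while the direct function keeps one result per input row.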

List

The full list of array aggregate functions:

General:

  • array_avg (alias: list_avg)
  • array_bit_and (alias: list_bit_and)
  • array_bit_or (alias: list_bit_or)
  • array_bit_xor (alias: list_bit_xor)
  • array_bool_and (alias: list_bool_and)
  • array_bool_or (alias: list_bool_or)
  • array_count (alias: list_count)
  • array_max (alias: list_max)
  • array_mean (alias: list_mean)
  • array_median (alias: list_median)
  • array_min (alias: list_min)
  • array_sum (alias: list_sum)

Statistical:

  • array_stddev (alias: list_stddev)
  • array_stddev_pop (alias: list_stddev_pop)
  • array_stddev_samp (alias: list_stddev_samp)
  • array_var (alias: list_var)
  • array_var_pop (alias: list_var_pop)
  • array_var_samp (alias: list_var_samp)

Approximate:

  • array_approx_distinct (alias: list_approx_distinct)
  • array_approx_median (alias: list_approx_median)

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

DuckDB documentation: https://duckdb.org/docs/sql/functions/nested
Apache Arrow DataFusion aggregate functions: https://arrow.apache.org/datafusion/user-guide/sql/aggregate_functions.html

@edmondop
Contributor

This seems useful and something I can look into, can I pick it up @jayzhan211 ?

@jayzhan211
Contributor

jayzhan211 commented Nov 12, 2023

> This seems useful and something I can look into, can I pick it up @jayzhan211 ?

We started with sum but ran into quite a few challenges. You can look at #7242 first. I plan to merge #7960 first and then continue on #7242. You are welcome to pick up any of the functions you are interested in.
