[Python] Add a pyarrow.Table.aggregate function to compute aggregates against the whole table #14896

Open
westonpace opened this issue Dec 9, 2022 · 2 comments

Comments

@westonpace
Member

Describe the enhancement requested

The implementation would be almost identical to pyarrow.TableGroupBy.aggregate, except that the keys list would be empty. Currently, this does not appear to be possible by passing an empty list of keys to group_by:

>>> tab.group_by([]).aggregate([("x", "sum")])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/table.pxi", line 5325, in pyarrow.lib.TableGroupBy.aggregate
  File "pyarrow/_compute.pyx", line 2145, in pyarrow._compute._group_by
  File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Cannot infer ExecBatch length without at least one value

Even if it were possible, I don't think users would find it obvious to compute a whole-table aggregate this way.

Component(s)

Python

@westonpace westonpace changed the title [python] Add a pyarrow.Table.aggregate function to compute aggregates against the whole table [Python] Add a pyarrow.Table.aggregate function to compute aggregates against the whole table Dec 9, 2022
@westonpace
Member Author

Note: we may want to wait until #14867 merges, as the corresponding C++ functionality will then be more obvious (and tested).

@coady

coady commented Feb 11, 2023

Also related to #33832: if it's worth doing for tables, it would be even more valuable on datasets or scanners, for out-of-core aggregation.
