Description
Is your feature request related to a problem or challenge?
`RepartitionExec` is often used to fan out batches from a single partition into multiple partitions. For example, if we are scanning a very big Parquet file, we use the `RepartitionExec` to take the batches we receive from the Parquet file and fan them out to multiple partitions so that the data can be processed in parallel. Note: these `RepartitionExec` nodes are often not set up by hand but rather inserted by the plan optimizer.
The current approach sets up a channel per partition and (I believe) emits batches in round-robin order. This works well when the consumer is faster than the producer (typical in simple queries) or when the workload is evenly balanced. However, when the workload is skewed, this leads to problems:
- The query can use too much memory because data builds up in the channels of the slower consumers. (Note: if all consumers are slow then all outputs will fill up, which does trigger the producer to pause.)
- There are potential performance disadvantages: some cores are ready to do processing but their queues are empty, while other cores are busy and have deep queues.
Describe the solution you'd like
Work-stealing queues come to mind. I think there's some literature on putting these to use in databases, and they can be designed fairly efficiently. Maybe there are some solid existing Rust implementations (building one from scratch might be a bit annoying).
Otherwise, a simple (and slower) mutex-based MPMC queue might be a nice alternative that at least avoids the memory issues, if not the performance issues.
There could be plenty of other approaches as well.
Describe alternatives you've considered
I don't have a good workaround at the moment.
Additional context
A smattering of Discord conversation: https://discord.com/channels/885562378132000778/1331914577935597579