The ProjectionPlan::project_batch function is inefficient and noisy

In take workloads we use a function called `ProjectionPlan::project_batch`. This function creates a `OneShotExec` that reads in the batch and a `ProjectExec` that applies the projection expressions. This is probably fine if there are actually projection expressions to evaluate. However, if all we are doing is reordering or dropping columns (or especially if the projection is an identity projection 🤦), we are adding quite a bit of unneeded overhead.

To put this into context, in a recent random access benchmarking effort I found this to be responsible for about 8% of the latency even though there was no actual projection (it was just identity).
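One possible direction is a fast path that detects when every projection expression is a bare column reference and, in that case, selects the columns directly instead of building an execution plan. The sketch below is only an illustration of that idea under stated assumptions: `Batch`, `Expr`, `as_column_selection`, and `project_fast_path` are hypothetical stand-ins written with the standard library only, not the actual Lance, Arrow, or DataFusion types, and renaming or aliasing of output columns is not modeled.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a record batch: named columns that share
// reference-counted buffers. Not the real Arrow `RecordBatch`.
#[derive(Clone, Debug)]
pub struct Batch {
    pub names: Vec<String>,
    pub columns: Vec<Arc<Vec<i64>>>,
}

// Hypothetical projection expression: either a bare column reference or
// something that needs real evaluation (represented only by its text).
#[derive(Clone, Debug)]
pub enum Expr {
    Column(String),
    Computed(String),
}

/// Returns the source column index for each output column if every
/// expression is a bare reference to a column that exists in `batch`.
/// Returns `None` as soon as one expression needs evaluation or names
/// an unknown column.
pub fn as_column_selection(batch: &Batch, exprs: &[Expr]) -> Option<Vec<usize>> {
    exprs
        .iter()
        .map(|e| match e {
            Expr::Column(name) => batch.names.iter().position(|n| n == name),
            Expr::Computed(_) => None,
        })
        .collect()
}

/// Fast path: returns `Some(batch)` when the projection only keeps,
/// drops, or reorders columns. Returns `None` when the caller should
/// fall back to the plan-based path.
pub fn project_fast_path(batch: &Batch, exprs: &[Expr]) -> Option<Batch> {
    let indices = as_column_selection(batch, exprs)?;

    // Identity projection: same columns in the same order.
    let is_identity = indices.len() == batch.columns.len()
        && indices.iter().enumerate().all(|(out, &src)| out == src);
    if is_identity {
        return Some(batch.clone());
    }

    // Reorder or drop columns by cloning the `Arc` handles; the
    // underlying column data is not copied.
    Some(Batch {
        names: indices.iter().map(|&i| batch.names[i].clone()).collect(),
        columns: indices.iter().map(|&i| Arc::clone(&batch.columns[i])).collect(),
    })
}
```

A caller would try `project_fast_path` first and only construct the plan when it returns `None`. With real Arrow batches, index-based selection of this kind is roughly what `RecordBatch::project` offers, though whether it fits `project_batch` as written is an assumption here.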

The ProjectionPlan::project_batch function is inefficient and noisy #5069

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

The ProjectionPlan::project_batch function is inefficient and noisy #5069

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions