split_batch_for_grpc_response over-splits batches after IPC deserialization #9388

@Tristan1900

Describe the bug

split_batch_for_grpc_response uses this to estimate the batch size when deciding how to split. The estimate is based on buffer.capacity(), which reports the size of the full backing allocation, not the size of the data the buffer actually refers to.
After IPC deserialization, the IPC reader creates every buffer by slicing one shared Buffer via this, which only clones the Arc. As a result, each buffer's capacity() reports the size of the entire message body rather than the size of its own slice, so the summed estimate is inflated and the batch is split into more pieces than necessary.
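
To make the effect concrete, here is a minimal, std-only sketch of the behavior described above. `SharedBuffer`, `slice_message_body`, `estimate_by_capacity` and `estimate_by_len` are hypothetical names invented for illustration. They model an Arc-backed buffer whose slices share one allocation, and are not arrow-rs's actual `Buffer` implementation.

```rust
use std::sync::Arc;

/// Simplified stand-in for a reference-counted byte buffer.
/// Hypothetical type for illustration only; not arrow-rs's `Buffer`.
#[derive(Clone)]
struct SharedBuffer {
    data: Arc<[u8]>, // backing allocation shared by every slice
    offset: usize,   // start of this buffer's window into `data`
    len: usize,      // length of this buffer's window
}

impl SharedBuffer {
    fn from_vec(bytes: Vec<u8>) -> Self {
        let len = bytes.len();
        SharedBuffer { data: Arc::from(bytes), offset: 0, len }
    }

    /// Zero-copy slice: clones the Arc and narrows the window.
    fn slice_with_length(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len, "slice out of bounds");
        SharedBuffer {
            data: Arc::clone(&self.data),
            offset: self.offset + offset,
            len,
        }
    }

    /// Bytes this buffer actually refers to.
    fn len(&self) -> usize {
        self.len
    }

    /// Size of the whole backing allocation, regardless of the window.
    fn capacity(&self) -> usize {
        self.data.len()
    }

    fn as_slice(&self) -> &[u8] {
        &self.data[self.offset..self.offset + self.len]
    }
}

/// Models a reader that carves one message body into `parts` equal buffers.
fn slice_message_body(body_len: usize, parts: usize) -> Vec<SharedBuffer> {
    assert!(parts > 0, "need at least one part");
    let body = SharedBuffer::from_vec(vec![0u8; body_len]);
    let part_len = body_len / parts;
    (0..parts)
        .map(|i| body.slice_with_length(i * part_len, part_len))
        .collect()
}

/// Size estimate that sums backing-allocation sizes (over-counts shared data).
fn estimate_by_capacity(buffers: &[SharedBuffer]) -> usize {
    buffers.iter().map(|b| b.capacity()).sum()
}

/// Size estimate that sums only the bytes each buffer refers to.
fn estimate_by_len(buffers: &[SharedBuffer]) -> usize {
    buffers.iter().map(|b| b.len()).sum()
}
```

In this model, a 1024-byte body sliced into 4 buffers gives `estimate_by_len` = 1024 but `estimate_by_capacity` = 4096, because each slice reports the full shared allocation. The overestimate grows with the number of buffers in the batch.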

To Reproduce

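Encode a batch with FlightDataEncoderBuilder, decode the result with FlightRecordBatchStream, then re-encode the decoded batch with FlightDataEncoderBuilder. The re-encoded stream contains more FlightData messages than the original FlightDataEncoderBuilder encode produced.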
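
The extra FlightData messages follow from simple arithmetic if the splitter derives its chunk count from the estimated size. The sketch below assumes a ceiling-division rule against a size limit. `estimated_chunks` is a hypothetical helper, and the exact rule used by arrow-flight may differ.

```rust
/// Hypothetical model: number of pieces a splitter would produce if it
/// divides the estimated batch size by a per-message limit, rounding up.
fn estimated_chunks(estimated_size: usize, max_message_size: usize) -> usize {
    assert!(max_message_size > 0, "limit must be non-zero");
    // Ceiling division; at least one chunk even for an empty batch.
    ((estimated_size + max_message_size - 1) / max_message_size).max(1)
}
```

For example, with a 2 MiB limit, a batch whose real data size is 1 MiB needs 1 chunk. If the same batch has 8 buffers that each report the full 1 MiB message body, the estimate becomes 8 MiB and the model yields 4 chunks.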

Expected behavior

Additional context

#5352
