Use object_store::buffered::BufWriter instead of put_multipart #9614

Closed
@tustvold

Description

Is your feature request related to a problem or challenge?

Currently we use put_multipart for streaming writes in many places. For files smaller than 10 MiB this is wasteful: a multipart upload takes 3 requests (typically create the upload, upload one part, complete the upload) when a single put would suffice.

Describe the solution you'd like

object_store 0.9.1 added BufWriter (https://docs.rs/object_store/latest/object_store/buffered/struct.BufWriter.html), which automatically switches between Put and PutMultipart based on the amount of data that has been written.
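To make the switching behaviour concrete, here is a minimal, standard-library-only sketch of the idea. It is not the object_store implementation and does not use its API: `CountingStore`, `AdaptiveWriter`, and `requests_for` are hypothetical names invented for illustration, and the store only counts requests instead of talking to a real backend. It assumes the writer buffers data up to a capacity, issues a single put if the total never exceeds that capacity, and otherwise falls back to a multipart upload.

```rust
/// Hypothetical stand-in for an object store; it only counts requests.
#[derive(Default)]
struct CountingStore {
    requests: usize,
}

impl CountingStore {
    fn put(&mut self, _data: &[u8]) {
        self.requests += 1;
    }
    fn create_multipart(&mut self) {
        self.requests += 1;
    }
    fn put_part(&mut self, _data: &[u8]) {
        self.requests += 1;
    }
    fn complete_multipart(&mut self) {
        self.requests += 1;
    }
}

/// Hypothetical writer: buffers up to `capacity` bytes, then switches
/// to a multipart upload once more data than that has been written.
struct AdaptiveWriter<'a> {
    store: &'a mut CountingStore,
    capacity: usize,
    buffer: Vec<u8>,
    multipart: bool,
}

impl<'a> AdaptiveWriter<'a> {
    fn new(store: &'a mut CountingStore, capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { store, capacity, buffer: Vec::new(), multipart: false }
    }

    fn write(&mut self, data: &[u8]) {
        self.buffer.extend_from_slice(data);
        // Switch to multipart the first time the buffer outgrows the capacity.
        if !self.multipart && self.buffer.len() > self.capacity {
            self.store.create_multipart();
            self.multipart = true;
        }
        // Once in multipart mode, flush every full part.
        if self.multipart {
            while self.buffer.len() >= self.capacity {
                let part: Vec<u8> = self.buffer.drain(..self.capacity).collect();
                self.store.put_part(&part);
            }
        }
    }

    fn finish(self) {
        if self.multipart {
            // Upload any trailing partial part, then complete the upload.
            if !self.buffer.is_empty() {
                self.store.put_part(&self.buffer);
            }
            self.store.complete_multipart();
        } else {
            // Everything fit in the buffer: one put request is enough.
            self.store.put(&self.buffer);
        }
    }
}

/// Streams `total_bytes` in 1 KiB chunks and returns the number of
/// requests the hypothetical store received.
fn requests_for(total_bytes: usize, capacity: usize) -> usize {
    let mut store = CountingStore::default();
    let mut writer = AdaptiveWriter::new(&mut store, capacity);
    let chunk = vec![0u8; 1024];
    let mut remaining = total_bytes;
    while remaining > 0 {
        let n = remaining.min(chunk.len());
        writer.write(&chunk[..n]);
        remaining -= n;
    }
    writer.finish();
    store.requests
}
```

In this sketch, `requests_for(4096, 10 * 1024)` costs one request because the data never outgrows the buffer, while `requests_for(25 * 1024, 10 * 1024)` costs five (create, three parts, complete). The real BufWriter is asynchronous and its part handling may differ; the sketch only illustrates why small writes benefit from the adaptive approach.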

Describe alternatives you've considered

We could implement our own adaptive logic in the write path within DataFusion.

Additional context

A future version of object_store is likely to significantly change put_multipart, and using BufWriter will limit the impact of that change; see apache/arrow-rs#5500.
