[C++][FS][Azure] Optimise `ObjectAppendStream::DoAppend` in the case of many small appends #40036

Tom-Newton · 2024-02-11T14:07:24Z

Describe the enhancement requested

Optimisation to #38333
Child of #18014

Currently ObjectAppendStream::DoAppend calls block_blob_client_->StageBlock synchronously meaning that the call to ObjectAppendStream::DoAppend blocks until the data has been successfully written to blob storage. This is very in-efficient for large numbers of small writes.

This performance problem is actually quite obvious just in small tests against azurite. The UploadLines function used to create test data uses std::accumulate and writes the data in one call for performance reasons.

With accumulate

[ RUN      ] TestAzuriteFileSystem.OpenInputFileMixedReadVsReadAt
[       OK ] TestAzuriteFileSystem.OpenInputFileMixedReadVsReadAt (1350 ms)

without accumulate (4096 separate calls to ObjectAppendStream::DoAppend).

[ RUN      ] TestAzuriteFileSystem.OpenInputFileMixedReadVsReadAt
[       OK ] TestAzuriteFileSystem.OpenInputFileMixedReadVsReadAt (25124 ms)

And this is when testing against azurite on localhost so against real blob storage where the latency is going to be much higher the problem will be exacerbated.

By comparison the GCS filesystem is able to handle the later approach without performance issues.

Some options to optimise:

Call block_blob_client_->StageBlock asynchronously and await all the futures in ObjectAppendStream::Flush.
Buffer small writes in memory and make fewer larger calls to block_blob_client_->StageBlock.
Buffer small writes in memory and make batched calls to block_blob_client_->StageBlock.

Component(s)

C++

The text was updated successfully, but these errors were encountered:

pitrou · 2024-02-14T11:06:38Z

You should definitely buffer small writes in memory. The S3 filesystem does that.

You should also optionally allow async writes, as in S3, and have Flush ensure all writes have finished.

OliLay · 2024-06-04T11:52:32Z

Is there anybody working on this issue? It's kind of a blocking issues as it makes writing to Azure Blob quite slow.

pitrou · 2024-06-04T12:10:02Z

@OliLay Does using a BufferedOutputStream fix the problem for you?

Though I agree this should probably be handled transparently inside the Azure FS implementation. @felipecrv

OliLay · 2024-06-04T13:22:30Z

Thanks, I haven't tried that so far, but I guess this won't fully solve the problem as writing on the stream would still block once the buffer size is exceeded as we call write on the underlying Azure output stream. So I agree that probably a non-blocking write mode + internal buffering (as it is implemented for S3) would be the best approach.

pitrou · 2024-06-04T14:43:05Z

Agreed!

OliLay · 2024-07-01T14:55:17Z

I opened a PR which addresses this issue: #43096

) ### Rationale for this change See #40036. ### What changes are included in this PR? Write buffering and async writes (similar to what the S3 file system does) in the `ObjectAppendStream` for the Azure file system. With write buffering and async writes, the input scenario creation runtime in the tests (which uses the `ObjectAppendStream` against Azurite) decreased from ~25s (see [here](#40036)) to ~800ms: ``` [ RUN ] TestAzuriteFileSystem.OpenInputFileMixedReadVsReadAt [ OK ] TestAzuriteFileSystem.OpenInputFileMixedReadVsReadAt (787 ms) ``` ### Are these changes tested? Added some tests with background writes enabled and disabled (some were taken from the S3 tests). Everything changed should be covered. ### Are there any user-facing changes? `AzureOptions` now allows for `background_writes` to be set (default: true). No breaking changes. ### Notes - The code in `DoWrite` is very similar to [the code in the S3 FS](https://github.com/apache/arrow/blob/edfa343eeca008513f0300924380e1b187cc976b/cpp/src/arrow/filesystem/s3fs.cc#L1753). Maybe this could be unified? I didn't see this in the scope of the PR though. * GitHub Issue: #40036 Lead-authored-by: Oliver Layer <o.layer@celonis.de> Co-authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>

pitrou · 2024-08-21T11:23:04Z

Issue resolved by pull request 43096
#43096

Tom-Newton added the Type: enhancement label Feb 11, 2024

github-actions bot added the Component: C++ label Feb 11, 2024

kou mentioned this issue Feb 12, 2024

[C++] Filesystem implementation for Azure Blob Storage #18014

Closed

OliLay mentioned this issue Jul 1, 2024

GH-40036: [C++] Azure file system write buffering & async writes #43096

Merged

github-actions bot assigned OliLay Jul 1, 2024

pitrou added this to the 18.0.0 milestone Aug 21, 2024

pitrou closed this as completed Aug 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[C++][FS][Azure] Optimise `ObjectAppendStream::DoAppend` in the case of many small appends #40036

[C++][FS][Azure] Optimise `ObjectAppendStream::DoAppend` in the case of many small appends #40036

Tom-Newton commented Feb 11, 2024 •

edited

Loading

pitrou commented Feb 14, 2024

OliLay commented Jun 4, 2024

pitrou commented Jun 4, 2024

OliLay commented Jun 4, 2024

pitrou commented Jun 4, 2024

OliLay commented Jul 1, 2024

pitrou commented Aug 21, 2024

[C++][FS][Azure] Optimise ObjectAppendStream::DoAppend in the case of many small appends #40036

[C++][FS][Azure] Optimise ObjectAppendStream::DoAppend in the case of many small appends #40036

Comments

Tom-Newton commented Feb 11, 2024 • edited Loading

Describe the enhancement requested

Component(s)

pitrou commented Feb 14, 2024

OliLay commented Jun 4, 2024

pitrou commented Jun 4, 2024

OliLay commented Jun 4, 2024

pitrou commented Jun 4, 2024

OliLay commented Jul 1, 2024

pitrou commented Aug 21, 2024

[C++][FS][Azure] Optimise `ObjectAppendStream::DoAppend` in the case of many small appends #40036

[C++][FS][Azure] Optimise `ObjectAppendStream::DoAppend` in the case of many small appends #40036

Tom-Newton commented Feb 11, 2024 •

edited

Loading