Splitting Schedulers used in BulkWriter between requests and responses #39260

FabianMeiswinkel · 2024-03-15T23:51:34Z

Description

BulkWriter already caps the number of operations in flight (enforced through a Semaphore). Currently, however, the Scheduler can reject response operations when the queue of the BulkWriter's bounded elastic scheduler reaches its limit. We should use onBackpressureBuffer there as well: the number of in-flight operations is capped anyway, and a rejection only forces retries.

This PR splits the schedulers for incoming requests and responses and ensures that responses (like requests) are buffered without a size limit via onBackpressureBuffer().
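
The reasoning above (a semaphore already bounds the work in flight, so an unbounded response buffer cannot grow past that bound, while a bounded queue only produces rejections that must be retried) can be sketched with plain java.util.concurrent. This is not the code from BulkWriter.scala and does not use Reactor: a ThreadPoolExecutor with a bounded or unbounded work queue stands in for the response scheduler, and the class name, method name, queue capacity and operation count are all hypothetical choices made for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not code from BulkWriter.scala.
class ResponseQueueSketch {

    // Worst case: the single response worker is stalled while every permitted
    // in-flight operation hands over its response. Returns how many hand-offs
    // the response executor rejected.
    static int countRejections(int maxInFlight, BlockingQueue<Runnable> responseQueue) {
        Semaphore inFlight = new Semaphore(maxInFlight);
        CountDownLatch gate = new CountDownLatch(1);
        ThreadPoolExecutor responses =
            new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, responseQueue);

        // The first task goes straight to the only worker thread and blocks it.
        responses.execute(() -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        int rejected = 0;
        for (int i = 0; i < maxInFlight; i++) {
            inFlight.acquireUninterruptibly(); // cap on operations in flight
            try {
                // Handling the response is what frees the permit again.
                responses.execute(inFlight::release);
            } catch (RejectedExecutionException e) {
                rejected++;         // a real writer would have to retry this operation
                inFlight.release();
            }
        }

        gate.countDown(); // un-stall the worker so the queue drains
        responses.shutdown();
        try {
            responses.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("bounded queue (capacity 2): "
            + countRejections(8, new ArrayBlockingQueue<>(2)) + " rejected");
        System.out.println("unbounded queue: "
            + countRejections(8, new LinkedBlockingQueue<>()) + " rejected");
    }
}
```

With 8 permits, the bounded queue of capacity 2 rejects 6 hand-offs, whereas the unbounded queue rejects none and never holds more than 8 entries, because no ninth operation can start before a permit is released.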

Sanity testing will happen via an integration test in a Databricks environment, using an account with a high number of partitions.

All SDK Contribution checklist:

The pull request does not introduce [breaking changes]
CHANGELOG is updated for new features, bug fixes or other significant changes.
I have read the contribution guidelines.

General Guidelines and Best Practices

Title of the pull request is clear and informative.
There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

Pull request includes test coverage for the included changes.

kushagraThapar

LGTM, thanks @FabianMeiswinkel

xinlian12

LGTM, thanks

FabianMeiswinkel · 2024-03-17T14:35:00Z

/azp run java - cosmos - spark

azure-pipelines · 2024-03-17T14:35:08Z

Azure Pipelines successfully started running 1 pipeline(s).

Update BulkWriter.scala

2ed86c4

FabianMeiswinkel requested review from kushagraThapar, kirankumarkolli, xinlian12, milismsft, aayush3011, simorenoh, jeet1995 and Pilchie as code owners March 15, 2024 23:51

github-actions bot added the Cosmos label Mar 15, 2024

kushagraThapar approved these changes Mar 16, 2024

Merge branch 'main' of https://github.com/Azure/azure-sdk-for-java in…

d16837d

…to users/fabianm/BulkWriterSchedulerFix

xinlian12 approved these changes Mar 16, 2024

FabianMeiswinkel changed the title ~~Splitting Schedulers used in BulkWriter between requests and responses~~ Release azure-cosmos-spark 4.28.4 - Splitting Schedulers used in BulkWriter between requests and responses Mar 17, 2024

FabianMeiswinkel merged commit b3322e3 into Azure:main Mar 18, 2024
34 checks passed

FabianMeiswinkel changed the title ~~Release azure-cosmos-spark 4.28.4 - Splitting Schedulers used in BulkWriter between requests and responses~~ Splitting Schedulers used in BulkWriter between requests and responses Mar 18, 2024

drielenr pushed a commit that referenced this pull request Apr 2, 2024

Update BulkWriter.scala (#39260)

1c26b12
