
[EventHubs] Buffered Producer back-pressure handling strategy #23909

Closed
yunhaoling opened this issue Apr 8, 2022 · 0 comments
Labels: Client (This issue points to a problem in the data-plane of the library.), Event Hubs, Messaging (Messaging crew)
Milestone: Backlog
yunhaoling commented Apr 8, 2022

Issue: Back-pressure

How should the EventHubProducerClient in buffered mode behave when events are enqueued while the buffer is full, or does not have enough room for the incoming events?

(Originally from comment in PR)

Anna's comment:

What's interesting is that triggering this flush will be blocking....
I wonder if the background worker should be monitoring the buffer size and trigger a background flush when this size is hit - rather than waiting until the next put_events request to see that it's full and block until it's flushed?
What if I didn't set a max_wait_time, only a max_buffer_length - but then only add an event once per minute? We wouldn't check if the buffer was full until a minute after it had been full?
Looks like this can't happen, as both max_wait_time and max_buffer_length must have values - though depending on those values, we could end up waiting before we realize the buffer has been sitting full since the last event was added.

Sample case:

max_buffer_length is 100, there are 50 events in the buffer, and 60 more coming - only 50 slots are free, so the buffer cannot take them all without flushing first.


Scope

  1. the timeout behavior
  2. the back-pressure handling strategy

Current implementation/behavior:

When the EventHubProducerClient works in buffered mode and send_event/send_batch is called with a timeout, the timeout controls the enqueue operation, not the send operation.

If there are fewer free slots in the buffer than incoming events, we flush first with the given timeout, then check the remaining time: if it has timed out we raise an error, otherwise we put the events into the buffer.

pseudocode:

if there_is_room_in_buffer:
    enqueue_events()
else:
    flush(timeout)  # proactively clear the buffer, blocking within the caller's timeout
    if check_time_out_after_flush:  # the flush consumed the whole timeout
        raise timeout
    else:
        enqueue_events()
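
A minimal runnable sketch of this proactive-flush behavior (BufferedPartitionQueue, _flush, and put_events are illustrative names and assumptions, not the library's actual internals):

# Sketch only; not the actual azure-eventhub implementation.
import time
from collections import deque
from threading import Lock

class BufferedPartitionQueue:
    def __init__(self, max_buffer_length):
        self._max_buffer_length = max_buffer_length
        self._buffer = deque()
        self._lock = Lock()

    def _flush(self, timeout):
        # Placeholder: send everything currently buffered downstream, then clear.
        with self._lock:
            self._buffer.clear()

    def put_events(self, events, timeout):
        deadline = time.monotonic() + timeout
        with self._lock:
            room = self._max_buffer_length - len(self._buffer)
        if room < len(events):
            self._flush(timeout)                 # blocking, proactive flush
            if time.monotonic() >= deadline:     # the flush used up the whole timeout
                raise TimeoutError("enqueue timed out while flushing the buffer")
        with self._lock:
            self._buffer.extend(events)

With the sample case above (max_buffer_length of 100, 50 events buffered, 60 incoming), room is 50 < 60, so the blocking flush runs before the new events are enqueued.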

Other options

  1. In general, we need a way for EventHubProducerClient.send_event/batch in buffered mode to flush the buffer queue.
    If not via a proactive call, we could use a flag/condition to decouple send_event/batch from the flush operation.

We could add another flag and reuse the check_max_wait_time_future thread for flushing. This has almost the same effect as the current implementation, but decouples the flush from enqueue_events. However, the timing is not perfectly controlled, since check_max_wait_time_future sleeps periodically (unless we also adjust its sleep period).

self._need_flush = threading.Event()  # or Condition; signals the worker to flush
self._not_full = threading.Event()    # set by the worker once the buffer has room

if there_is_room_in_buffer:
    enqueue_events()
else:
    try:
        self._not_full.clear()
        self._need_flush.set()        # ask the background worker to flush
        self._not_full.wait(timeout)  # block until the worker frees up room
        if check_time_out_after_flush:
            raise timeout
        else:
            enqueue_events()
    finally:
        self._need_flush.clear()      # reset the flag for the next caller
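
For completeness, a sketch of the worker side in the same pseudocode style as the snippet above - how the existing check_max_wait_time_future loop could honor this flag. _running, _sleep_period, _flush, and _max_wait_time_elapsed are assumed names, not the actual implementation:

def _check_max_wait_time_worker(self):
    while self._running:
        if self._need_flush.is_set() or self._max_wait_time_elapsed():
            self._flush()            # drain the buffer
            self._not_full.set()     # wake any producer blocked in send_event/batch
        time.sleep(self._sleep_period)   # back-pressure latency is bounded by this period
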
  2. We introduce a throttling mechanism (see the monitor sketch below).

Another background task monitors the load of the buffer; if it is, say, 70% full, we flush.

  3. Hybrid of 1 + 2.

Option 2 alone still doesn't handle corner cases well: for example, if the buffer is 65% full but the incoming events would take more than 35% of the capacity, we still need a flush.
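
A sketch of the option 2 monitor, assuming a buffered queue object that exposes load() and flush(); the 70% threshold, the check interval, and all names here are illustrative:

import threading
import time

class BufferLoadMonitor:
    # Illustrative sketch of option 2; not the actual azure-eventhub implementation.
    def __init__(self, buffered_queue, threshold=0.7, check_interval=0.1):
        self._queue = buffered_queue          # assumed to expose load() and flush()
        self._threshold = threshold
        self._check_interval = check_interval
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._worker.start()

    def stop(self):
        self._stop.set()
        self._worker.join()

    def _run(self):
        while not self._stop.is_set():
            if self._queue.load() >= self._threshold:   # e.g. buffer is 70% full
                self._queue.flush()
            time.sleep(self._check_interval)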


Summary

The proactive flush should be a good starting point: it is easy to explain and the implementation is straightforward.
However, depending on changing customer requirements, we may want to consider the other options.
