
fix(queue): add failing test for edge case max_batch_size=max_entries #11431

Merged
merged 1 commit into master from bug/queue-assertion-fail on Aug 22, 2023

hanshuebner
Contributor

Summary

With the edge-case configuration max_batch_size=max_entries, the queue can fail with an assertion error when removing the frontmost element. This is most likely when the callback fails repeatedly (e.g. because the backend system receiving the data is unavailable).

What happens:

  1. we add max_batch_size entries, each of which "posts" a resource
  2. the batch queue consumes all of those resources in `process_once` by `wait()`ing for them, but then gets stuck processing/sending the batch
  3. while `process_once` is stuck (until `max_retry_time` has passed), it does not run `delete_frontmost_entry()`, which is what actually moves the `front` reference
  4. when the next item is enqueued, the queue tries to drop the oldest entry, but triggers the assertion in queue.lua because no resources are left
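
The sequence above can be replayed with a toy model. This is a sketch in Python, not the code in queue.lua: the class `ToyQueue`, its methods, the `replay` helper, and the plain integer standing in for the semaphore are all assumptions made for illustration.

```python
from collections import deque


class ToyQueue:
    """Toy model of the race; NOT Kong's queue.lua (all names are hypothetical)."""

    def __init__(self, max_entries, max_batch_size):
        self.max_entries = max_entries
        self.max_batch_size = max_batch_size
        self.entries = deque()
        self.resources = 0  # integer stand-in for the semaphore count

    def enqueue(self, entry):
        if len(self.entries) == self.max_entries:
            # Queue is full: dropping the oldest entry needs one resource.
            assert self.resources > 0, "no resources left"
            self.resources -= 1
            self.entries.popleft()
        self.entries.append(entry)
        self.resources += 1  # step 1: every enqueue "posts" a resource

    def start_batch(self):
        # Step 2: consume one resource per batched entry.
        n = min(self.max_batch_size, self.resources)
        self.resources -= n
        # Step 3: the entries stay queued while the send is stuck.
        return list(self.entries)[:n]


def replay(max_entries, max_batch_size):
    q = ToyQueue(max_entries, max_batch_size)
    for i in range(max_entries):
        q.enqueue(i)
    q.start_batch()  # the batch is now "in flight" and never completes
    try:
        q.enqueue("next")  # step 4
    except AssertionError as err:
        return str(err)
    return "ok"
```

In this model `replay(2, 2)` hits the assertion, while `replay(2, 1)` does not, because a smaller batch leaves a resource available for the drop.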

This commit fixes #11377 by taking the entries that are currently being processed out of the race condition window.
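
One possible reading of that sentence, sketched as a toy model in Python: the batch is removed from the queue at the moment its resources are consumed, so a later enqueue never tries to drop an entry that is already in flight. The actual change in queue.lua is not shown on this page; `ToyQueueFixed`, `take_batch`, and `replay_fixed` are hypothetical names, and the integer counter is only a stand-in for the semaphore.

```python
from collections import deque


class ToyQueueFixed:
    """Toy model of the idea; NOT the merged change (all names are hypothetical)."""

    def __init__(self, max_entries, max_batch_size):
        self.max_entries = max_entries
        self.max_batch_size = max_batch_size
        self.entries = deque()
        self.resources = 0  # integer stand-in for the semaphore count

    def enqueue(self, entry):
        if len(self.entries) == self.max_entries:
            # Queue is full: dropping the oldest entry needs one resource.
            assert self.resources > 0, "no resources left"
            self.resources -= 1
            self.entries.popleft()
        self.entries.append(entry)
        self.resources += 1

    def take_batch(self):
        n = min(self.max_batch_size, self.resources)
        self.resources -= n
        # Remove the batch from the queue before the slow send starts,
        # so the resource count and the queue length stay in step.
        return [self.entries.popleft() for _ in range(n)]


def replay_fixed(max_entries, max_batch_size):
    q = ToyQueueFixed(max_entries, max_batch_size)
    for i in range(max_entries):
        q.enqueue(i)
    in_flight = q.take_batch()  # the send may now be stuck for a long time
    q.enqueue("next")           # no drop needed: the queue has free slots
    return in_flight, list(q.entries), q.resources
```

With `max_entries` equal to `max_batch_size`, the enqueue that follows a stuck batch finds an empty queue in this model, so the drop path and its assertion are never reached.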

Checklist

Issue reference

#11377

@kikito kikito merged commit 8ce3508 into master Aug 22, 2023
25 checks passed
@kikito kikito deleted the bug/queue-assertion-fail branch August 22, 2023 10:10

Successfully merging this pull request may close these issues.

New queue fails in edge case max_batch_size=max_entries with assertion error