Fix hang in bulk helper semaphore when server responses are slower than flushInterval #2027
Before it was waiting until after semaphore resolved, then sending with a reference to bulkBody. If flushInterval is reached after `await semaphore()` but before `send(bulkBody)`, onFlushTimeout is "stealing" bulkBody so that there is nothing left in bulkBody for the flushBytes block to send, causing an indefinite hang for a promise that does not resolve.
I've published this to npm as |
Your explanation makes sense, but I'm unable to map it back to the code, sorry.
@pquentin Thanks. 🖤 Mostly tagged dynamic clients team members for visibility because this bug came up in conversation in our Monday team meeting. It still bends my mind a bit, too, and I can't find a clear explanation for why an empty |
This pull request is stale because it has been open 90 days with no activity. Remove the |
The proposed solution of storing `bulkBody` before the `await semaphore()` makes sense. I'm wondering whether the issue arose because of the `slice()` usage, since it creates a shallow copy of the array.
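
For reference, a small sketch of what a shallow copy from `Array.prototype.slice()` does and does not share with the original array. The `bulkBody` contents and the `demoSlice` function here are made up for illustration and are not taken from the helper's source; the snippet says nothing about whether `slice()` was actually involved in the bug.

```javascript
// Illustrative only: the rows below are hypothetical bulk body entries.
function demoSlice() {
  const bulkBody = [{ index: { _id: '1' } }, { title: 'a' }];
  const first = bulkBody[0];

  // slice() with no arguments returns a new array holding the same element references.
  const copy = bulkBody.slice();

  // Emptying the original array in place does not shrink the copy.
  bulkBody.length = 0;

  return {
    originalLength: bulkBody.length,
    copyLength: copy.length,
    sameFirstElement: copy[0] === first
  };
}

console.log(demoSlice());
```

The copy keeps its own list of entries, so clearing the original afterwards leaves the copy intact, while the row objects themselves are still shared between the two arrays.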
…an flushInterval (#2027) * Set version to 8.10.1 * Add tests for bulk helper with various flush and server timeouts * Copy and empty bulkBody when flushBytes is reached Before it was waiting until after semaphore resolved, then sending with a reference to bulkBody. If flushInterval is reached after `await semaphore()` but before `send(bulkBody)`, onFlushTimeout is "stealing" bulkBody so that there is nothing left in bulkBody for the flushBytes block to send, causing an indefinite hang for a promise that does not resolve. * comment typo fixes --------- Co-authored-by: Quentin Pradet <quentin.pradet@elastic.co> (cherry picked from commit 1607a0d)
…an flushInterval (#2027) (#2129) * Set version to 8.10.1 * Add tests for bulk helper with various flush and server timeouts * Copy and empty bulkBody when flushBytes is reached Before it was waiting until after semaphore resolved, then sending with a reference to bulkBody. If flushInterval is reached after `await semaphore()` but before `send(bulkBody)`, onFlushTimeout is "stealing" bulkBody so that there is nothing left in bulkBody for the flushBytes block to send, causing an indefinite hang for a promise that does not resolve. * comment typo fixes --------- Co-authored-by: Quentin Pradet <quentin.pradet@elastic.co> (cherry picked from commit 1607a0d) Co-authored-by: Josh Mock <joshua.mock@elastic.co>
Applies patches from elastic/elasticsearch-js#2199 and elastic/elasticsearch-js#2027, adding support for an onSuccess callback and fixing a bug that would cause the helper to hang when the flushInterval was lower than the request timeout. --------- Co-authored-by: JoshMock <160161+JoshMock@users.noreply.github.com>
A possible fix for #1562.
The failing state in the above issue is reached when a server's response times are slower than `flushInterval`. What happens in this situation, in this order:
1. The `flushBytes` block awaits `semaphore()`
2. `onFlushTimeout` is triggered
3. `onFlushTimeout` "steals" the bulk body rows that the `flushBytes` block is waiting to send by emptying `bulkBody` before awaiting a semaphore
4. When the `flushBytes` block's semaphore resolves, it tries to `send(bulkBody)` but `bulkBody` is empty

By having `flushBytes` set its blocks aside before awaiting the semaphore, like `onFlushTimeout` already does, it appears to solve the situation where a Promise sits unresolved indefinitely.
@delvedor I'm particularly interested in your thoughts on this solution, since you wrote the helper and chose the semaphore strategy.
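
The ordering described above can be modelled with a toy semaphore. This is a minimal sketch, not the helper's actual code: `createSemaphore`, `simulate`, `flushBytesBlock`, and the `sent` log are hypothetical names, and "sending" is reduced to recording which rows each path would have sent. It only illustrates the "stealing" step; it does not reproduce the hang itself.

```javascript
// Toy semaphore (an assumption for illustration): acquire() resolves when a slot is free.
function createSemaphore(concurrency) {
  let running = 0;
  const queue = [];
  return {
    acquire() {
      if (running < concurrency) {
        running++;
        return Promise.resolve();
      }
      return new Promise(resolve => queue.push(resolve));
    },
    release() {
      const next = queue.shift();
      if (next) next(); // hand the slot directly to the next waiter
      else running--;
    }
  };
}

async function simulate(copyBeforeAwait) {
  const semaphore = createSemaphore(1);
  const sent = []; // records what each path would have sent
  let bulkBody = ['row1', 'row2'];

  // A slow in-flight request is holding the only slot.
  await semaphore.acquire();

  async function flushBytesBlock() {
    if (copyBeforeAwait) {
      // Proposed ordering: set the rows aside first, then wait.
      const copy = bulkBody;
      bulkBody = [];
      await semaphore.acquire();
      sent.push({ from: 'flushBytes', rows: copy });
    } else {
      // Original ordering: wait first, then read bulkBody.
      await semaphore.acquire();
      sent.push({ from: 'flushBytes', rows: bulkBody });
      bulkBody = [];
    }
    semaphore.release();
  }

  async function onFlushTimeout() {
    if (bulkBody.length === 0) return;
    const copy = bulkBody; // already copies before awaiting
    bulkBody = [];
    await semaphore.acquire();
    sent.push({ from: 'flushTimeout', rows: copy });
    semaphore.release();
  }

  const a = flushBytesBlock(); // step 1: waits on the semaphore
  const b = onFlushTimeout();  // steps 2 and 3: the timer fires while it waits
  semaphore.release();         // the slow request finally completes
  await Promise.all([a, b]);   // step 4 happens once the slot is handed over
  return sent;
}

simulate(false).then(sent => console.log('await first:', JSON.stringify(sent)));
simulate(true).then(sent => console.log('copy first:', JSON.stringify(sent)));
```

In this model, awaiting first leaves the `flushBytes` path with an empty array while the timeout path carries the rows. Copying first means the `flushBytes` path sends both rows and the timeout path finds nothing to do.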