Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vtgate/buffer: Fix leakage of buffer pool slots due to canceled requests. #2599

Merged
merged 1 commit into from
Feb 26, 2017

Conversation

michael-berlin
Copy link
Contributor

Canceled requests, e.g. those with a deadline shorter than the failover duration, would not return their buffer pool slot. Eventually, all slots would have leaked and the buffer would start and stop buffering as usual but reject all requests immediately because it assumed there are other pending failovers which are holding the needed slots.

Previously, the code path for canceled requests differed from the common method unblockAndWait() which would unblock a request and also release its buffer pool slot after the request finished its retry. This made this oversight possible. I've changed this now: canceled requests also use unblockAndWait() now.

The code was also out of sync with the documentation for WaitForFailoverEnd(): The RetryDoneFunc() must not be returned when the function returns an error as well. However, for canceled requests both were returned.

…sts.

Canceled requests, e.g. those with a deadline shorter than the failover duration, would not return their buffer pool slot. Eventually, all slots would have leaked and the buffer would start and stop buffering as usual but reject all requests immediately because it assumed there are other pending failovers which are holding the needed slots.

Previously, the code path for canceled requests differed from the common method unblockAndWait() which would unblock a request and also release its buffer pool slot after the request finished its retry. This made this oversight possible. I've changed this now: canceled requests also use unblockAndWait() now.

The code was also out of sync with the documentation for WaitForFailoverEnd(): The RetryDoneFunc() must not be returned when the function returns an error as well. However, for canceled requests both were returned.
@michael-berlin michael-berlin merged commit 5705d03 into vitessio:master Feb 26, 2017
@michael-berlin michael-berlin deleted the fix_buffer_leak branch February 26, 2017 18:40
Copy link
Contributor

@sougou sougou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some drive-by comments.


// Size returns the current number of available slots.
func (sem *Semaphore) Size() int {
return len(sem.slots)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this racy, unless the language guarantees it. I couldn't find any text about this.

// returned when the request has finished. See also shardBuffer.unblockAndWait().
func waitForPoolSlots(b *Buffer, want int) error {
start := time.Now()
for {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a tight loop and will hang in non-preemptive implementations of the runtime. Even otherwise, it's something we should avoid.

michael-berlin added a commit to michael-berlin/vitess that referenced this pull request Apr 19, 2017
Addressed code review comment in vitessio#2599.
frouioui pushed a commit to planetscale/vitess that referenced this pull request Mar 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants