-
Notifications
You must be signed in to change notification settings - Fork 4.6k
Failing to delete the stream from the activeStreams
map leading to REFUSED_STREAM
errors.
#8504
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: alanprot <alanprot@gmail.com>
ccc4735
to
d09c515
Compare
activeStreams
map leading to "REFUSED_STREAM" errors.activeStreams
map leading to REFUSED_STREAM
errors.
Hi @alanprot, thanks for sharing the test to repro. I haven't debugged the test, but wanted to ask if you know why the check added causes stream deletion to be skipped? From a look quick look at the places grpc-go/internal/transport/http2_server.go Lines 1332 to 1343 in 57b69b4
grpc-go/internal/transport/http2_server.go Lines 1350 to 1360 in 57b69b4
In both places, Edit: It seems like the |
Hi @arjan-bal As far i could see the problem is we add set the clean and add the hdr on the controlbuff at grpc-go/internal/transport/http2_server.go Line 1338 in 19c720f
And we get to: grpc-go/internal/transport/controlbuf.go Lines 692 to 696 in 19c720f
And at this point, the stream is not on the grpc-go/internal/transport/controlbuf.go Lines 947 to 950 in 19c720f
And i think this stream never become added back on the So it seems that it only happens when the ServerSide is sending tons of data and waiting for quota? |
Thank you @alanprot, I think I understand the sequence of events leading to this.
The simplest fix is to remove the guard block added in @dfawley what do you think? |
That makes sense to me. I can't think of any other way to do this. Does the transport properly handle the rest of this situation, especially: dropping the queued data frames when the RST_STREAM is sent (or shortly thereafter)? |
Also maybe we wanna consider decrementing this metric only if the stream was present on the activeStreams map? grpc-go/internal/transport/http2_server.go Lines 1307 to 1323 in 18ee309
|
All the pending Data and Header frames are dropped in cleanupStreamHandler.
The check for max concurrent streams depends on the size of the activeStreams map. Without timeout: With timeout: |
@alanprot are you interested in raising a PR with the fix and a unit test for catching regressions? |
@arjan-bal I can definitely do that, but I’m out until Tuesday, so I’d only be able to pick it up then. :) |
Description
Hi,
After upgrading
go-grpc
, we started see the following error:Upon investigation, it appears that in some scenarios, we fail to properly delete streams from the
activeStreams
map, resulting in this error being wrongly returned.I added a test case that seems to trigger the problem:
PS: To test the behavior i added a "ActiveStreamTracker" hook -> im not sure this is the best way but i wanted just to show the problem
The issue seems to have been introduced in the following PR: #8071
Specifically, the check added here:
grpc-go/internal/transport/http2_server.go
Lines 1357 to 1359 in 57b69b4
If this conditional is removed, all streams are correctly cleaned up from the
activeStreams
map, and the test passes without leaks.