Data and SEND_SHUTDOWN won't get sent to the peer #2063
Comments
I see the data queued on the server side, but it's never put into a packet. This likely means the stream is blocked, but I don't see all the events for the stream back to when it was created. We should have at least one of these events:
QuicTraceEvent(
    StreamOutFlowBlocked,
    "[strm][%p] Send Blocked Flags: %hhu",
    Stream,
    Stream->OutFlowBlockedReasons);
Do you still have all the logs?
msquic.1508.log
I still have them all, from the whole run, but it's tens of GBs of data.
OK. So the stream isn't blocked, but the connection is, on flow control (FC):
Notice the
You probably mean in general, at the connection level? Not for this particular stream?
Correct.
So I assume that msquic expects the app to drain all data from all streams before it shuts them down and closes them? Even if we abort reads?
No, if you abort the stream receive path, then you don't have to drain the receives.
We're aborting reads from the stream's Dispose in case we didn't reach a RECEIVE event with the FIN flag: So we shouldn't have any data pending on the connection. We also don't have any pending (undisposed) streams on the connection AFAIK. Do you know of any other scenario or behavior that might cause this?
I'm double-checking the code and will get back to you shortly.
Well, uggg. It looks like you found another bug around receive abort. We're not correctly updating the connection-wide FC if the app aborts the stream receive without completing the receive first. I think I can quickly have a fix for you to test, though.
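To make the failure mode described above concrete, here is a minimal toy model of receiver-side, connection-level flow control. It is only a sketch under stated assumptions: every name in it (`ToyConn`, `ToyStream`, `toy_run`, and so on) is hypothetical and none of it is MsQuic's actual code or API. It assumes the receiver raises the limit it advertises to the peer only when received bytes are "released", either because the app completed the receive or because the receive path was aborted. If the abort path forgets to release the pending bytes, the peer's connection-wide credit shrinks with every aborted stream until the peer can no longer send anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not MsQuic internals. */
typedef struct ToyConn {
    uint64_t window;          /* size of the connection receive window */
    uint64_t max_data;        /* limit currently advertised to the peer */
    uint64_t bytes_received;  /* total bytes received across all streams */
    uint64_t bytes_released;  /* bytes completed or discarded by the receiver */
} ToyConn;

typedef struct ToyStream {
    ToyConn* conn;
    uint64_t pending;         /* received but not yet completed by the app */
} ToyStream;

void toy_conn_init(ToyConn* c, uint64_t window) {
    c->window = window;
    c->max_data = window;
    c->bytes_received = 0;
    c->bytes_released = 0;
}

/* Bytes the peer is still allowed to send on this connection. */
uint64_t toy_conn_peer_credit(const ToyConn* c) {
    return c->max_data - c->bytes_received;
}

/* Releasing bytes slides the advertised limit forward. */
static void toy_conn_release(ToyConn* c, uint64_t len) {
    c->bytes_released += len;
    c->max_data = c->bytes_released + c->window;
}

/* Returns false if the peer has no connection-level credit for len bytes. */
bool toy_stream_receive(ToyStream* s, uint64_t len) {
    if (len > toy_conn_peer_credit(s->conn)) {
        return false;
    }
    s->conn->bytes_received += len;
    s->pending += len;
    return true;
}

/* Normal path: the app completes the receive, so credit is returned. */
void toy_stream_receive_complete(ToyStream* s, uint64_t len) {
    assert(len <= s->pending);
    s->pending -= len;
    toy_conn_release(s->conn, len);
}

/* Abort path: release_pending == false models the leak, true models a fix. */
void toy_stream_abort_receive(ToyStream* s, bool release_pending) {
    if (release_pending) {
        toy_conn_release(s->conn, s->pending);
    }
    s->pending = 0;
}

/* Each stream receives per_stream bytes and then aborts its receive path
 * without completing it. Returns the peer's remaining connection credit. */
uint64_t toy_run(uint64_t window, unsigned streams, uint64_t per_stream,
                 bool release_pending) {
    ToyConn conn;
    toy_conn_init(&conn, window);
    for (unsigned i = 0; i < streams; i++) {
        ToyStream s = { &conn, 0 };
        if (!toy_stream_receive(&s, per_stream)) {
            break; /* the peer is blocked on connection flow control */
        }
        toy_stream_abort_receive(&s, release_pending);
    }
    return toy_conn_peer_credit(&conn);
}
```

In this model, calling `toy_run(1000, 10, 100, false)` leaves the peer with no credit at all, while the same run with `release_pending` set to `true` keeps the full window available. That matches the symptom in this thread: a connection that works for a long time and then stalls once enough aborted receives have accumulated.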
Describe the bug
This is fairly reproducible in our HTTP/3 stress tests. At around the 20-minute mark (~300,000 requests), a request fails with a timeout. It doesn't seem like we're doing anything wrong, but it still might be our fault. I also collected a dump about 5 seconds after the request got stuck and haven't seen anything suspicious in it.
Attached log:
client stream 0x7f18f4010150
server stream 0x7f6b80027090
The request gets sent around 02:36:30, and at 02:36:40 it gets cancelled on a timeout, so the streams eventually get shut down and closed.
This is the filtered msquic.log (full msquic.log):
Also, I'm running a debug build of msquic.
Affected OS
Additional OS information
Arch Linux kernel 5.14
We're also seeing the same timeout in our stress pipeline, which runs in a Debian bullseye Docker image.
MsQuic version
main
Steps taken to reproduce bug
Run the dotnet/runtime HTTP/3 stress tests for about 20 minutes and you'll see an HTTP request fail with a timeout.
Expected behavior
The client receives the data and SEND_SHUTDOWN from the server.
Actual outcome
It seems like the server tries to send the data and shut down its sending side, but this never gets reported on the client side (no RECEIVE event, no PEER_SEND_SHUTDOWN).
Additional details
No response