-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crash in ConnectionImpl::onMessageBeginBase #3337
Comments
This is not websocket-specific. I deleted the "use_websocket" line from the config and get the same result. |
|
Looking at a packet capture, an HTTP response was sent BEFORE the HTTP request for the health check. Basically as soon as the connection was established, a response was sent. |
Yeah we probably just need a check for this and then convert it to an invalid protocol exception. cc @alyssawilk |
FWIW, I have a repeater for this using #3338 and a corpus that local fuzzing discovered. |
gives
|
I'm also seeing this with the same stack trace. I see a read error in my trace-level log that isn't mentioned above, so figured I'd post my log as well:
EDIT: I should add that the C1 connection there is connected to a client cert service (i.e. one with this API), in case that helps track anything down. |
I've been looking at this for a bit, and it's not as straightforward as I thought. Also, I think the fuzz test failure is slightly different from the original issue. I will figure this out fully tomorrow. |
I've opened nodejs/http-parser#432 to fix what I think is an http-parser bug. I will also test the original websocket issue to see if it's the same issue. |
I just confirmed the original issue is the same as the fuzz test issue. I will wait to see what upstream says about my fix. |
Nice, thanks Matt! |
Yeah, thanks @mattklein123 for investigating and submitting the upstream fix. 👍 Should we poke around a bit more on envoy's side though? It seems unlikely that my super-simple golang HTTP server is responding with a CR/LF before the HTTP header. Could there be a reused connection that needs to flush out trailing CR/LF before starting the next request? |
@llchan yes, that's exactly what is happening. My upstream fix will flush CRL/LF. |
I guess my question is more like, how did it even get into that state where there's an extra CR/LF sitting around? Shouldn't the Content-Length of the previous response give us the right size to read off? Is there some streaming parser (e.g. JSON) that consumes the data but not the newline after it? |
I have no idea, I didn't dig that deeply. AFAICT the parser should be eating extra CR/LF between responses and that's what I fixed it to do. |
Now that the http-parser PR is merged, what's the game plan? Do we need to wait for a tagged http-parser release, or can we point at the latest master? |
@llchan it has not been merged. Please ping the PR there and try to get them to merge it. |
Derp, my bad, misread the notification. It got the second approval it was awaiting though, so hopefully they actually merge it soon. I'm seeing many crashes a day from this so just trying to keep things moving. |
Requires pulling upstream http-parser fix. Fixes #3337 Signed-off-by: Matt Klein <mklein@lyft.com>
Thanks for the persistence in getting this in! |
Description:
Envoy crashes with the supplied configuration file, and using tcpkali as a server.
I tested with Envoy trunk on macOS, using f95269d.
Repro steps:
Config:
Call Stack:
The text was updated successfully, but these errors were encountered: