-
Notifications
You must be signed in to change notification settings - Fork 271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bias HTTP detection towards availability #894
Conversation
1b97e57 changed HTTP detection to bias towards correctness. Instead of performing a single read, the HTTP detection module reads data until a newline is encountered so that a dispositive determination can be made. Practically, this causes many non-HTTP protocols to fail with a connection timeout because a newline is never transmitted. This change reverts the HTTP detection logic to perform a single read and to use that data to make a determination. A minimum of 14B is required to handle a connection as HTTP.
d3ff1bd
to
bd05c96
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change reverts the HTTP detection logic to perform a single read
and to use that data to make a determination. A minimum of 14B is
required to handle a connection as HTTP.
As we discussed offline, it seems like a potential compromise (which would require a somewhat bigger code change) would be to make multiple reads, but fall back to transparent forwarding when detection times out, rather than failing the connection. This would let us avoid false negatives in the case where a single read doesn't return enough data to make a protocol determination, but would still bias for availability. I'm not sure whether or not this is necessary though. It seems pretty unlikely that we'll need multiple reads to detect HTTP.
We may want to do this in a follow-up regardless, but this change seems good to me for now.
This release changes HTTP protocol detection to prevent timeout errors in two ways: 1. HTTP detection no longer blocks until a newline is read. We've reverted to relying on a single read to make a determination. 2. Detection timeouts are no longer terminal. When a timeout is encountered, we continue forwarding the connection as an opaque TCP connection. These changes may lead to false-negatives--we may fail to detect some HTTP streams--but it should prevent many avoidable detection errors. This release also makes improvements for multicluster gateways, improving caching so that profile lookups are only performed once per target service. Diagnostic `stack_*` metrics have been moved so that they track underlying services, ignoring fail-fast. This should help us get better insights into services that are in failfast. Finally, the opencensus exporter has been improved to ensure that trace events are flushed if the trace buffer is not filled within a timeout. --- * actions: Update actions to use full SHAs (linkerd/linkerd2-proxy#885) * http: Parameterize normalize_ur::DefaultAuthority (linkerd/linkerd2-proxy#886) * http: Parameterize the HTTP server (linkerd/linkerd2-proxy#887) * opencensus: rewrite span exporter using async/await (linkerd/linkerd2-proxy#789) * Update http::Insert to use `Param` (linkerd/linkerd2-proxy#889) * Update crate dependencies (linkerd/linkerd2-proxy#892) * stack: Make the router fallible (linkerd/linkerd2-proxy#888) * Track stack metrics within failfast (linkerd/linkerd2-proxy#891) * outbound: Avoid building balancers when no concrete name (linkerd/linkerd2-proxy#890) * inbound: Cache HTTP gateways per destination (linkerd/linkerd2-proxy#893) * Reorganize the gateway crate (linkerd/linkerd2-proxy#897) * Bias HTTP detection towards availability (linkerd/linkerd2-proxy#894) * inbound: Use ALPN to determine transport header (linkerd/linkerd2-proxy#895) * detect: Return unknown protocol on detection timeout (linkerd/linkerd2-proxy#896) * Extract protocol detection into the gateway crate (linkerd/linkerd2-proxy#898)
This release changes HTTP protocol detection to prevent timeout errors in two ways: 1. HTTP detection no longer blocks until a newline is read. We've reverted to relying on a single read to make a determination. 2. Detection timeouts are no longer terminal. When a timeout is encountered, we continue forwarding the connection as an opaque TCP connection. These changes may lead to false-negatives--we may fail to detect some HTTP streams--but it should prevent many avoidable detection errors. This release also makes improvements for multicluster gateways, improving caching so that profile lookups are only performed once per target service. Diagnostic `stack_*` metrics have been moved so that they track underlying services, ignoring fail-fast. This should help us get better insights into services that are in failfast. Finally, the opencensus exporter has been improved to ensure that trace events are flushed if the trace buffer is not filled within a timeout. --- * actions: Update actions to use full SHAs (linkerd/linkerd2-proxy#885) * http: Parameterize normalize_ur::DefaultAuthority (linkerd/linkerd2-proxy#886) * http: Parameterize the HTTP server (linkerd/linkerd2-proxy#887) * opencensus: rewrite span exporter using async/await (linkerd/linkerd2-proxy#789) * Update http::Insert to use `Param` (linkerd/linkerd2-proxy#889) * Update crate dependencies (linkerd/linkerd2-proxy#892) * stack: Make the router fallible (linkerd/linkerd2-proxy#888) * Track stack metrics within failfast (linkerd/linkerd2-proxy#891) * outbound: Avoid building balancers when no concrete name (linkerd/linkerd2-proxy#890) * inbound: Cache HTTP gateways per destination (linkerd/linkerd2-proxy#893) * Reorganize the gateway crate (linkerd/linkerd2-proxy#897) * Bias HTTP detection towards availability (linkerd/linkerd2-proxy#894) * inbound: Use ALPN to determine transport header (linkerd/linkerd2-proxy#895) * detect: Return unknown protocol on detection timeout (linkerd/linkerd2-proxy#896) * Extract protocol detection into the gateway crate (linkerd/linkerd2-proxy#898)
This release changes HTTP protocol detection to prevent timeout errors in two ways: 1. HTTP detection no longer blocks until a newline is read. We've reverted to relying on a single read to make a determination. 2. Detection timeouts are no longer terminal. When a timeout is encountered, we continue forwarding the connection as an opaque TCP connection. These changes may lead to false-negatives--we may fail to detect some HTTP streams--but it should prevent many avoidable detection errors. This release also makes improvements for multicluster gateways, improving caching so that profile lookups are only performed once per target service. Diagnostic `stack_*` metrics have been moved so that they track underlying services, ignoring fail-fast. This should help us get better insights into services that are in failfast. Finally, the opencensus exporter has been improved to ensure that trace events are flushed if the trace buffer is not filled within a timeout. --- * actions: Update actions to use full SHAs (linkerd/linkerd2-proxy#885) * http: Parameterize normalize_ur::DefaultAuthority (linkerd/linkerd2-proxy#886) * http: Parameterize the HTTP server (linkerd/linkerd2-proxy#887) * opencensus: rewrite span exporter using async/await (linkerd/linkerd2-proxy#789) * Update http::Insert to use `Param` (linkerd/linkerd2-proxy#889) * Update crate dependencies (linkerd/linkerd2-proxy#892) * stack: Make the router fallible (linkerd/linkerd2-proxy#888) * Track stack metrics within failfast (linkerd/linkerd2-proxy#891) * outbound: Avoid building balancers when no concrete name (linkerd/linkerd2-proxy#890) * inbound: Cache HTTP gateways per destination (linkerd/linkerd2-proxy#893) * Reorganize the gateway crate (linkerd/linkerd2-proxy#897) * Bias HTTP detection towards availability (linkerd/linkerd2-proxy#894) * inbound: Use ALPN to determine transport header (linkerd/linkerd2-proxy#895) * detect: Return unknown protocol on detection timeout (linkerd/linkerd2-proxy#896) * Extract protocol detection into the gateway crate (linkerd/linkerd2-proxy#898) Signed-off-by: Jijeesh <jijeesh.ka@gmail.com>
This release changes HTTP protocol detection to prevent timeout errors in two ways: 1. HTTP detection no longer blocks until a newline is read. We've reverted to relying on a single read to make a determination. 2. Detection timeouts are no longer terminal. When a timeout is encountered, we continue forwarding the connection as an opaque TCP connection. These changes may lead to false-negatives--we may fail to detect some HTTP streams--but it should prevent many avoidable detection errors. This release also makes improvements for multicluster gateways, improving caching so that profile lookups are only performed once per target service. Diagnostic `stack_*` metrics have been moved so that they track underlying services, ignoring fail-fast. This should help us get better insights into services that are in failfast. Finally, the opencensus exporter has been improved to ensure that trace events are flushed if the trace buffer is not filled within a timeout. --- * actions: Update actions to use full SHAs (linkerd/linkerd2-proxy#885) * http: Parameterize normalize_ur::DefaultAuthority (linkerd/linkerd2-proxy#886) * http: Parameterize the HTTP server (linkerd/linkerd2-proxy#887) * opencensus: rewrite span exporter using async/await (linkerd/linkerd2-proxy#789) * Update http::Insert to use `Param` (linkerd/linkerd2-proxy#889) * Update crate dependencies (linkerd/linkerd2-proxy#892) * stack: Make the router fallible (linkerd/linkerd2-proxy#888) * Track stack metrics within failfast (linkerd/linkerd2-proxy#891) * outbound: Avoid building balancers when no concrete name (linkerd/linkerd2-proxy#890) * inbound: Cache HTTP gateways per destination (linkerd/linkerd2-proxy#893) * Reorganize the gateway crate (linkerd/linkerd2-proxy#897) * Bias HTTP detection towards availability (linkerd/linkerd2-proxy#894) * inbound: Use ALPN to determine transport header (linkerd/linkerd2-proxy#895) * detect: Return unknown protocol on detection timeout (linkerd/linkerd2-proxy#896) * Extract protocol detection into the gateway crate (linkerd/linkerd2-proxy#898) Signed-off-by: Jijeesh <jijeesh.ka@gmail.com>
1b97e57 changed HTTP detection to bias towards correctness. Instead of
performing a single read, the HTTP detection module reads data until a
newline is encountered so that a dispositive determination can be made.
Practically, this causes many non-HTTP protocols to fail with a
connection timeout because a newline is never transmitted.
This change reverts the HTTP detection logic to perform a single read
and to use that data to make a determination. A minimum of 14B is
required to handle a connection as HTTP.