Get upstream_connection_options configurable to do workaround for the flaky behavior of envoy #3214

Closed
tapih opened this issue Dec 17, 2020 · 3 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/needs-triage Indicates that an issue needs to be triaged by a project contributor. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments


tapih commented Dec 17, 2020

I deployed a Go HTTP client that sends requests periodically, about once every 10 seconds, to L7 endpoints exposed through Envoy. It checks that the endpoints are alive as part of end-to-end testing.

While running it, I hit the same error as the one reported in the issue about Envoy.
This comment says that the error is related to keepalive, and this comment says that the issue is resolved by configuring tcp_keepalive under upstream_connection_options.

I think it would be better to make the upstream_connection_options fields configurable, because some developers seem to have run into this issue.
Any thoughts on this?

@tapih tapih added the kind/feature and lifecycle/needs-triage labels Dec 17, 2020
@youngnick
Member

In general, we try not to just lift and copy Envoy structs into HTTPProxy, for three reasons:

  • Envoy sometimes changes the way fields or structures in the protobufs work, and we want to avoid needing to track that
  • Ideally, we want to make it so that you don't need to know Envoy to use Contour
  • Sometimes Envoy fields can have unexpected interactions in the reverse proxy case (as opposed to the mesh proxy case that is the default use case for Envoy), and we want to make it hard for these unexpected interactions to trip you up. The best example of this for me is #1493 ("Requests can be misrouted due to HTTP/2 Connection Coalescing under certain scenarios"), where HTTP/2 connection reuse and the way we were using the HTTPConnectionManager combined so that you could complete a TLS handshake with one domain and then get routed to another, which was a problem for certificate authentication. This is not an issue when you are running Envoy as a service proxy sidecar, as you simply aren't using the proxy in the same way.

That said, we already set TCP keepalives for the listener itself (see #2633 and #2652 for greater configurability of it), but we don't currently for the upstream.

I think that adding TCP keepalive configurability to the upstreams seems to make sense, and we could probably reuse the same keepalive configuration struct for the listener as well, which would help with #2652.

binoue added a commit to binoue/contour that referenced this issue Dec 31, 2020
This change allows users to tune envoy's keepalive settings.

Fixes projectcontour#2652
Fixes projectcontour#3214

Signed-off-by: binoue <banji-inoue@cybozu.co.jp>
binoue added a commit to binoue/contour that referenced this issue Dec 31, 2020
…ettings

This change allows users to tune envoy's keepalive settings.

Fixes projectcontour#2652
Fixes projectcontour#3214

Signed-off-by: binoue <banji-inoue@cybozu.co.jp>

The Contour project currently lacks enough contributors to adequately respond to all Issues.

This bot triages Issues according to the following rules:

  • After 60d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the Issue is closed

You can:

  • Mark this Issue as fresh by commenting
  • Close this Issue
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 23, 2023
@github-actions github-actions bot closed this as not planned Jan 26, 2024