fix: use Envoy's default for validate_clusters to fix breaking routes when some backend clusters don't exist #21587
Initial review LGTM! Once we introduce a flag I'm guessing we'll want to have a golden smoke test to prove it out but the general shape of these changes makes sense to me 👍🏻
Overall LGTM, just a few small comments! Approved to unblock
Oh, one more question @ndhanushkodi, though happy to do this in a follow-up PR: I'm assuming we need some public docs updates as well? Just realized these changes are exclusively go/proto docs.
@zalimeni I originally thought to add docs as a followup but it was easy enough so I just went ahead and added here.
Suggestion to match PR 9699.
Approving on behalf of consul-docs.
`ValidateClusters is false by default and configures whether Envoy proxies will validate clusters in a route. If
set to true and any clusters in the route do not exist, the route table will not load. If set to false, the
route table will load and routing to a non-existent cluster will result in a 404. See
[Envoy docs](https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/route/v3/route.proto#envoy-v3-api-field-config-route-v3-routeconfiguration-validate-clusters)
for more details. `,
`Controls whether the clusters the route table refers to are validated. The default value is false. When set to false and a route refers to a cluster that does not exist, the route table loads and routing to a non-existent cluster results in a 404. When set to true and a route refers to a cluster that does not exist, the route table will not load. For more information, refer to
[HTTP route configuration in the Envoy docs](https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/route/v3/route.proto#envoy-v3-api-field-config-route-v3-routeconfiguration-validate-clusters). `,
Copying my suggestion from the other PR
… when some backend clusters don't exist (#21587)
Description
The validate_clusters option in Envoy's route configuration says:
"An optional boolean that specifies whether the clusters that the route table refers to will be validated by the cluster manager. If set to true and a route refers to a non-existent cluster, the route table will not load. If set to false and a route refers to a non-existent cluster, the route table will load and the router filter will return a 404 if the route is selected at runtime. This setting defaults to true if the route table is statically defined via the route_config option. This setting default to false if the route table is loaded dynamically via the rds option. Users may wish to override the default behavior in certain cases (for example when using CDS with a static route table)."
We load the route table dynamically via RDS, but we override the default and explicitly set validate_clusters to true. This means that when a cluster a route is supposed to point to doesn't exist, the route can fail to reach any of its backends. This case can be triggered if you have a router -> resolver where the resolver has backends on different peers or WAN-federated backends, and you add a route to a backend that doesn't exist. The non-existent backend causes the existing backends to fail. I was not able to trigger this case in a single-cluster setup, but with a peered backend it can be triggered.
Because the traffic doesn't just blackhole but rather returns a 503, letting the route table load seems to be the desired behavior, rather than making all other routing paths within that route fail due to a missing cluster. This is similar to the conclusion that was reached within the Jira ticket.
This PR removes the code that overrides the default value of this validate_clusters option.
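To make the difference concrete, here is a minimal, stdlib-only sketch of the semantics quoted above. It is a toy model, not Consul's or Envoy's code: `Route`, `RouteTable`, `Load`, and `Dispatch` are hypothetical names invented for illustration, and prefix matching stands in for Envoy's real route matching.

```go
package main

import (
	"fmt"
	"strings"
)

// Route maps a path prefix to a named cluster (hypothetical type).
type Route struct {
	Prefix  string
	Cluster string
}

// RouteTable is a simplified stand-in for an Envoy route configuration.
type RouteTable struct {
	ValidateClusters bool
	Routes           []Route
}

// Load models route table loading: with validation on, a single unknown
// cluster rejects the whole table; with validation off, it always loads.
func Load(rt RouteTable, known map[string]bool) error {
	if !rt.ValidateClusters {
		return nil
	}
	for _, r := range rt.Routes {
		if !known[r.Cluster] {
			return fmt.Errorf("route %q refers to unknown cluster %q", r.Prefix, r.Cluster)
		}
	}
	return nil
}

// Dispatch models request routing against a loaded table. ok is false when
// the first matching route points at an unknown cluster (Envoy would answer
// with an error response here) or when no route matches.
func Dispatch(rt RouteTable, known map[string]bool, path string) (cluster string, ok bool) {
	for _, r := range rt.Routes {
		if strings.HasPrefix(path, r.Prefix) {
			if known[r.Cluster] {
				return r.Cluster, true
			}
			return "", false
		}
	}
	return "", false
}

func main() {
	known := map[string]bool{"web": true}
	routes := []Route{{Prefix: "/a", Cluster: "web"}, {Prefix: "/b", Cluster: "missing"}}

	strict := RouteTable{ValidateClusters: true, Routes: routes}
	lenient := RouteTable{ValidateClusters: false, Routes: routes}

	fmt.Println("strict load error:", Load(strict, known))
	fmt.Println("lenient load error:", Load(lenient, known))

	c, ok := Dispatch(lenient, known, "/a/items")
	fmt.Println("lenient /a/items:", c, ok)
	c, ok = Dispatch(lenient, known, "/b/items")
	fmt.Println("lenient /b/items:", c, ok)
}
```

In this model the strict table never loads, so the healthy `/a` route is lost along with the broken `/b` route, while the lenient table keeps serving `/a` and only fails requests that select `/b`.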
Testing & Reproduction steps
Links
PR Checklist