Enable better Envoy debugging #2792
Labels
area/operational
Issues or PRs about making Contour easier to operate as a production service.
lifecycle/stale
Denotes an issue or PR has remained open with no activity and has become stale.
Please describe the problem you have
Original slack thread
The original motivation for this is to be able to trace back from a simple
503
response to the Envoy that served it, since generally it's logs contain a treasure trove of information about what happened.In the bad response, there is typically a
Server:[envoy]
and the slack thread is about whether this is configurable. @jpeach dug up this in the Envoy API: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/network/http_connection_manager/v3/http_connection_manager.proto#envoy-v3-api-field-extensions-filters-network-http-connection-manager-v3-httpconnectionmanager-server-nameIdeally this would serve the Envoy pod name, so that its logs can be quickly accessed.
It would be next-level cool if this could contain checksum information for the configuration the Envoy is programmed with that could be cross-correlated with the Contour logs to get the full Envoy configuration at the point of failure 😇
I'm thinking something like a
User-Agent
string with key-value strings.Server=[envoy-zxhkjh; EDS={hash}; RDS={hash}; ...]
Generally Contour's logs are painfully terse today, and the above would be extremely helpful.
The text was updated successfully, but these errors were encountered: