Segfault in 1.8.x and 1.7.5 with Consul Connect #8430
edit: disregard! edited into main comment
The strange log message is reported as a bug in #7512, and fixed in master. I'll get the fix backported into a 1.8.x release. The panic looks like it is on dereferencing …
From what I can tell, it's related to this bug in Envoy: envoyproxy/envoy#9682. We started setting the …
Until envoyproxy/envoy#9682 is fixed, we'll have to go back to the less efficient configuration where we omitted the …
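The comment previews above are truncated, but envoyproxy/envoy#9682 concerns Envoy's `set_node_on_first_message_only` ADS flag, so presumably that is the setting being discussed. Below is a minimal sketch, assuming that flag, of the two `ads_config` variants; the field names are real Envoy v2 API fields, but the surrounding structure (and the `local_agent` cluster name) is a simplified illustration, not Consul's actual bootstrap template:

```go
// Sketch of the two ads_config variants a bootstrap generator might emit,
// embedded as JSON fragments for illustration.
package main

import "fmt"

// With the flag set, Envoy sends DiscoveryRequest.Node only on the first
// message of an xDS stream. Per envoyproxy/envoy#9682, once Envoy has done
// that it never sends Node again for the life of its process, even on fresh
// streams opened against a restarted server.
const adsConfigWithFlag = `{
  "api_type": "GRPC",
  "set_node_on_first_message_only": true,
  "grpc_services": [{"envoy_grpc": {"cluster_name": "local_agent"}}]
}`

// The less efficient but safe variant: omit the flag, so Envoy repeats
// Node on every DiscoveryRequest.
const adsConfigWithoutFlag = `{
  "api_type": "GRPC",
  "grpc_services": [{"envoy_grpc": {"cluster_name": "local_agent"}}]
}`

func main() {
	fmt.Println("affected by envoy#9682:", adsConfigWithFlag)
	fmt.Println("workaround:", adsConfigWithoutFlag)
}
```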
…ating envoy bootstrap config (#8440) When consul is restarted and an envoy that had already sent DiscoveryRequests to the previous consul process sends a request to the new process it doesn't respect the setting and never populates DiscoveryRequest.Node for the life of the new consul process due to this bug: envoyproxy/envoy#9682 Fixes #8430
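To make the restart failure mode concrete, here is a minimal sketch of the stream lifecycle the commit message describes. The types are simplified stand-ins, not the real go-control-plane or Consul types; it only illustrates why an unguarded dereference of `DiscoveryRequest.Node` would panic on a restarted server:

```go
package main

import "fmt"

// Node is a stand-in for Envoy's core.Node, carrying the proxy's identity.
type Node struct{ ID string }

// DiscoveryRequest is a stand-in for the xDS request message. With
// set_node_on_first_message_only, Node is only populated on the first
// message of a stream.
type DiscoveryRequest struct {
	Node    *Node
	TypeURL string
}

// handleStream caches Node from the first request of a stream, as xDS
// servers typically do.
func handleStream(reqs []DiscoveryRequest) {
	var node *Node
	for _, req := range reqs {
		if req.Node != nil {
			node = req.Node // normally set by the stream's first message
		}
		// Due to envoyproxy/envoy#9682, a stream opened against a
		// *restarted* server never carries Node, so node stays nil and an
		// unguarded node.ID dereference would be the reported segfault.
		if node == nil {
			fmt.Printf("no node on stream: would panic serving %s\n", req.TypeURL)
			continue
		}
		fmt.Printf("serving %s for node %s\n", req.TypeURL, node.ID)
	}
}

func main() {
	// First consul process: Envoy's first message carries Node.
	handleStream([]DiscoveryRequest{
		{Node: &Node{ID: "web-sidecar"}, TypeURL: "type.googleapis.com/envoy.api.v2.Cluster"},
		{TypeURL: "type.googleapis.com/envoy.api.v2.ClusterLoadAssignment"},
	})
	// After the agent restarts: Envoy reconnects but, because of the bug,
	// never populates Node again for the life of its own process.
	handleStream([]DiscoveryRequest{
		{TypeURL: "type.googleapis.com/envoy.api.v2.Cluster"},
	})
}
```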
I've just upgraded a test Kubernetes cluster from 1.7.1 to 1.8.1, and my agent instances are crashing over and over with a segfault.
I'm not sure what call is triggering this. There are a number of services in the cluster using Consul Connect, but without the consul agents running it's hard to isolate the cause of this crash: everything is crashing and restarting because there's no agent available.
edit:
Managed to pin this down a bit more.
I can reliably reproduce the crash when there is any Consul Connect-enabled service running (e.g. an Envoy proxy operating as a sidecar, pulling config from the local agent via xDS) and the agent restarts.
I'm using consul-helm to set up my agents and the connect injector to set up Envoy sidecars.
I've tested this with Consul 1.7.1, 1.7.3, 1.7.4, 1.7.5, and 1.8.1, each with the latest supported Envoy version (so 1.13.1 and 1.14.2, specifically the envoyproxy/envoy-alpine image variant). Only 1.7.5 and 1.8.1 crash when the agent is restarted; 1.7.4 is fine.
The only commit I can see between 1.7.4 and 1.7.5 that looks relevant is #8266.
On Consul versions that don't crash, and on crashing versions that are already running when the Connect service starts, I do see a log entry. I'm not sure whether this is relevant: the consul-helm changelog has a note about an incorrect health check that will fail on 1.8.0, but I'm using consul-k8s:0.18.0 for all of these tests.