After upgrading Nomad to 1.4.2, services perpetually register in consul (/PUT to local consul agent) #15265
Comments
Thanks for reporting, @usovamaria
Yeah, ironically we were trying to fix a case (#14917) where Nomad would continuously re-register services into Nomad, but it seems it's broken elsewhere now. Can you post the [...]
Hi @shoenig. Yep, we've seen this issue, so we were amazed by the [...] There's a [...]
Hey @usovamaria, so @jrasell and I both tried and failed to reproduce what you're seeing. To continue investigating we just shipped #15311 (trace logging around Consul service registrations), which should help pinpoint the underlying problem. If you have a chance, could you update one of the Nomad clients to 1.4.3 and enable trace logging?
Once we have the output from that we should be able to understand what's going on.
Hi, @shoenig. We upgraded Nomad to 1.4.3 and enabled tracing. It seems that Nomad continuously tries to replace the lan_ipv6 field with wan_ipv6, using the same IP address, even though the task itself does not change:
[...]
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
Nomad version and environment details
Nomad 1.1.2 -> 1.4.2
Consul 1.9.5
Ubuntu 20.04
Environment configuration:
Servers: 3 Nomad servers, Consul servers (services register here) + Consul servers with ACL.
Clients: a Nomad client and a Consul agent are installed.
Issue
After upgrading Nomad from 1.1.2 to 1.4.2 we saw that the rate of service registrations in Consul (using /PUT to the Consul agent) increased significantly according to metrics and logs.
We use the metric rate(consul_http_PUT_v1_agent_service_register_count) to monitor Consul agent API usage. Here is the graph based on this metric: [image]
We also noticed that the volume of Consul logs related to service registration increased, e.g.:
Nov 16 13:29:18 hostname consul[48446]: 2022-11-16T13:29:18.585+0300 [DEBUG] agent.http: Request finished: method=PUT url=/v1/agent/service/register from=[::1]:23594 latency=7.206663ms
The log volume correlates strongly with the number of service containers on a host.
In contrast, in our production environment, where Nomad was not upgraded, the /PUT rate to the Consul agent is normal (0-0.1 per minute).
We do not experience performance issues in the current environment (with ~3K running containers), but we are going to upgrade a production environment (with ~6K) and cannot predict the impact on Consul and Nomad performance. The only thing we noticed is that the number of goroutines on the Consul servers increased (~30K -> ~80K).
We assume that Nomad and Consul interaction during the process of service registration has changed.
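For anyone without the metric at hand, the register rate described above can also be estimated from the agent's DEBUG log. The following is a minimal sketch, assuming log lines shaped like the one quoted in this issue (an ISO timestamp with a numeric UTC offset followed by the agent.http "Request finished" message). The function name `register_puts_per_minute` is hypothetical and not part of Nomad or Consul.

```python
import re
from collections import Counter

# Matches Consul agent DEBUG lines like the one quoted in this issue.
# Assumption: the timestamp carries a numeric offset such as +0300.
REGISTER_RE = re.compile(
    r"(?P<minute>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}):\d{2}\.\d+[+-]\d{4} "
    r"\[DEBUG\] agent\.http: Request finished: "
    r"method=PUT url=/v1/agent/service/register\b"
)


def register_puts_per_minute(lines):
    """Count service-register PUT requests, bucketed by log minute."""
    counts = Counter()
    for line in lines:
        match = REGISTER_RE.search(line)
        if match:
            counts[match.group("minute")] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [
        "Nov 16 13:29:18 hostname consul[48446]: 2022-11-16T13:29:18.585+0300 "
        "[DEBUG] agent.http: Request finished: method=PUT "
        "url=/v1/agent/service/register from=[::1]:23594 latency=7.206663ms",
    ]
    print(register_puts_per_minute(sample))
```

Feeding it the agent log from an upgraded host and from a non-upgraded host gives two per-minute series that can be compared directly.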
Reproduction steps
1. Start with Nomad 1.1.2 and Consul 1.9.5.
2. Observe the /PUT rate to a local consul agent (consul_http_PUT_v1_agent_service_register_count) and the log rate in DEBUG mode.
3. Upgrade Nomad to 1.4.2 according to the official documentation.
Expected Behaviour
The service registration rate remains the same or increases only insignificantly after the Nomad upgrade.
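To compare the rate before and after an upgrade without a monitoring stack, a per-minute rate can be derived from two samples of a monotonically increasing counter such as consul_http_PUT_v1_agent_service_register_count. This is a simplified sketch, not the exact algorithm behind the rate() function used above, which also handles extrapolation over a window. The helper name `per_minute_rate` and the sample values are hypothetical.

```python
def per_minute_rate(first, second):
    """Average per-minute increase between two (unix_seconds, counter_value) samples."""
    (t0, v0), (t1, v1) = first, second
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    # A counter that went down was reset (for example by an agent restart),
    # so the increase is counted from zero.
    delta = v1 - v0 if v1 >= v0 else v1
    return delta * 60.0 / (t1 - t0)


if __name__ == "__main__":
    # Hypothetical samples taken two minutes apart.
    print(per_minute_rate((0, 100), (120, 160)))
```

Running it on samples taken before and after the upgrade gives two numbers that can be checked against the expectation stated above.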