Overview of the Issue
We were seeing performance issues during deployments in one of our mesh clusters, which has about 5k participants. Registration time would spike to 10+ seconds, causing errors and delays downstream. The spikes corresponded to the raft/FSM thread being saturated, potentially for minutes at a time. This was surprising, since our overall update rate seemed pretty low.
I grabbed a pprof to look into it a bit more, but unfortunately a profile of a single routine on a busy server didn't give the greatest information. Still, reflection stood out, and that gave us somewhere to look.
Reproduction Steps
To confirm what we were seeing here, I wrote a quick test case to run a snapshot of our data through this code path.
This confirmed that most of the registration time is spent in reflection.
Consul info for both Client and Server
Consul 1.14
Operating system and Environment details
Linux, amd64.