-
Notifications
You must be signed in to change notification settings - Fork 814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The consul.catalog.nodes_up should reflect the # of nodes actually up #2018
Conversation
… so it is now checking the serfHealth for each service node. Also track the actual status of the nodes in the service.
Hi Mike, Thanks for the continued PRs on the agent! It looks like this PR might have caused the mock tests to start failing for this check.
I think the API key failure maybe a flakey test. @remh thoughts? |
Actually, I think the problem with this one is that I didn't update the mock. Fixing that now. |
@irabinovitch okay, now I think it is due to the problem with the unit tests. :/ |
@irabinovitch Can you help me out here? It looks like the unit tests are failing for something totally unrelated to my changes., |
Thanks for looking into this @mtougeron, the tests that were failing were flaky, I've restarted them and they pass now. |
Thanks @olivielpeau @mtougeron. Apologies for the delay in responding, been on the road a bit. |
Ya! Thanks! Now to get it merged. :) |
@talwai can you look at this one ? |
Looks good @mtougeron , thanks once again. I'm going to add some cosmetic changes on top of your work and we can merge it in for our next bugfix release |
Cool, sounds good! Thanks. |
These changes were merged via #2130 |
The
consul.catalog.nodes_up
should reflect the # of nodes actually up so it is now checking the serfHealth for each service node. Technically, I think theconsul.catalog.nodes_up
should reflect the # of nodes passing the health checks but for backwards compatibility I left it with just the nodes that are running that are part of the service.I also added 3 new stats:
consul.catalog.nodes_passing
,consul.catalog.nodes_warning
, &consul.catalog.nodes_critical
so that you can trend the actual # of nodes in each state for the service.Future enhancement planned: Adjust the service checks to use the
/v1/health/service/<service>
endpoint instead of/v1/health/state/<state>
since/v1/health/state/<state>
only returns results for services with aCheck
defined. This means that services that are based on if the node is up/down won't have proper checks from the agent.