gnatsd: -health and -rank options #435
NB: the CI checks will fail because the corresponding glycerine/go-nats
e5fffa0 to a3c5adf
The PR for go-nats that will let you manually build this: https://github.com/nats-io/go-nats/pull/265 Status update:
427238f to 7385177
The dependency, https://github.com/nats-io/go-nats/pull/265, is ready for review. Travis can work on this PR again once that one is merged.
Status update: this is in good shape now. It doesn't have the nats-top part done yet, but the election and health monitoring agent are tested and polished.
ff74b9c to a4e017e
options to control health monitoring.

The InternalClient interface offers a general plugin interface for running internal clients within a gnatsd process.

The -health flag to gnatsd starts an internal client that runs a leader election among the available gnatsd instances and publishes cluster membership changes to a set of cluster health topics. The -beat and -lease flags control how frequently health checks are run, and how long leader leases persist.

The health agent can also be run standalone as healthcmd. See the main method in gnatsd/health/healthcmd.

The -rank flag to gnatsd adds priority rank assignment from the command line. The lowest ranking gnatsd instance wins the lease on the current election.

The election algorithm is described in gnatsd/health/ALGORITHM.md and is implemented in gnatsd/health/health.go.

Fixes nats-io#433
Closing this due to age.
We need this health check feature.
Can you help me understand what you are trying to do and what problem this solves?
Sorry, this got lost in my todos. At the time I was testing the nats-server helm chart for deployment on Kubernetes without using NATS operators.
This is about half done. A bunch of polish (especially
around all the verbose and ugly logging the election
produces) and some tests are still left to do.
Still, one can see the basic architecture, and it's that basic integration
architecture and approach that I wanted to get feedback on early. In
particular, I'd appreciate someone vetting the additions to server.go;
that's the integration point and the most likely place where I've
missed something important.
TODO:
[ ] get ServerRank actually configured, and see if ServerLocation() can be eliminated or finished out.
[ ] write up the leader election algorithm used here. It is ultra simple but still needs documenting.
[ ] write tests for the election
[ ] finish getting the health info that nats-top is used to into a regular NATS topic
[ ] hook up that output to nats-top
eventual commit message, proposed:
Cluster health monitoring and
leader election and priority
assignment are made available
with the -health and -rank
flags to gnatsd.
The InternalClient interface offers
plugin capability for running
internal clients.
Allows nats-top to monitor the health
of the whole cluster.
Fixes #433 and nats-io/nats-top#31