
feat(api) add a config readiness endpoint #8256

Closed
rainest wants to merge 6 commits from the feat/config-ready branch

Conversation

@rainest (Contributor) commented Jan 3, 2022

Summary

Adds a config readiness endpoint. This is designed to allow the Ingress controller to detect when Kong has restarted and needs to receive configuration again. It should also be useful for load balancers that want to avoid sending requests through an instance before it is ready to process them.

Full changelog

  • Add a config readiness endpoint and include it on the status listen. When db="off", it returns a 200 if Kong has loaded or received a configuration and applied it, or a 503 if not (a minimal sketch of such a handler follows this list).
  • Compute hashes when loading configuration from the environment, to distinguish between user-supplied configuration and the default blank configuration.
  • Set the database mode to the strategy for Kong endpoint tests. It's unclear if this was the earlier intent, since the previous strategy effectively did nothing as best I can tell. Doing so is necessary for the new endpoint tests to function.
  • Use the config hash when updating the stream configuration. Unclear if this is necessary, since we presumably can't check its SHM value until the stream and HTTP SHMs are unified (the admin API only looks at the HTTP subsystem SHM). We send it though, so may as well 🤷
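As a minimal sketch (not the PR's exact code), a DB-less readiness handler along these lines could back the endpoint described above; the route path, module layout, and response messages are assumptions, while declarative.has_config() is the check discussed later in this thread:

```lua
-- Hypothetical sketch of a DB-less readiness handler. Only
-- declarative.has_config() comes from the PR; everything else is illustrative.
local declarative = require "kong.db.declarative"

return {
  ["/status/ready"] = {
    GET = function(self, db, helpers)
      if kong.configuration.database ~= "off" then
        -- with a database backing Kong, configuration is always reachable
        return kong.response.exit(200, { message = "ready" })
      end

      -- true only after a declarative config has been loaded or received
      -- and applied (the shdict key checked by has_config, see below)
      if not declarative.has_config() then
        return kong.response.exit(503, { message = "no configuration available" })
      end

      return kong.response.exit(200, { message = "ready" })
    end,
  },
}
```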

Resolved review thread on kong/api/routes/health.lua (outdated)
@mflendrich

blocked on #8224

@RobSerafini (Contributor)

@mflendrich - I believe LMDB will not ship in 2.8 as the default db_cache, which would remove the block you raised. Can you confirm?

@rainest (Contributor, Author) commented Feb 9, 2022

@bungle assuming Rob's comment is correct, can you or another Gateway team member review this for the current DB-less implementation? The external interface would be consistent regardless of the backend, so I'm not concerned that we'd need a breaking change to that in a future version.

Review thread on kong/api/routes/health.lua (outdated):
end
-- unintuitively, "true" is uninitialized. we do always initialize the shdict key
-- after a config loads, this returns the hash string
if not declarative.has_config() then
Member:

Question: does loading the config mean that the router and run loops have been updated to use the config? Is there a delay? Could the user be surprised by seeing 404s when this endpoint returns a 200?

Contributor Author:

It means declarative.load_into_cache() has completed. If you're running strict consistency, the router and such will be updated since they'll rebuild at request time. If eventual, it's rebuilt whenever that fires (maybe? not sure if that system behaves differently in DB-less).

For the controller's purposes that's fine, since we mainly just care about knowing when we need to push it again.

For general purpose config availability, I don't know if there's a good approach for knowing when the config is available, since it's built per-worker--I can set an additional SHM key, but it won't be guaranteed that all workers have rebuilt. Is there anything we can use to inform us when all workers have completed processing of the declarative flip_config event?
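Purely as an illustration of what such plumbing could look like (the thread below notes it does not exist today), each worker could register for the event and record its own completion; the "kong" shared dict and the key name are assumptions:

```lua
-- Hedged sketch: each worker records that it has handled the declarative
-- "flip_config" event. The shm and key prefix are illustrative assumptions.
kong.worker_events.register(function(data)
  ngx.shared.kong:set("declarative_config:flipped:" .. ngx.worker.pid(), true)
end, "declarative", "flip_config")
```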

Contributor Author:

c3fef049d868fd0d1dc98847ea8db6b090e6561a sorta does this, but it only means that at least one worker has rebuilt using the new config.

Member:

I don't think there is any plumbing for what I'm requesting, is there @bungle?

Member:

I confirm that what @rainest says is correct: his flag indicates that the declarative config has been loaded into the node's memory, but it can take a while until all the workers get their routers (and plugin iterators) updated. With strict consistency set, requests "will wait" if a Router is not up to date, so for all intents and purposes the newly exposed flag will indicate that "Kong will use the new config, even if it means waiting".

With eventual consistency, the gateway itself "does not care" about exact availability. If something else cares, then I think the answer to "have the router and run loops been updated to use the config?" can't be a single bit. Consider that some customers update their declarative configurations very rapidly. Workers are on config A. Config B arrives, gets processed by one worker, which sets the flag. Workers gradually load config B. While this is happening, config C arrives. And so on. In that scenario with frequent updates, a single bit will always be false - a number of workers will always be out of date.

Options:

  • Change the question to "are the router and run loops updated for this specific config (or a more recent one)?". We could answer with a bit for that one.
  • Change the question to "how many workers are updated to this specific config?". The answer to that one would be a number, or a pair of numbers (2 out of 16).
  • Change the question to "which config does every worker have?". The answer would be a list of n configuration identifiers, probably their hashes.

I don't think we have any of the plumbing for that at the moment.

Contributor Author:

Do we have something that provides the worker counts/a list of their PIDs? It's more complicated, but it should be viable to restructure the value of DECLARATIVE_CONFIG_READY_KEY from a bit to N bits, and then in the endpoint loop over N to confirm that all are set.

We're not really concerned with the latest config, just that configuration is not null, so the update stampede case isn't a problem. You just always set your worker entry to true on update (there's probably some way to make this only happen once using local worker memory to avoid the unnecessary SHM locks after the first update--is there a Lua equivalent of https://pkg.go.dev/sync#Once?) and don't worry about what specific hash you've loaded.
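A rough sketch of that idea follows; the module-local flag plays the role of sync.Once, and the shared dict, key prefix, and the source of the PID list are all assumptions:

```lua
-- Hypothetical sketch of the "N bits" approach: each worker marks itself
-- ready at most once, and the endpoint checks one key per worker PID.
local marked = false  -- module-local upvalue: a per-worker "run once" guard

local function mark_this_worker_ready()
  if marked then
    return  -- skip the shm lock entirely after the first update
  end
  ngx.shared.kong:set("declarative_config:ready:" .. ngx.worker.pid(), true)
  marked = true
end

local function all_workers_ready(worker_pids)
  for _, pid in ipairs(worker_pids) do
    if not ngx.shared.kong:get("declarative_config:ready:" .. pid) then
      return false
    end
  end
  return true
end
```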

Contributor Author:

ed5d5d736 updates the previous system to add a worker PID onto the end, so we set and then check a key per worker, along with a better test to confirm that it works as expected. We can't test actual startup conditions easily (at all?) AFAIK, but most of the moving parts are in the bit where workers mark their status.

Member:

I'm not the SME here so I'll defer the review to someone within the Gateway team.
Health endpoints are invoked by schedulers, orchestrators, load balancers, external health checkers, etc., meaning they should be very performant. The shm:gets acquire a lock across all workers, which can drastically affect performance.
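One hedged way to blunt that cost (names below are illustrative): since readiness, as defined here, never reverts once reached, a worker can cache a positive answer in plain Lua memory so that repeated probes skip the shared dict entirely:

```lua
-- Hypothetical mitigation for the shm locking concern: cache a positive
-- readiness result per worker so later health probes avoid shm:get.
local ready_cached = false

local function is_ready()
  if ready_cached then
    return true          -- no shm access, no lock contention
  end
  if ngx.shared.kong:get("declarative_config:ready") then
    ready_cached = true  -- readiness does not revert, so caching is safe
    return true
  end
  return false
end
```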

Resolved review thread on kong/api/routes/health.lua (outdated)
@rainest force-pushed the feat/config-ready branch 3 times, most recently from 36db71c to ba2980c on February 10, 2022 23:56
@rainest force-pushed the feat/config-ready branch 7 times, most recently from 53da411 to f0d5103 on February 11, 2022 01:04
@hbagdi (Member) commented Feb 18, 2022

I was about to say this approach looks good but then I realized another issue, maybe an irrelevant corner case.

Adds a config readiness endpoint. This is designed to allow the Ingress controller to detect when Kong has restarted (Kong/kubernetes-ingress-controller#2107) and needs to receive configuration again.

If a pod restarts for any reason, Kong will load the configuration that it has cached on disk, and this endpoint will return a 200. I think this doesn't solve the problem you originally set out to solve.
One could argue that all pods start up from a clean state and hence won't have any cached state - I do not find that to be true in practice.

It should also be useful for load balancers that want to avoid sending requests through an instance before it is ready to process them.

This is a fine use-case for this patch and I think we should accept it.

Now, how to solve the KIC issue? Thanks to #8214, the Admin API / response now contains the hash of the configuration. KIC can monitor the hash of all connected data-plane nodes, and send updates to nodes whose hash doesn't match up with what it expects it to be - I'm not sure how KIC would determine the hash it should expect from the data-plane node.

@rainest (Contributor, Author) commented Feb 18, 2022

If a pod restarts for any reason, Kong will load the configuration that it has cached on disk, and this endpoint will return a 200. I think this doesn't solve the problem you originally set out to solve.
One could argue that all pods startup from a clean state and hence won't have any cached state - I do not find that to be true in practice.

We do want this to return a 200 if config is loaded from disk; config is available then. Is this specific to hybrid data planes or should it work on full DB-less also? Container restarts do not wipe out the tempdir we use for the prefix, so if it does get saved there, it should be reloaded on restart. In practice, if we assassinate a DB-less container, its replacement comes online with an empty configuration.

Now, how to solve the KIC issue? Thanks to #8214, the Admin API / response now contains the hash of the configuration. KIC can monitor the hash of all connected data-plane nodes, and send updates to nodes whose hash doesn't match up with what it expects it to be - I'm not sure how KIC would determine the hash it should expect from the data-plane node.

This is part of why I'd have preferred to use this to solve that issue: using the configuration hash ends up being a leaky abstraction where you need to reimplement Kong internals outside to fully understand what's going on. We can sorta get partway there by comparing against the known "0"*32 unset value, but if that changes, it either breaks or you need version-dependent behavior downstream. The yes/no status code API response is an easier contract to maintain than the specifics of the hashing system.
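To make the leakiness concrete, this is roughly the check a downstream client would have to hard-code (the sentinel mirrors the "0"*32 value mentioned above, and is exactly the kind of internal detail that could change between versions):

```lua
-- Illustration only: deciding readiness from the reported hash means baking
-- Kong's "no config yet" sentinel into the client.
local EMPTY_CONFIG_HASH = string.rep("0", 32)

local function has_user_config(reported_hash)
  return reported_hash ~= nil and reported_hash ~= EMPTY_CONFIG_HASH
end
```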

@kikito (Member) commented Feb 21, 2022

@rainest :

Do we have something that provides the worker counts/a list of their PIDs?

Yes, the / Admin API endpoint provides the list of worker PIDs inside pids.workers. It is set here:

https://github.com/Kong/kong/blob/master/kong/api/routes/kong.lua#L71-L87
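As a hedged aside, the same per-worker bookkeeping could instead be keyed by ngx.worker.id() (0 through ngx.worker.count() - 1), which every worker already knows, avoiding the PID list entirely; the key names are assumptions:

```lua
-- Hypothetical variant keyed by worker id instead of PID.
local function mark_ready_by_id()
  ngx.shared.kong:set("declarative_config:ready:" .. ngx.worker.id(), true)
end

local function all_ready_by_id()
  for id = 0, ngx.worker.count() - 1 do
    if not ngx.shared.kong:get("declarative_config:ready:" .. id) then
      return false
    end
  end
  return true
end
```

One trade-off: worker ids are reused when a worker respawns, so a restarted worker would inherit its predecessor's "ready" key unless it clears it, whereas PID-based keys simply go stale instead.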

@dndx (Member) commented Feb 23, 2022

@rainest The functionality of this PR seems to be a duplicate of #8214. Do we still need it since #8214 is already merged?

@rainest force-pushed the feat/config-ready branch 5 times, most recently from a74a5f0 to ed5d5d7 on February 23, 2022 23:41
@hbagdi (Member) commented Feb 24, 2022

Is this specific to hybrid data planes or should it work on full DB-less also?

I'm not sure. @bungle or @dndx Can either of you confirm this?

@rainest The functionality of this PR seems to be a duplicate of #8214. Do we still need it since #8214 is already merged?

While they try to solve the same problem, their use is quite different.
When Kong is fronted by a load-balancer, the load-balancer must consider Kong healthy only if Kong has some configuration in it. A running Kong without any configuration is as good as a dead Kong - the client will get a 404 instead of a 503.
I think it makes a lot of sense to accept this patch and give users an option to opt into a more reliable source of health.

@dndx (Member) commented Feb 24, 2022

@hbagdi That is one possibility, but wouldn't the user actually want to check for the reachability of backend services in this case? Just having any config loaded is a pretty weak guarantee that Kong is actually "healthy"; there are a lot of things that could still have gone wrong.

I am considering the original problem @rainest needs to solve per this PR's description, which is KIC's config readiness detection, and it seems that #8214 already solves it. Maybe we should keep the problem scope confined for now.

@hbagdi (Member) commented Feb 24, 2022

@hbagdi That is one possibility, but wouldn't the user actually want to check for the reachability of backend services in this case?

They could and are free to do that, and they should, as an end-to-end monitoring check.
I believe that a user's desire to keep tabs on the health of Kong itself is a fair ask and it should be possible to know about the health of Kong without relying on any external services.
We already do have health endpoints, so I don't think this is new territory. What this endpoint provides is a more accurate health check than something like /status that responds as long as Nginx workers are around.

Just having any config loaded is a pretty weak guarantee that Kong is actually "healthy"; there are a lot of things that could still have gone wrong.

It is a stronger guarantee than not having any configuration at all.
And yes, loads can still go wrong; this is really about providing as accurate a health status as possible.

@rainest (Contributor, Author) commented Feb 24, 2022

Is this specific to hybrid data planes or should it work on full DB-less also?

It was originally designed for full DB-less. For this purpose they're functionally the same, since the code to actually load the config is shared and only the method for receiving new configs changes.

This is handled differently from #8214 for two reasons:

  • As noted in an earlier comment on this PR (#8256): external clients ideally should not make decisions based on internal state values, because the meaning of those may change and they may not provide a complete answer to the question--as seen here, a non-initial hash value does not imply that workers have loaded it.
  • LB readiness checks usually will not interpret anything beyond HTTP statuses. They cannot parse the status output or make decisions based on the hash value. This was intended for use by both KIC and naive readiness checkers.

I concur with Harry that this isn't a perfect metric for readiness (there is no such thing), but it's taking us past the /status response (always a 200 unless Kong is very broken) to further confirm that we're not in another state we know is not viable for routing traffic (if you've never received any configuration, you will not route anything).

Travis Raines and others added 6 commits March 8, 2022 12:43
Add a config readiness endpoint. When db="off", it returns a 200 if Kong
has loaded or received a configuration and applied it, or 503 if not.

Compute hashes when loading configuration from the environment, to
distinguish between user-supplied configuration and the default blank
configuration.

Test the config readiness endpoint. This requires actually using the
database strategy for Kong route tests and adding a utility server that
allows tests to directly set SHM values, to circumvent the lack of
normal config load behavior in test environments.
@rainest (Contributor, Author) commented Mar 15, 2022

@kikito @dndx checking on this, it hasn't received a review update in a while. Is there anything further that'd need to change to merge it?
