-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Docker state updaters #658
Comments
this should work: unless you're proposing a more detailed view? |
@wileyj this is good, but doesn't integrate with docker itself. what i'm aiming for is to have the container keep the docker daemon updated on the node's status -- where it should normally be 'healthy'. if the node starts having issues but hasn't fatally crashed, there's no way for the docker daemon to work out whether it should attempt to reboot the node or whatnot. the node should keep its docker status updated even while it's running, because even though the container can throw errors (say, it can't connect to the Bitcoin node it's using), these aren't shown in docker. to docker, the container seems happy and fine when in fact it's not and alerts may need to happen and Docker should attempt to autorepair it. |
|
there is this: https://docs.docker.com/engine/reference/builder/#healthcheck but i'm not convinced this should be added to the API due to how the API works (a flapping API would be a very very bad thing). You're welcome to PR a change with this to elicit more comments, but it'd probably be a tough sell. |
@SyAsteria feel free to make this change to your Dockerfile, but I'm going to close this for this repo for now. |
I actually think this is a good idea as long as the health check is properly configured. However, there is a caveat to consider which may lower the effectiveness of the health check I talk about at the bottom. If the health check fails to resolve N times with an interval of X seconds, the container being restarted from the docker daemon wouldn't result in any further loss of availability since there already was none. And by doing so it would help bubble the problem up to the surface. From what we've seen usually when the API is down but doesn't crash, it stays down. We've been caught by this issue multiple times and a healthcheck would be the right solution to that problem. Additionally, a HEALTHCHECK directive in the dockerfile will not interfere with anyone that runs the container in k8s; k8s users will still need to create a k8s health check for the API otherwise there won't be one enabled at all. The caveat is that the API can take a long time to boot depending on how it's configured. For example, in mainnet the API can import a TSV file containing all the event data of the blockchain, as well as having to import BNS data afterwards. This can take upwards of 2 hours in mainnet because of the size of the blockchain, and will increase as the chain grows unless @zone117x figures out a faster way of importing the data in the future. So the health check configured in the dockerfile file would have to account for that by setting a high Alternatively @SyAsteria, if you're running the API directly in Docker the latest Docker client should have health check options available. So even without the HEALTHCHECK directive in the dockerfile, you can add your own health check on the fly for any container when executing |
Is your feature request related to a problem? Please describe.
I'd like the API to report more detailed status updates to the Docker engine when running. Docker has a built-in status display so the daemon running them knows their current health. A container may indeed be running, but perhaps is not healthy, and needs rebooting. There is no way for the current implementation to report to the daemon if it runs into a problem and needs rebooting. The only way a container would be able to force itself to reboot currently is to crash itself on purpose to attempt to get the Docker daemon to reboot it.
This is kinda stupid, so we should add state updates of both 'healthy' and 'unhealthy' to the containers, allowing for more reliable management and spinups.
The text was updated successfully, but these errors were encountered: