[feature] Provide healthcheck endpoints for uptime monitoring #2707

Closed
NoraCodes opened this issue Mar 1, 2024 · 3 comments · Fixed by #2783

@NoraCodes
Contributor

NoraCodes commented Mar 1, 2024

Is your feature request related to a problem?

I am using https://nine9s.cloud to monitor my GoToSocial instances. Like many uptime monitoring services, it offers the option to use HEAD instead of GET requests to check whether the service is up, since HEAD requests use less bandwidth. However, GoToSocial responds with 405 Method Not Allowed to HEAD requests in most cases (per #2055).

Describe the solution you'd like.

Ideally, GoToSocial would return 200 OK for HEAD requests, even if only to / specifically.

A dedicated "upness" URL, or a documented API route that has minimal impact and requires no database queries, would be equally useful. I wasn't able to find one in the documentation.

Additional context.

No response

@NoraCodes NoraCodes added the enhancement New feature or request label Mar 1, 2024
@daenney
Member

daenney commented Mar 6, 2024

I think we should implement this as one or more separate endpoints. The "cloud native" ecosystem seems to be standardising on /livez and /readyz, with roughly the following meaning:

  • /livez: is this process alive?
  • /readyz: is this thing in a state where it should be receiving traffic?

(The z suffixes are just to reduce the chance of naming collisions with an actual API route and have no meaning otherwise).

/livez could be handy since it ought to fail to respond if GoToSocial is somehow broken (like deadlocked) but would otherwise return that it's up. I think that would also solve this request, and we could have that respond to both HEAD and GET. It shouldn't return anything more than an HTTP status code, so even on just a GET it wouldn't result in any data being transferred.

On the /readyz endpoint we could perform a simple and very cheap DB query, like fetching the local instance account, to check we're actually in a functioning state (i.e. we can handle incoming federation requests and perform our own).

Having those endpoints would also be useful for Docker (using the HEALTHCHECK instruction), as that adds some useful output to the Docker CLI (note the "(healthy)" in the STATUS column):

[root@eu1 ~]# docker ps
CONTAINER ID   IMAGE                          COMMAND                  CREATED       STATUS                 PORTS     NAMES
xxxxxxxxxxxx   xxxxxxxxxxxx/xxxxxxx:v1.0.0   "python -m xxxxxxx.a…"   5 weeks ago   Up 5 weeks (healthy)             xxxxxxx

For folks using an orchestration platform this may also prove useful as those often have facilities to perform these types of checks.

@tsmethurst
Contributor

It's much easier for us to implement a health check endpoint than change our router to serve HEAD requests. @NoraCodes would you be happy with changing this FR to request that instead? If so, I think this is something we can easily add for 0.15.0...

@NoraCodes NoraCodes changed the title from "[feature] Serve HTTP HEAD Requests on /" to "[feature] Provide healthcheck endpoints for uptime monitoring" Mar 13, 2024
@NoraCodes
Contributor Author

Done! I would be perfectly happy with that.
