Use infinite restarts for xsnippet-api.service #61

Merged: 1 commit, Dec 23, 2024

@malor malor commented Dec 22, 2024

systemd's defaults are fairly strict: it allows at most 5 restarts within 10s before it gives up on the service. We would prefer the service to recover automatically after transient failures, even if it takes longer than that.

E.g. yesterday, I found that xsnippet-api had been down since the last machine reboot because it could not fetch the Auth0 key on start, but the same thing could happen with database connection as well.
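
For illustration, one way to get unlimited restarts is to turn off systemd's start rate limiting and request a restart on every exit. This is a minimal sketch of the relevant directives, not necessarily the exact line this PR adds; the `RestartSec` value is an assumed example.

```ini
[Unit]
Description = XSnippet API
After = network.target network-online.target
# Disable start rate limiting. With the defaults (DefaultStartLimitIntervalSec=10s,
# DefaultStartLimitBurst=5) systemd gives up after 5 starts within 10 seconds.
StartLimitIntervalSec = 0

[Service]
# Restart the service whenever it exits, regardless of the exit status.
Restart = always
# Assumed value: pause between attempts so a failing service does not spin.
RestartSec = 5
```

With rate limiting disabled, systemd keeps retrying until the dependency that caused the failure (the Auth0 key endpoint or the database) becomes reachable again.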

@malor malor requested a review from ikalnytskyi December 22, 2024 10:29
@@ -2,6 +2,7 @@
Description = XSnippet API
After = network.target network-online.target
Member

> but the same thing could happen with database connection as well.

I'd suggest adding PostgreSQL to both Wants= and After=, so that at least after a reboot we preserve the proper ordering. Or maybe even Requires=, if we don't want xsnippet-api to be started when PostgreSQL could not start.
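
A sketch of what this suggestion could look like in the `[Unit]` section, assuming the database unit is named `postgresql.service` (the actual name can differ between distributions):

```ini
[Unit]
Description = XSnippet API
# Ordering only: wait for these units if they are being started as well.
After = network.target network-online.target postgresql.service
# Weak dependency: pull PostgreSQL in, but still start xsnippet-api if it fails.
Wants = postgresql.service
# Stronger alternative: do not start xsnippet-api if PostgreSQL fails to start.
# Requires = postgresql.service
```

After= only controls ordering, while Wants= and Requires= only pull the other unit in, which is why the two kinds of directives are used together.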

Member Author

Yeah, that sounds good to me! I'm just not sure whether the PostgreSQL unit actually waits until the process is ready to accept connections. So we might still need infinite (or, at the very least, extra) restarts as defense in depth.

Member

If it's linked with systemd, I'd expect it to report status "ready" once it's ready to accept connections.
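
For context, systemd only waits for such a readiness report when the database unit declares `Type=notify`. A minimal sketch along the lines of the example unit in the PostgreSQL documentation; the paths are illustrative, and distribution packages may ship a differently structured unit:

```ini
[Service]
# systemd treats the service as started only after the server sends READY=1
# via sd_notify(), which PostgreSQL does when built with systemd support.
Type = notify
User = postgres
# Illustrative paths; adjust to the local installation.
ExecStart = /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data
```

Units ordered with After= on such a service are started only once the notification has arrived.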

Member Author

@malor malor Dec 23, 2024

Ah, interesting! Didn't expect that! https://github.com/postgres/postgres/blob/578a7fe7b6f8484f6d7caa2fda288abb3fe87aa0/src/backend/postmaster/postmaster.c#L2317-L2325 confirms there is integration with systemd, indeed.

Let me add an explicit dependency on postgresql.service then.

@malor malor requested a review from ikalnytskyi December 22, 2024 21:44
@malor malor merged commit ae50993 into master Dec 23, 2024
4 checks passed
@malor malor deleted the infinite-restarts branch December 23, 2024 06:57