-
Notifications
You must be signed in to change notification settings - Fork 14.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Scheduler and triggerer crashes in daemon mode when statsd metrics are enabled #35181
Conversation
I see no big problem with it and I see why it could be needed, but I would love someone else to take a look as well @Taragolis @uranusjr @ferruzzi ? Seems to me that this could possibly be solved in a more generic way - I do recall similar open-telemetry problem year ago or so (that has since been fixed), so I guess this is really only a problem with statsd, and maybe the solution here is "good enough" ? |
This doesn't flush the existing metrics when you create a new instance? For example, graphs still have their data from before the flush?? |
I believe there are virtually no metrics before. Daemonisation happens right at the very start of any component that enables it, and I think what this problem mainly solves is that it seems Statsd integration when imported and initialized saves some states and possibly starts some thread that then are "lost" when daemonizing. The deamonisation does a few things:
I think what happens here - is that Statsd manages to open a socket / file descriptor before daemonization happnes and that socket gets closed when forking/daemonising. When deamonization happens, the parent/intermediate processes exit right after the fork happens. So even if Statsd managed to open the socket before daemonization - it gets closed right after (and by reinitialising Statsd in this PR we are reopening it in the forked process) But also writing this comment makes me think: If we manage to find out the resource (likely socket) that Statsd opens and get hang of file descriptor for that - there is a better way of handling it https://daemonize.readthedocs.io/en/latest/ - you can add such file descriptor to "keep_fds" list of descriptors that should NOT be closed when forking and their ownership passed to the forked process. So @pavansharma36 - maybe you can investigate what Statsd does under the hood to make it stop working - find the descriptor it opens at initialization and pass it in keep_fds parameter? |
@potiuk I believe that is unnecessary complicating things as not all process are opening stats socket before daemonize, |
Deleted my comment by mistake - but you can likely see it in notification. Generally I am ok with reinitializing as long as we do not have any side effects. Have you looked at those abd can confirm that such reinitialization does not have any ? @pavansharma36 ? |
Yes @potiuk I've verified all cases and works without any issue |
Which cases? out of curiosity? Do you know exactly what causes the problem in the first place? |
related: #33897
Fix services crashing in daemon mode when statsd is enabled.