Disabling harvester_activity_handler does not disable harvester checks #361
Looking at the code, it seems the keep_alive_monitor is hardcoded to assume a harvester: https://github.com/martomi/chiadog/blob/main/src/notifier/keep_alive_monitor.py#L30
After a deeper peek, the root cause runs a bit deeper: there's only one handler that emits KEEPALIVE events. So the "proper" solution would be to add KEEPALIVE events to the other handlers when they see activity proving health for that service. However, in the scope of not spamming about a harvester being down when one intentionally doesn't exist, I'd say the keep_alive_monitor needs to be initialized empty instead of always assuming a harvester is present. That way, the first keepalive seen from a service starts monitoring for it. I'll have a poke at making a PR.
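The "initialized empty" idea can be sketched roughly as below. This is an illustrative standalone sketch, not chiadog's actual `KeepAliveMonitor` API; all names here are hypothetical:

```python
import time
from threading import Lock


class LazyKeepAliveMonitor:
    """Sketch: start tracking a service only after its first keepalive,
    instead of hardcoding a harvester. Names are illustrative, not
    chiadog's real classes."""

    def __init__(self, threshold_seconds: float = 600):
        self._threshold = threshold_seconds
        self._last_seen = {}  # service name -> unix timestamp of last keepalive
        self._lock = Lock()

    def process_keepalive(self, service: str) -> None:
        # The first keepalive from a service primes monitoring for it.
        with self._lock:
            self._last_seen[service] = time.time()

    def check_stale(self) -> list:
        # Only services that have ever sent a keepalive can be reported
        # stale, so a deliberately absent harvester never raises an alert.
        now = time.time()
        with self._lock:
            return [
                name for name, ts in self._last_seen.items()
                if now - ts > self._threshold
            ]


monitor = LazyKeepAliveMonitor(threshold_seconds=600)
monitor.check_stale()            # [] -- nothing is monitored yet
monitor.process_keepalive("wallet")
monitor.check_stale()            # [] -- wallet keepalive just arrived
```

The trade-off discussed below still applies: a service that never sends a single keepalive is never monitored at all.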
You’re on the right path :) I added the config option for disabling handlers (relatively) recently in version 0.7.1 (here). And indeed, that config only applies to the actual log handlers but doesn’t affect the behavior of the keep_alive_monitor. Note that there’s also a long-standing thread on this here: #93. Happy to look into a PR if you come up with a proposal to solve this. I think KEEPALIVE events can only come from either the signage point handler or the harvester handler? And we should keep defaulting to the harvester, since in many setups where chiadog is installed there’s only a harvester and no full node.
The way I'm looking at the keepalive setup, any service posting KEEPALIVE events would prime the system for that service. Relying on that fully, however, means a service that was already down when chiadog started is never reported as down. I'll have a go at reading the enabled handlers from the config and pre-populating based on that, which means that with a default config all keepalives are expected. This of course means I'll need to ensure all services do have a keepalive event sender :D

Thanks for the thread context! I think it's plausible to manufacture keepalive signals for any service, as long as there's a log message we know of that proves it's functioning. For the wallet, for example, we could parse a suitable log message and check that the difference between its timestamp and the time we read it is less than, say, 30 seconds, which gives a reasonable estimate that the wallet is healthy and up to speed.
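That timestamp-freshness check could look something like this. The sample log line and function name are made up for illustration; real Chia wallet log lines may be formatted differently:

```python
from datetime import datetime, timedelta

# Hypothetical wallet log line -- the real format may differ.
SAMPLE = "2023-01-15T10:30:05.123 wallet wallet_server : INFO peak updated"


def is_fresh(log_line: str, now: datetime,
             max_lag: timedelta = timedelta(seconds=30)) -> bool:
    """Treat the service as healthy if the log line's timestamp is
    within `max_lag` of the moment we read it."""
    stamp = log_line.split()[0]  # first token is the timestamp
    logged_at = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f")
    return now - logged_at <= max_lag


# Read ~15 seconds after the line was written: within the 30 s budget.
print(is_fresh(SAMPLE, datetime(2023, 1, 15, 10, 30, 20)))  # True
```

A line older than the budget would fail the check and therefore would not be turned into a keepalive signal.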
Agreed!
@martomi:
If you can share any insight into what direction of evolution fits the style of the project, I can make more informed proposals. :-)
Thank you for the detailed write-up, really appreciate the thoughtfulness about the direction and future implications!

Handler Groupings

The mapping … Where … And … We could also go for a more nested approach in the config, as in …

Config

I’m 100% onboard with handling the tech debt in the YAML configuration as a potential first step. In the past, we’ve been careful to maintain backwards compatibility for folks by adding optional configs with sane defaults. The problem is that those defaults are applied at the point of usage of the respective config values; it should really happen centrally when the config is loaded. Note that, already today, disabling a handler via the per-handler enable flag should work. E.g. this should work.
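For reference, a hypothetical sketch of such a config, based on the per-handler `enable` flag mentioned above; the exact key layout may differ from chiadog's actual `config.yaml`:

```yaml
# Hypothetical excerpt -- key names may differ from chiadog's real config.
harvester_activity_handler:
  enable: false   # harvester intentionally absent, skip its checks
wallet_added_coin_handler:
  enable: true
```

With defaults applied centrally at config-load time, a handler section omitted entirely from the file would fall back to the same sane default as an explicit entry.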
Direction of Evolution

You may have noticed that the project has mostly been in a feature freeze for the past 1+ year. I’ve been lurking around and making sure we’re merging and releasing bugfixes so as not to degrade the existing experience. In general I’m happy to discuss and support community-driven evolution, as long as the changes are thoughtful and (most importantly) don’t increase the maintenance burden. 😄

It should also be noted that Chia Network itself is investing resources into official Chia farming monitoring tools (video from JM), which puts the long-term usefulness of this project into question.

Trying to be maximally transparent so you can calibrate how much you would like to invest in the architecture & extensibility here. Otherwise, I’m happy to move in the proposed direction!
Thanks for the details and disclosure. I think there's enough here for me to continue. It sounds like we're pretty aligned, so I'll start working on the config migration first, with an emphasis on keeping as much backwards compatibility as is sane.
Note to self, learned the hard way during testing: we also need a keepalive check verifying that the log file itself is still flowing. Hooking up to a log file that isn't updating at all creates a very eerie scenario: it does trigger keepalive alerts, but those are misleading, because it's not really a service that has stopped working; logging itself has "failed".
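One simple way to detect a stalled log file is to check its modification time before evaluating per-service keepalives. A minimal sketch, assuming a POSIX-style filesystem where mtime updates on append; the function name and threshold are illustrative:

```python
import os
import time


def logfile_is_flowing(path: str, max_age_seconds: float = 300) -> bool:
    """Sketch: treat the log file as healthy only if it was modified
    recently. Guards against the case where every service looks silent
    because logging itself has died, not the services."""
    try:
        age = time.time() - os.path.getmtime(path)
    except OSError:
        return False  # missing or unreadable file counts as not flowing
    return age <= max_age_seconds
```

If this returns False, the alert should be about the log pipeline itself rather than about any individual service being offline.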
Even if the handler `harvester_activity_handler` is set to `enable: false`, a harvester is still expected:

> Your harvester appears to be offline! No events for the past 600 seconds.

The node runs with `services="node farmer-only wallet"`, i.e. it's not running a harvester on purpose since there are no local plots. Log level is `DEBUG` or `INFO`; have not tested other levels.

Relevant config sections:

Environment:
- Container image: `ghcr.io/chia-network/chia:latest` (Chia `1.6.2`)
- chiadog `v0.7.5`, as present in `sha256:4c19849657d2fc78c91257f1c18c8a18ac6a99d094e4d6c3e530300d3905a291`