In #434 we enabled re-registration at each boot. However:
- We only try once, and abort if that attempt fails.
- We do not handle the "multiple machines rebooting all at the same time" scenario, so in contexts with hundreds of devices it won't scale well.
- We make a distinction between "live system" and "installed system", which adds complexity. Ideally we should retry consistently without having to think about the installation state.
One possible solution is to remove logic from the elemental-register code and rely on systemd instead, for example by defining StartLimitIntervalSec and StartLimitBurst.
However, this won't really cover the "submarine coming to surface after months" scenario, since systemd stops restarting a unit once the start limit is reached.
For that we may need to implement an exponential backoff so that we can keep trying indefinitely (at most once every 30 min?).
Discussing with @fgiudici, we thought about the need to decouple the registration state from the installation state.
In #434 we check for the existence of /run/initramfs/cos-state/state.yaml to determine whether we are re-registering or performing an initial registration.
If we can't find a way to make registration idempotent, then it would be better to explicitly maintain a registration state somewhere, for example in a file. The state could also keep track of the re-registration failure count, for example if we would like to implement an exponential backoff timer for retries.
This is a follow-up to #434.