Endpoint monitoring relies on an expectation: SC expects to receive heartbeats from X endpoints within a timeout period, based on prior registration of those endpoints (see #15).
If we do not have a persistent list of endpoints, we will encounter the following (bad and inconsistent) scenario:
Opie starts the system, including 10 endpoints, SC and SP.
Opie does some ops work and restarts all the endpoints, SC and SP.
Due to a configuration issue, some of the endpoints fail to start.
SC and SP restart properly.
Since SC's list of endpoints was reset to zero on restart, the heartbeat indicator is green, no events report that any endpoint failed, and all seems to be well.
The only indication that there is a major problem is the small number showing the count of active endpoints.
Only if Opie remembers how many endpoints he needs to monitor, and only if he pays attention to that number, will he understand that the green indicator for the active endpoints hides a very big problem.
The number of expected endpoints in SP is reset to zero when SC is restarted.
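The scenario above can be reproduced with a small sketch. This is not SC's actual implementation: `HeartbeatMonitor`, `simulate_restart`, the JSON file store and the endpoint names are all hypothetical, and the heartbeat timeout is left out for brevity. It only illustrates how an in-memory list of expected endpoints reports green after a restart, while a persisted list reports the missing endpoints.

```python
import json
import tempfile
from pathlib import Path


class HeartbeatMonitor:
    """Hypothetical stand-in for SC's heartbeat tracking (illustration only)."""

    def __init__(self, store_path=None):
        # store_path=None models the behaviour described in this issue:
        # the expected-endpoint list lives only in memory.
        self.store_path = Path(store_path) if store_path else None
        self.expected = set()
        self.alive = set()  # endpoints heard from since this process started
        if self.store_path and self.store_path.exists():
            self.expected = set(json.loads(self.store_path.read_text()))

    def heartbeat(self, endpoint):
        # The first heartbeat registers the endpoint as expected.
        self.expected.add(endpoint)
        self.alive.add(endpoint)
        if self.store_path:
            self.store_path.write_text(json.dumps(sorted(self.expected)))

    def missing(self):
        return sorted(self.expected - self.alive)

    def indicator(self):
        return "red" if self.missing() else "green"


def simulate_restart(store_path):
    """Run 10 endpoints, restart everything, let only 7 come back."""
    endpoints = [f"endpoint-{i}" for i in range(1, 11)]

    before = HeartbeatMonitor(store_path)
    for name in endpoints:
        before.heartbeat(name)

    # Restart: a fresh monitor instance; three endpoints fail to start.
    after = HeartbeatMonitor(store_path)
    for name in endpoints[:7]:
        after.heartbeat(name)
    return after.indicator(), after.missing()


if __name__ == "__main__":
    print("in-memory:", simulate_restart(None))
    with tempfile.TemporaryDirectory() as tmp:
        print("persisted:", simulate_restart(Path(tmp) / "expected.json"))
```

With the in-memory list, the restarted monitor only knows about the seven endpoints that came back, so nothing is missing and the indicator stays green. With the persisted list, the three endpoints that failed to start are still expected, so the indicator turns red.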