Improve state consistency around observers and next/handled event sequence number for when multiple partitions are working and for some reason the server goes down #1682
Labels
observers
Issues related to event sequence observers
reliability
Capabilities related to guaranteeing reliability in a running system typically related to up-time
Today we store the `NextSequenceNumber` on the `ObserverState`. If you have something using `AppendMany()` with multiple partitions and for some reason the server dies in the middle of handling the events, it will not know which events it has handled.
To remedy this situation we should do a couple of things:
State
Introduce a separate state (not inside `ObserverState`) where we store the `EventSequenceNumber` of any event being handled while it is being handled. This means we store the event's sequence number before we handle it. After we've handled it, we can remove this state.
Special state for the StateMachine
During startup / subscription of an observer, we would need to look at this specialized observer state and see if it holds any such entries. If it does, we would enter a `CatchUpPartitions` (or similar) type of state where we catch up the partitions that need to be caught up. Once they are caught up, we would enter the `Routing` state.
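To make the proposal concrete, here is a minimal, language-neutral sketch in Python; it is not the actual implementation. `InFlightStore`, `Observer`, `mark`, `clear` and the in-memory dict are all hypothetical names invented for illustration; only `CatchUpPartitions` and `Routing` come from the description above. It also assumes that re-handling an event that was interrupted mid-flight is acceptable (at-least-once handling), which the issue does not state.

```python
class InFlightStore:
    """Hypothetical state kept separately from the observer state.

    Maps partition -> sequence number of the event currently being handled.
    A real store would be durable; a dict stands in for it here.
    """

    def __init__(self):
        self.handling = {}

    def mark(self, partition, sequence_number):
        self.handling[partition] = sequence_number

    def clear(self, partition):
        self.handling.pop(partition, None)


class Observer:
    def __init__(self, store, handler):
        self.store = store
        self.handler = handler
        self.state = "Subscribing"
        self.handled = []

    def handle(self, partition, sequence_number):
        # Record the sequence number BEFORE handling the event.
        self.store.mark(partition, sequence_number)
        self.handler(partition, sequence_number)  # may fail midway
        # Remove the record only once handling has completed.
        self.store.clear(partition)
        self.handled.append((partition, sequence_number))

    def subscribe(self):
        # On startup, any leftover entry is an event that never completed.
        pending = dict(self.store.handling)
        if pending:
            self.state = "CatchUpPartitions"
            for partition, sequence_number in sorted(pending.items()):
                self.handle(partition, sequence_number)
        self.state = "Routing"


def crash_on_b(partition, sequence_number):
    # Simulates the server dying while handling partition "b".
    if partition == "b":
        raise RuntimeError("simulated crash")


store = InFlightStore()
first_run = Observer(store, crash_on_b)
first_run.handle("a", 1)
try:
    first_run.handle("b", 2)
except RuntimeError:
    pass

# What survives the crash: the event that was in flight.
pending_after_crash = dict(store.handling)

# A restarted observer catches up the affected partition, then routes.
restarted = Observer(store, lambda partition, sequence_number: None)
restarted.subscribe()
```

In this sketch the leftover entry for partition `b` is what tells the restarted observer which partition needs catching up; partition `a` completed normally and left nothing behind.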