Description
Right now, if a ChainIndex implementation returns an error from handle_rollback (or from handle_tx, if we do not then see a rollback to a point before the failure), we halt that index until the server restarts. When the server does restart, we reset the index to its default start point and replay from there. This works, but in production it means extremely long recovery times, even when the error was transient.
We want to recover from transient errors without this reset.
To do that, we need to:
- track the last K points processed by each index
- track the last transaction processed by each index (we currently only track the block)
- if an index was halted before, try rolling forward (or rolling back) from the last point it successfully processed at startup
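The tracking requirements above could be captured in a small per-index record. The following is only a sketch: `Point`, `IndexProgress`, `record_tx`, and `resume_points` are hypothetical names, not existing types in this codebase, and the sketch assumes a point is a slot number plus a 32-byte block hash.

```rust
use std::collections::VecDeque;

/// A chain point (assumed shape: slot number plus block hash).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Point {
    pub slot: u64,
    pub hash: [u8; 32],
}

/// Hypothetical per-index progress record.
#[derive(Debug)]
pub struct IndexProgress {
    /// Maximum number of points to remember (K).
    k: usize,
    /// Most recently processed points, oldest first, at most `k` entries.
    recent: VecDeque<Point>,
    /// Position of the last transaction processed within the tip block.
    last_tx: Option<u64>,
}

impl IndexProgress {
    pub fn new(k: usize) -> Self {
        // Always keep at least the tip, even if K is configured as 0.
        let k = k.max(1);
        Self { k, recent: VecDeque::with_capacity(k), last_tx: None }
    }

    /// Record that the transaction at `tx_index` in the block at `point`
    /// was processed successfully.
    pub fn record_tx(&mut self, point: Point, tx_index: u64) {
        if self.recent.back() != Some(&point) {
            // First transaction seen for a new block: evict the oldest
            // point if we already hold K of them.
            if self.recent.len() == self.k {
                self.recent.pop_front();
            }
            self.recent.push_back(point);
        }
        self.last_tx = Some(tx_index);
    }

    /// The last point this index processed, if any.
    pub fn tip(&self) -> Option<&Point> {
        self.recent.back()
    }

    /// The last transaction processed within the tip block, if any.
    pub fn last_tx(&self) -> Option<u64> {
        self.last_tx
    }

    /// Points to offer when resuming, newest first.
    pub fn resume_points(&self) -> Vec<Point> {
        self.recent.iter().rev().cloned().collect()
    }
}
```

For this to help across restarts, the record would have to be persisted alongside the index's own data. How and where is left open here.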
At startup, the ChainIndexer will find the index with the earliest tip and request a FindIntersect from that index's tip (or the K points before that tip). It will then process new transactions or rollbacks as normal.
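One possible shape for the startup selection, as a sketch only: `intersect_candidates` is a hypothetical helper, and each index is represented just by its recent points, newest first. It also assumes that an index with no recorded points forces the default start point. The issue does not specify that case.

```rust
/// A chain point (assumed shape: slot number plus block hash).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Point {
    pub slot: u64,
    pub hash: [u8; 32],
}

/// Pick the points to offer in the FindIntersect request.
///
/// `indexes` holds, for each index, its recent points ordered newest first,
/// so element 0 is that index's tip. Returns the recent points of the index
/// with the earliest tip, or an empty list when some index has no recorded
/// progress (assumption: the caller then falls back to the default start).
pub fn intersect_candidates(indexes: &[Vec<Point>]) -> Vec<Point> {
    if indexes.iter().any(|pts| pts.is_empty()) {
        return Vec::new();
    }
    indexes
        .iter()
        .min_by_key(|pts| pts[0].slot) // earliest tip wins
        .cloned()
        .unwrap_or_default() // no indexes at all: nothing to offer
}
```

Offering the K points behind the tip as well as the tip itself lets the intersection land on an older shared point if the tip is no longer on the node's chain.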
When an index sees a transaction from a point before its tip, it will check that point against the last K points it processed:
- If the point is older than the index's oldest slot, it will ignore the point.
- If the point is equal to the index's oldest slot and the hash does not match, it will halt (this means we are on a fork more than K blocks long).
- If the point is newer than the index's oldest slot and the hash does not match, it will roll back to the point before the mismatch, and then process the new transaction.
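The rules above can be sketched as a pure decision function. `Action` and `classify` are hypothetical names. Two cases the rules leave open are filled with labeled assumptions: a point whose slot and hash both match a tracked point is treated as already processed, and a point at a slot with no tracked block is treated as a mismatch.

```rust
/// A chain point (assumed shape: slot number plus block hash).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Point {
    pub slot: u64,
    pub hash: [u8; 32],
}

/// What an index should do with a transaction at an incoming point.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// Point is at or beyond the tip: not covered by these rules.
    NotBeforeTip,
    /// Point is older than anything tracked: ignore it.
    Ignore,
    /// Assumption: slot and hash match a tracked point, so skip it.
    AlreadyProcessed,
    /// Mismatch at the oldest tracked slot: fork longer than K blocks.
    Halt,
    /// Mismatch inside the window: roll back, then process the transaction.
    RollbackThenProcess { rollback_to: Point },
}

/// `recent` is the index's last K processed points, oldest first.
pub fn classify(recent: &[Point], incoming: &Point) -> Action {
    let (oldest, tip) = match (recent.first(), recent.last()) {
        (Some(o), Some(t)) => (o, t),
        _ => return Action::NotBeforeTip, // nothing tracked yet
    };
    if incoming.slot >= tip.slot {
        return Action::NotBeforeTip;
    }
    if incoming.slot < oldest.slot {
        return Action::Ignore;
    }
    if recent.iter().any(|p| p == incoming) {
        return Action::AlreadyProcessed;
    }
    if incoming.slot == oldest.slot {
        return Action::Halt;
    }
    // Roll back to the newest tracked point strictly before the mismatch.
    match recent.iter().rev().find(|p| p.slot < incoming.slot) {
        Some(p) => Action::RollbackThenProcess { rollback_to: p.clone() },
        None => Action::Halt, // unreachable given the checks above
    }
}
```

Keeping the decision separate from the side effects (the actual rollback and the call into handle_tx) would make each branch easy to unit test.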
NB: we are not adding this to Milestone 1, but are treating it as a fast-follow.