kvserver: clean up proposals pipeline #116020
Labels
A-kv-replication
Relating to Raft, consensus, and coordination.
C-cleanup
Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
Forking off #115020 (comment).
I'm leaning towards a clearer design, like this:
Have a FIFO queue of in-flight proposals. They are added sequentially, and have consecutive "lease indices", so no fancy data structures are needed (except maybe some MPSC support for concurrent writes, like in the proposal buffer).
`refreshProposalsLocked` wants to push duplicates to it, so we may still need 2 separate queues.
When the lease applied index (LAI) state changes, wipe a prefix of this queue up to the new LAI (after reporting all the successfully applied commands). Decide what to do with the commands that are unapplied or not yet known to be applied:
When a replica is destroyed, or lease moves (figure out exactly what those conditions are), wipe any outstanding proposals in the queue and prevent inserts to it. Report ambiguous errors for them.
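The queue behaviour described so far (sequential inserts with consecutive lease indices, wiping an applied prefix, and closing the queue with ambiguous errors) could look roughly like the sketch below. Every name in it (`proposalQueue`, `push`, `trimApplied`, `close`, `errAmbiguous`) is hypothetical and not taken from the kvserver code; concurrency (the MPSC part) is left out, and the conditions that trigger `close` are still to be figured out.

```go
package main

import "errors"

// errAmbiguous is a hypothetical stand-in for the ambiguous-result error
// reported to proposers whose outcome is unknown.
var errAmbiguous = errors.New("result is ambiguous")

// errClosed is returned when inserting into a closed queue.
var errClosed = errors.New("proposal queue is closed")

// proposal is a simplified, hypothetical in-flight proposal.
type proposal struct {
	cmdID string      // command ID, used only for sanity checks
	lai   uint64      // lease applied index, assigned by push
	done  func(error) // reports the outcome to the proposer
}

// proposalQueue is a FIFO of in-flight proposals with consecutive LAIs,
// i.e. items[i].lai == firstLAI + i.
type proposalQueue struct {
	firstLAI uint64 // LAI of items[0], or the next LAI to assign if empty
	items    []*proposal
	closed   bool
}

func newProposalQueue(nextLAI uint64) *proposalQueue {
	return &proposalQueue{firstLAI: nextLAI}
}

// push appends p and assigns it the next consecutive LAI.
func (q *proposalQueue) push(p *proposal) error {
	if q.closed {
		return errClosed
	}
	p.lai = q.firstLAI + uint64(len(q.items))
	q.items = append(q.items, p)
	return nil
}

// trimApplied removes the prefix of proposals with LAI <= appliedLAI and
// returns it, so that the caller can report the applied commands and decide
// what to do with the rest.
func (q *proposalQueue) trimApplied(appliedLAI uint64) []*proposal {
	if appliedLAI < q.firstLAI {
		return nil // nothing in the queue is covered by appliedLAI
	}
	n := appliedLAI - q.firstLAI + 1
	if n > uint64(len(q.items)) {
		n = uint64(len(q.items))
	}
	k := int(n)
	trimmed := q.items[:k]
	q.items = q.items[k:]
	q.firstLAI = appliedLAI + 1
	return trimmed
}

// close wipes all outstanding proposals, reports an ambiguous error for
// each of them, and rejects any further inserts.
func (q *proposalQueue) close() {
	q.closed = true
	for _, p := range q.items {
		p.done(errAmbiguous)
	}
	q.items = nil
}
```

With this shape, a queue created with `newProposalQueue(10)` hands out LAIs 10, 11, 12 to three pushes, and `trimApplied(11)` returns the first two proposals and leaves the third in the queue.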
Not sure if the `proposals` map is needed in the first place. The above data structure is searchable by LAI in O(1) since LAI is strictly +1 incremental. The map would only be needed if we additionally needed to search by proposal ID, but I don't think we do. We would use the command ID only for sanity checks, to make sure that a search in the queue by LAI finds the correct command.
We do need to track multi-proposal proposals though (those that update LAI and repropose). From the above, we have the invariant that an in-flight proposal is always in the queue, and always exactly once, so we can store all this information right in the queue. Or we could factor out some bits that are shared across multiple reproposals.
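To illustrate why the map may be unnecessary, here is a minimal sketch of an O(1) lookup by LAI that uses the command ID only as a sanity check. The types and names (`inflight`, `laiQueue`, `lookup`, `errNotInQueue`) are assumptions made for this example, and reproposal tracking is not modelled.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotInQueue is a hypothetical sentinel for LAIs outside the queue.
var errNotInQueue = errors.New("no in-flight proposal at this LAI")

// inflight is a simplified, hypothetical queue entry.
type inflight struct {
	cmdID string
	lai   uint64
}

// laiQueue holds proposals with consecutive LAIs starting at firstLAI,
// so the entry for a given LAI sits at index lai - firstLAI.
type laiQueue struct {
	firstLAI uint64
	items    []*inflight
}

// lookup finds the proposal at the given LAI in O(1) and verifies that its
// command ID matches the expected one.
func (q *laiQueue) lookup(lai uint64, cmdID string) (*inflight, error) {
	if lai < q.firstLAI || lai-q.firstLAI >= uint64(len(q.items)) {
		return nil, errNotInQueue
	}
	p := q.items[lai-q.firstLAI]
	if p.cmdID != cmdID {
		return nil, fmt.Errorf("LAI %d holds command %q, expected %q", lai, p.cmdID, cmdID)
	}
	return p, nil
}
```

The index arithmetic only works because LAIs in the queue are consecutive; any design that leaves holes in the queue would need a different lookup.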
Do not touch the proposals state (and proposals queue) from `refreshProposalsLocked` at all. All it needs to do is scan the queue and insert slow proposals into the proposal buffer. This nicely separates concerns. We may replace the queue scanning with a better heap-based approach, so that we don't have to scan the entire queue all the time (as we do now).
Jira issue: CRDB-34385
Epic CRDB-39898