Incorrect Garbage Collection Due to MinSyncedSeq Determination with Lamport Clocks #723
Labels: bug 🐞 (Something isn't working), hard 🧑‍🔬 (Difficult to deal with or requires research), protocol changed 📝 (Whether the protocol has changed), sdk ⚒️
What happened:
1. Inaccurate Garbage Collection using Lamport Clocks
In Yorkie, the Garbage Collection (GC) feature is designed to eliminate tombstone nodes once every peer has acknowledged the deletion operation (meaning these nodes are no longer referenced by remote peers). Whether all peers have received a deletion operation is decided from the minimum synced sequence (minSyncedSeq), which is taken as the smallest Lamport clock among the synced changes. However, while Lamport clocks enable total ordering, they cannot distinguish concurrency from causality, which leads to inaccurate GC decisions.
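The limitation can be illustrated with a minimal Python sketch. It is not Yorkie's implementation: `Ticket`, `min_synced`, and the bookkeeping dictionaries are hypothetical names chosen for illustration, and the ordering by `(lamport, actor)` is an assumption about how tickets are compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Ticket:
    """Hypothetical time ticket, ordered by (lamport, actor)."""
    lamport: int
    actor: str

    def __str__(self) -> str:
        return f"{self.lamport}@{self.actor}"


def min_synced(synced_by_client: dict[str, Ticket]) -> Ticket:
    """Smallest ticket among the latest tickets recorded for each client."""
    return min(synced_by_client.values())


# Both clients are synced up to lamport 3, then edit concurrently.
delete_by_a = Ticket(4, "A")  # A's local deletion, not yet seen by B
insert_by_b = Ticket(4, "B")  # B's concurrent insertion

# The server records the latest ticket reported by each client.
synced = {"A": delete_by_a, "B": insert_by_b}
seen_by_b = {insert_by_b}  # changes B has actually applied at lamport 4

threshold = min_synced(synced)
gc_allowed = delete_by_a <= threshold      # the tombstone removed at 4@A looks collectable
b_has_deletion = delete_by_a in seen_by_b  # yet B never received 4@A
```

Here `threshold` is `4@A` and `gc_allowed` is true even though `b_has_deletion` is false: the total order over tickets says nothing about which changes a peer has actually received.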
Consider the scenario where concurrent edits occur:
1. […] `abd`, with clients A and B both synced up to the change `3@B`.
2. […] `bd`, and B adds `c` between `b` and `d`.
3. […] `4@B` as synced.
4. […] `4@B` and pushing `4@A`. The server registers `4@A` as synced.
5. […] `4@B`, triggering GC since min synced seq is `4@A`, and the `removedAt` matches `4@A`.
6. […] `1` after `b`.
7. […] `5@B` and pulling `4@A`.
8. […] `5@B`, but encounters an error since it can't find `2@A`.

The bug stems from B not yet syncing `4@A` in step 3 but marking it as synced, since we use Lamport clocks to determine the min synced sequence. Although `4@A < 4@B`, this doesn't guarantee that `4@A` has been synced, due to a limitation of Lamport clocks:
- `a -> b` => `L(a) < L(b)`
- `L(a) < L(b)` => `a -> b` or `a || b` (concurrency)

2. Proposed Solution Ideas
2-1. Use of Vector Clocks
Replace Lamport clocks with vector clocks to manage the concurrency of changes. Each vector clock is an array of counters, one per client, representing the ordering of changes.
- […] `[2,3]`.
- […] `[2,4]` for B's sync.

- `V(a) = V(b)` when `a_k = b_k` for all k
- `V(a) < V(b)` when `a_k <= b_k` for all k and `V(a) ≠ V(b)`
- `V(a) || V(b)` if `a_i < b_i` and `a_j > b_j` for some i, j

This approach ensures accurate concurrency tracking but may pose memory challenges, as all time tickets need to be managed using vector clocks.
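The comparison rules above can be sketched in a few lines of Python. This is only an illustration: `Order` and `compare` are hypothetical names, the vectors are assumed to be lists with the same client order, and the sample values are arbitrary.

```python
from enum import Enum


class Order(Enum):
    EQUAL = "equal"
    BEFORE = "before"
    AFTER = "after"
    CONCURRENT = "concurrent"


def compare(va: list[int], vb: list[int]) -> Order:
    """Compare two vector clocks that list their counters in the same client order."""
    if len(va) != len(vb):
        raise ValueError("vector clocks must cover the same set of clients")
    some_less = any(x < y for x, y in zip(va, vb))
    some_greater = any(x > y for x, y in zip(va, vb))
    if some_less and some_greater:
        return Order.CONCURRENT  # a_i < b_i and a_j > b_j for some i, j
    if some_less:
        return Order.BEFORE      # a_k <= b_k for all k, and the vectors differ
    if some_greater:
        return Order.AFTER
    return Order.EQUAL           # a_k == b_k for all k


ordered = compare([2, 3], [2, 4])     # every counter <=, one strictly smaller
concurrent = compare([3, 3], [2, 4])  # one counter larger, one smaller
```

Unlike a Lamport comparison, the `CONCURRENT` result makes it explicit that neither change is known to the other side.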
2-2. Hybrid Use of Vector and Lamport Clocks
Introduce a hybrid approach by managing vector clocks for changes' concurrency in the document replica and using Lamport clocks for element ID generation.
This approach balances the benefits of vector clocks for concurrency tracking and Lamport clocks for efficient element ID generation. However, it may still incur increased memory usage as the number of clients grows.
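One way the hybrid could drive the GC decision is sketched below. Everything here is an assumption made for illustration, not the design proposed in this issue: element IDs are modeled as `(lamport, actor)` pairs, each peer's vector is a dictionary holding the highest Lamport value it has seen from every actor, and `can_collect` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementID:
    """Hypothetical element ID built from a Lamport timestamp and its issuing actor."""
    lamport: int
    actor: str


def min_version_vector(vectors: list[dict[str, int]]) -> dict[str, int]:
    """Per-actor minimum over every peer's vector; a missing entry counts as 0."""
    actors = set().union(*vectors)
    return {a: min(v.get(a, 0) for v in vectors) for a in actors}


def can_collect(removed_at: ElementID, peer_vectors: list[dict[str, int]]) -> bool:
    """True only if every peer has seen the removing actor's clock reach removed_at."""
    if not peer_vectors:
        return False
    floor = min_version_vector(peer_vectors)
    return floor.get(removed_at.actor, 0) >= removed_at.lamport


removed_at = ElementID(4, "A")

# A has pulled B's change, but B has not yet received A's deletion.
before_pull = [{"A": 4, "B": 4}, {"A": 3, "B": 4}]
# B has now pulled 4@A and made one more change of its own.
after_pull = [{"A": 4, "B": 4}, {"A": 4, "B": 5}]

blocked = can_collect(removed_at, before_pull)
allowed = can_collect(removed_at, after_pull)
```

In this sketch the tombstone stays until B's vector covers A's deletion, while element IDs remain a single Lamport pair rather than a full vector.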
2-2-1. Handling Concurrent Edits in Text Delete
With the adoption of the 2-2 approach, the use of the `latestCreatedAtMap` to handle text delete concurrency can be eliminated.
- […] `latestCreatedAtMap`.
- […] `latestCreatedAtMap`): as-is

2-3. Exploration of Other Approaches
What you expected to happen:
GC should occur only when all peers have received the delete operation.
How to reproduce it (as minimally and precisely as possible):
Execute the following test code corresponding to the scenario:
Anything else we need to know?:
Environment:
- Yorkie version (`yorkie version`): v0.4.9