Skip to content

Commit

Permalink
fix(level): change split key range right key to use ts=0 (#1932)
Browse files Browse the repository at this point in the history
## Problem
This old implementation is not incorrect, but it does not seem to follow
the logic in its description. When a compaction is split into
subcompaction jobs, each of them builds an iterator on all tables
touched by the compaction and iterates over a certain `keyRange` defined
in `addSplits`. In combination, they covers the key range of all the
tables in the compaction. The right key of a key range is intended to be
the right key of a bottom level table (`cd,bot`), It is intended to
include all different versions of that key, as described in the
comments. However, using `math.MaxUint64` will exclude those keys from
this `keyRange` and include them in the next split. The reason is that
the timestamp is encoded as `math.MaUint64-ts` in `y.KeyWithTs`, so a
key with larger ts is actually smaller in `y.CompareKeys`. It can be
corrected by using ts=0. Note that even using ts=math.MaxUint64 is not
going to drop keys unlike what the comments above suggest. because those
keys are covered in the subsequent split and the last split have an
empty right key, iterating till the end of the tables being compacted.

## Solution
Changed the timestamp used for split key range right key from
`math.MaxUint64` to `0`. Updated the comments above it.
  • Loading branch information
hugy718 authored May 9, 2023
1 parent 599ccc6 commit ef0e552
Showing 1 changed file with 5 additions and 6 deletions.
11 changes: 5 additions & 6 deletions levels.go
Original file line number Diff line number Diff line change
Expand Up @@ -1073,14 +1073,13 @@ func (s *levelsController) addSplits(cd *compactDef) {
return
}
if i%width == width-1 {
// Right should always have ts=maxUint64 otherwise we'll lose keys
// in subcompaction. Consider the following.
// Right is assigned ts=0. The encoding ts bytes take MaxUint64-ts,
// so, those with smaller TS will be considered larger for the same key.
// Consider the following.
// Top table is [A1...C3(deleted)]
// bot table is [B1....C2]
// This will generate splits like [A1 ... C2] . Notice that we
// dropped the C3 which is the last key of the top table.
// See TestCompaction/with_split test.
right := y.KeyWithTs(y.ParseKey(t.Biggest()), math.MaxUint64)
// It will generate a split [A1 ... C0], including any records of Key C.
right := y.KeyWithTs(y.ParseKey(t.Biggest()), 0)
addRange(right)
}
}
Expand Down

0 comments on commit ef0e552

Please sign in to comment.