Skip to content

Commit

Permalink
storage,kv: faster intent resolution over a key range
Browse files Browse the repository at this point in the history
When all the intents on a range are known to be separated,
as indicated by MCCStats, and a ResolveIntentRangeRequest
needs to be processed (for a transaction that did enough
writes that we did not track individual intent keys), we
avoid iterating over MVCC key-value pairs to find those
intents, and instead iterate over the separated lock table
key space.

A new iterForKeyVersions interface is introduced which
provides a narrow iterator interface to be used by
mvccResolveWriteIntent that resolves individual intents.
This interface is only for iterating over a single key,
and is implemented by the usual MVCCIterator and by the
new separatedIntentAndVersionIter. The latter serves
both as a way to find intents over a range and for
individual intent resolution, in an optimized manner.

There are new benchmarks which vary the sparseness
(sparseness of N means that 1 of every N key has an
intent that is for the txn for which intent resolution
is being performed) and whether the remaining N-1 keys
have an intent from a different transaction
(other-txn-intents=true) or not (the former results
in the cost of at least parsing the MVCCMetadata to
compare txn IDs).

Full results are below. Two things to note are:
- The percent-flushed={0,50} sometimes show a regression.
  These are atypical cases where the SingleDelete to
  remove the intent has not been flushed so the intent
  Set and SingleDelete are both alive. This would be
  a pathalogical case when the LSM is unhealthy. Even
  then the regression is < 170%.
- The percent-flushed=100 cases are the typical one.
  Once sparseness increases to 100 or 1000 we see speed
  gains close to 100x, as indicated by the following:

```
IntentRangeResolution/separated=true/versions=10/sparseness=1/other-txn-intents=false/percent-flushed=100-16        235µs ± 2%      58µs ± 3%   -75.52%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=false/percent-flushed=100-16      154µs ± 2%       2µs ± 4%   -98.72%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=false/percent-flushed=100-16     137µs ± 1%       1µs ± 2%   -99.07%  (p=0.008 n=5+5)
```

Full results:
```
name                                                                                                             old time/op    new time/op    delta
IntentRangeResolution/separated=true/versions=10/sparseness=1/other-txn-intents=false/percent-flushed=0-16          450µs ± 5%     234µs ± 2%   -48.07%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1/other-txn-intents=false/percent-flushed=50-16         294µs ± 3%     101µs ± 4%   -65.49%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1/other-txn-intents=false/percent-flushed=100-16        235µs ± 2%      58µs ± 3%   -75.52%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=false/percent-flushed=0-16        494µs ± 7%     218µs ± 6%   -55.84%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=false/percent-flushed=50-16       251µs ± 2%      57µs ± 3%   -77.29%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=false/percent-flushed=100-16      154µs ± 2%       2µs ± 4%   -98.72%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=true/percent-flushed=0-16         343µs ± 5%     236µs ± 2%   -31.28%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=true/percent-flushed=50-16        240µs ± 1%      74µs ± 2%   -69.02%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=true/percent-flushed=100-16       203µs ± 1%      30µs ± 1%   -84.97%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=false/percent-flushed=0-16       516µs ± 4%     274µs ±13%   -46.91%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=false/percent-flushed=50-16      256µs ± 2%      57µs ± 1%   -77.67%  (p=0.016 n=4+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=false/percent-flushed=100-16     137µs ± 1%       1µs ± 2%   -99.07%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=true/percent-flushed=0-16        325µs ± 2%     237µs ± 1%   -26.98%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=true/percent-flushed=50-16       232µs ± 4%      72µs ± 1%   -68.89%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=true/percent-flushed=100-16      199µs ± 1%      30µs ± 0%   -84.87%  (p=0.016 n=5+4)
IntentRangeResolution/separated=true/versions=100/sparseness=1/other-txn-intents=false/percent-flushed=0-16        2.08ms ± 1%    2.68ms ± 8%   +28.57%  (p=0.016 n=4+5)
IntentRangeResolution/separated=true/versions=100/sparseness=1/other-txn-intents=false/percent-flushed=50-16        592µs ± 4%     596µs ± 7%      ~     (p=1.000 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=1/other-txn-intents=false/percent-flushed=100-16       242µs ± 2%      59µs ± 3%   -75.73%  (p=0.016 n=4+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=false/percent-flushed=0-16      3.04ms ± 5%    2.60ms ± 3%   -14.59%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=false/percent-flushed=50-16      840µs ± 4%     539µs ± 3%   -35.87%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=false/percent-flushed=100-16     163µs ± 6%       2µs ± 6%   -98.75%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=true/percent-flushed=0-16        936µs ±38%    2497µs ±12%  +166.83%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=true/percent-flushed=50-16       281µs ±12%     551µs ± 2%   +95.64%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=true/percent-flushed=100-16      209µs ±13%      31µs ± 9%   -84.95%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=1000/other-txn-intents=false/percent-flushed=0-16     3.04ms ± 6%    2.55ms ± 0%      ~     (p=0.333 n=5+1)

name                                                                                                             old alloc/op   new alloc/op   delta
IntentRangeResolution/separated=true/versions=10/sparseness=1/other-txn-intents=false/percent-flushed=0-16         12.8kB ± 0%    13.0kB ± 0%    +0.92%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1/other-txn-intents=false/percent-flushed=50-16        12.8kB ± 0%    13.0kB ± 0%    +0.94%  (p=0.016 n=5+4)
IntentRangeResolution/separated=true/versions=10/sparseness=1/other-txn-intents=false/percent-flushed=100-16       12.9kB ± 0%    13.0kB ± 0%    +0.59%  (p=0.016 n=5+4)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=false/percent-flushed=0-16       1.75kB ± 0%    0.29kB ± 0%      ~     (p=0.079 n=4+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=false/percent-flushed=50-16      1.75kB ± 0%    0.29kB ± 0%   -83.18%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=false/percent-flushed=100-16     1.75kB ± 0%    0.29kB ± 0%   -83.22%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=true/percent-flushed=0-16        12.8kB ± 0%    11.4kB ± 0%   -11.37%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=true/percent-flushed=50-16       12.8kB ± 0%    11.4kB ± 0%      ~     (p=0.079 n=4+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=true/percent-flushed=100-16      12.8kB ± 0%    11.4kB ± 0%   -11.41%  (p=0.000 n=5+4)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=false/percent-flushed=0-16      1.66kB ± 0%    0.17kB ± 0%   -89.61%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=false/percent-flushed=50-16     1.66kB ± 0%    0.17kB ± 0%   -89.61%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=false/percent-flushed=100-16    1.66kB ± 0%    0.17kB ± 0%   -89.63%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=true/percent-flushed=0-16       12.8kB ± 0%    11.4kB ± 0%   -11.50%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=true/percent-flushed=50-16      12.8kB ± 0%    11.4kB ± 0%      ~     (p=0.079 n=4+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=true/percent-flushed=100-16     12.8kB ± 0%    11.4kB ± 0%   -11.53%  (p=0.029 n=4+4)
IntentRangeResolution/separated=true/versions=100/sparseness=1/other-txn-intents=false/percent-flushed=0-16        12.8kB ± 0%    13.0kB ± 0%    +0.95%  (p=0.016 n=5+4)
IntentRangeResolution/separated=true/versions=100/sparseness=1/other-txn-intents=false/percent-flushed=50-16       12.8kB ± 0%    13.0kB ± 0%    +0.95%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=1/other-txn-intents=false/percent-flushed=100-16      12.9kB ± 0%    13.0kB ± 0%    +0.54%  (p=0.016 n=5+4)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=false/percent-flushed=0-16      1.75kB ± 0%    0.29kB ± 0%   -83.14%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=false/percent-flushed=50-16     1.75kB ± 0%    0.29kB ± 0%   -83.19%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=false/percent-flushed=100-16    1.75kB ± 0%    0.29kB ± 0%   -83.22%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=true/percent-flushed=0-16       12.8kB ± 0%    11.4kB ± 0%   -11.37%  (p=0.000 n=5+4)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=true/percent-flushed=50-16      12.8kB ± 0%    11.4kB ± 0%   -11.37%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=true/percent-flushed=100-16     12.8kB ± 0%    11.4kB ± 0%   -11.41%  (p=0.000 n=5+4)
IntentRangeResolution/separated=true/versions=100/sparseness=1000/other-txn-intents=false/percent-flushed=0-16     1.66kB ± 0%    0.17kB ± 0%   -89.56%  (p=0.000 n=5+1)

name                                                                                                             old allocs/op  new allocs/op  delta
IntentRangeResolution/separated=true/versions=10/sparseness=1/other-txn-intents=false/percent-flushed=0-16            401 ± 0%       404 ± 0%    +0.75%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1/other-txn-intents=false/percent-flushed=50-16           401 ± 0%       404 ± 0%    +0.75%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1/other-txn-intents=false/percent-flushed=100-16          401 ± 0%       404 ± 0%    +0.75%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=false/percent-flushed=0-16          104 ± 0%         8 ± 0%   -92.31%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=false/percent-flushed=50-16         104 ± 0%         8 ± 0%   -92.31%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=false/percent-flushed=100-16        104 ± 0%         8 ± 0%   -92.31%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=true/percent-flushed=0-16           401 ± 0%       305 ± 0%   -23.94%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=true/percent-flushed=50-16          401 ± 0%       305 ± 0%   -23.94%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=100/other-txn-intents=true/percent-flushed=100-16         401 ± 0%       305 ± 0%   -23.94%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=false/percent-flushed=0-16         102 ± 0%         3 ± 0%   -97.06%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=false/percent-flushed=50-16        102 ± 0%         3 ± 0%   -97.06%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=false/percent-flushed=100-16       102 ± 0%         3 ± 0%   -97.06%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=true/percent-flushed=0-16          401 ± 0%       303 ± 0%   -24.44%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=true/percent-flushed=50-16         401 ± 0%       303 ± 0%   -24.44%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=10/sparseness=1000/other-txn-intents=true/percent-flushed=100-16        401 ± 0%       303 ± 0%   -24.44%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=1/other-txn-intents=false/percent-flushed=0-16           401 ± 0%       404 ± 0%    +0.75%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=1/other-txn-intents=false/percent-flushed=50-16          401 ± 0%       404 ± 0%    +0.75%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=1/other-txn-intents=false/percent-flushed=100-16         401 ± 0%       404 ± 0%    +0.75%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=false/percent-flushed=0-16         104 ± 0%         8 ± 0%   -92.31%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=false/percent-flushed=50-16        104 ± 0%         8 ± 0%   -92.31%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=false/percent-flushed=100-16       104 ± 0%         8 ± 0%   -92.31%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=true/percent-flushed=0-16          401 ± 0%       305 ± 0%   -23.94%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=true/percent-flushed=50-16         401 ± 0%       305 ± 0%   -23.94%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=100/other-txn-intents=true/percent-flushed=100-16        401 ± 0%       305 ± 0%   -23.94%  (p=0.008 n=5+5)
IntentRangeResolution/separated=true/versions=100/sparseness=1000/other-txn-intents=false/percent-flushed=0-16        102 ± 0%         3 ± 0%   -97.06%  (p=0.000 n=5+1)
```

Release note (performance improvement): Intent resolution for transactions
that write many intents such that we track intent ranges, for the purpose
of intent resolution, is much faster (potentially 100x) when using the
separated lock table.

Release justification: Fix for high-severity poor performance in existing
functionality, and is a low risk, high benefit change, that potentially
improves ranged intent resolution speed by 100x.
  • Loading branch information
sumeerbhola committed Aug 30, 2021
1 parent cb6889c commit f2ee987
Show file tree
Hide file tree
Showing 16 changed files with 648 additions and 128 deletions.
2 changes: 1 addition & 1 deletion pkg/kv/kvserver/batch_spanset_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -596,7 +596,7 @@ func TestSpanSetMVCCResolveWriteIntentRange(t *testing.T) {
Status: roachpb.PENDING,
}
if _, _, err := storage.MVCCResolveWriteIntentRange(
ctx, batch, nil /* ms */, intent, 0,
ctx, batch, nil /* ms */, intent, 0, eng.IsSeparatedIntentsEnabledForTesting(ctx),
); err != nil {
t.Fatal(err)
}
Expand Down
1 change: 1 addition & 0 deletions pkg/kv/kvserver/batcheval/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -116,6 +116,7 @@ go_test(
embed = [":batcheval"],
deps = [
"//pkg/base",
"//pkg/clusterversion",
"//pkg/keys",
"//pkg/kv",
"//pkg/kv/kvserver",
Expand Down
9 changes: 8 additions & 1 deletion pkg/kv/kvserver/batcheval/cmd_end_transaction.go
Original file line number Diff line number Diff line change
Expand Up @@ -456,6 +456,13 @@ func resolveLocalLocks(
// These transactions rely on having their locks resolved synchronously.
resolveAllowance = math.MaxInt64
}
onlySeparatedIntents := false
st := evalCtx.ClusterSettings()
// Some tests have st == nil.
if st != nil {
onlySeparatedIntents = st.Version.ActiveVersionOrEmpty(ctx).IsActive(
clusterversion.PostSeparatedIntentsMigration)
}
for _, span := range args.LockSpans {
if err := func() error {
if resolveAllowance == 0 {
Expand Down Expand Up @@ -496,7 +503,7 @@ func resolveLocalLocks(
if inSpan != nil {
update.Span = *inSpan
num, resumeSpan, err := storage.MVCCResolveWriteIntentRange(
ctx, readWriter, ms, update, resolveAllowance)
ctx, readWriter, ms, update, resolveAllowance, onlySeparatedIntents)
if err != nil {
return err
}
Expand Down
6 changes: 5 additions & 1 deletion pkg/kv/kvserver/batcheval/cmd_resolve_intent_range.go
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ package batcheval
import (
"context"

"github.com/cockroachdb/cockroach/pkg/clusterversion"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval/result"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset"
"github.com/cockroachdb/cockroach/pkg/roachpb"
Expand Down Expand Up @@ -44,8 +45,11 @@ func ResolveIntentRange(

update := args.AsLockUpdate()

onlySeparatedIntents :=
cArgs.EvalCtx.ClusterSettings().Version.ActiveVersionOrEmpty(ctx).IsActive(
clusterversion.PostSeparatedIntentsMigration)
numKeys, resumeSpan, err := storage.MVCCResolveWriteIntentRange(
ctx, readWriter, ms, update, h.MaxSpanRequestKeys)
ctx, readWriter, ms, update, h.MaxSpanRequestKeys, onlySeparatedIntents)
if err != nil {
return result.Result{}, err
}
Expand Down
36 changes: 33 additions & 3 deletions pkg/kv/kvserver/batcheval/cmd_resolve_intent_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -16,9 +16,12 @@ import (
"strings"
"testing"

"github.com/cockroachdb/cockroach/pkg/clusterversion"
"github.com/cockroachdb/cockroach/pkg/keys"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver/abortspan"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset"
"github.com/cockroachdb/cockroach/pkg/roachpb"
"github.com/cockroachdb/cockroach/pkg/settings/cluster"
"github.com/cockroachdb/cockroach/pkg/storage"
"github.com/cockroachdb/cockroach/pkg/storage/enginepb"
"github.com/cockroachdb/cockroach/pkg/testutils"
Expand Down Expand Up @@ -76,6 +79,7 @@ func TestDeclareKeysResolveIntent(t *testing.T) {
}
ctx := context.Background()
engine := storage.NewDefaultInMemForTesting()
st := makeClusterSettingsUsingEngineIntentsSetting(engine)
defer engine.Close()
testutils.RunTrueAndFalse(t, "ranged", func(t *testing.T, ranged bool) {
for _, test := range tests {
Expand Down Expand Up @@ -105,7 +109,7 @@ func TestDeclareKeysResolveIntent(t *testing.T) {
h.RangeID = desc.RangeID

cArgs := CommandArgs{Header: h}
cArgs.EvalCtx = (&MockEvalCtx{AbortSpan: as}).EvalContext()
cArgs.EvalCtx = (&MockEvalCtx{ClusterSettings: st, AbortSpan: as}).EvalContext()

if !ranged {
cArgs.Args = &ri
Expand All @@ -116,6 +120,7 @@ func TestDeclareKeysResolveIntent(t *testing.T) {
} else {
cArgs.Args = &rir
declareKeysResolveIntentRange(&desc, h, &rir, &latchSpans, &lockSpans)
addLockTableSpansForRangedIntentResolution(rir, &latchSpans)
if _, err := ResolveIntentRange(ctx, batch, cArgs, &roachpb.ResolveIntentRangeResponse{}); err != nil {
t.Fatal(err)
}
Expand Down Expand Up @@ -157,6 +162,7 @@ func TestResolveIntentAfterPartialRollback(t *testing.T) {
defer db.Close()
batch := db.NewBatch()
defer batch.Close()
st := makeClusterSettingsUsingEngineIntentsSetting(db)

var v roachpb.Value
// Write a first value at key.
Expand Down Expand Up @@ -185,6 +191,9 @@ func TestResolveIntentAfterPartialRollback(t *testing.T) {

var spans spanset.SpanSet
rbatch := db.NewBatch()
// The spans will be used for validating that reads and writes are
// consistent with the declared spans. We initialize spans below, before
// performing reads and writes.
rbatch = spanset.NewBatch(rbatch, &spans)
defer rbatch.Close()

Expand All @@ -202,7 +211,7 @@ func TestResolveIntentAfterPartialRollback(t *testing.T) {
if _, err := ResolveIntent(ctx, rbatch,
CommandArgs{
Header: h,
EvalCtx: (&MockEvalCtx{}).EvalContext(),
EvalCtx: (&MockEvalCtx{ClusterSettings: st}).EvalContext(),
Args: &ri,
},
&roachpb.ResolveIntentResponse{},
Expand All @@ -220,12 +229,13 @@ func TestResolveIntentAfterPartialRollback(t *testing.T) {
rir.EndKey = endKey

declareKeysResolveIntentRange(&desc, h, &rir, &spans, nil)
addLockTableSpansForRangedIntentResolution(rir, &spans)

h.MaxSpanRequestKeys = 10
if _, err := ResolveIntentRange(ctx, rbatch,
CommandArgs{
Header: h,
EvalCtx: (&MockEvalCtx{}).EvalContext(),
EvalCtx: (&MockEvalCtx{ClusterSettings: st}).EvalContext(),
Args: &rir,
},
&roachpb.ResolveIntentRangeResponse{},
Expand Down Expand Up @@ -261,3 +271,23 @@ func TestResolveIntentAfterPartialRollback(t *testing.T) {
}
})
}

func makeClusterSettingsUsingEngineIntentsSetting(engine storage.Engine) *cluster.Settings {
version := clusterversion.TestingBinaryVersion
if !engine.IsSeparatedIntentsEnabledForTesting(context.Background()) {
// Before SeparatedIntentsMigration, so intent resolution will not assume
// only separated intents.
version = clusterversion.ByKey(clusterversion.SeparatedIntentsMigration - 1)
}
return cluster.MakeTestingClusterSettingsWithVersions(version, version, true)
}

func addLockTableSpansForRangedIntentResolution(
rir roachpb.ResolveIntentRangeRequest, spans *spanset.SpanSet,
) {
// This is similar to the span logic in Replica.newBatchedEngine.
start, _ := keys.LockTableSingleKey(rir.Key, nil)
end, _ := keys.LockTableSingleKey(rir.EndKey, nil)
spans.AddNonMVCC(spanset.SpanReadWrite, roachpb.Span{
Key: start, EndKey: end})
}
6 changes: 4 additions & 2 deletions pkg/kv/kvserver/batcheval/transaction_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -734,9 +734,11 @@ func TestUpdateAbortSpan(t *testing.T) {
defer db.Close()
batch := db.NewBatch()
defer batch.Close()
st := makeClusterSettingsUsingEngineIntentsSetting(db)
evalCtx := &MockEvalCtx{
Desc: &desc,
AbortSpan: as,
ClusterSettings: st,
Desc: &desc,
AbortSpan: as,
CanCreateTxn: func() (bool, hlc.Timestamp, roachpb.TransactionAbortedReason) {
return true, hlc.Timestamp{}, 0
},
Expand Down
10 changes: 8 additions & 2 deletions pkg/kv/kvserver/replica_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -199,7 +199,8 @@ func (tc *testContext) Clock() *hlc.Clock {
// entire keyspace.
func (tc *testContext) Start(t testing.TB, stopper *stop.Stopper) {
tc.manualClock = hlc.NewManualClock(123)
cfg := TestStoreConfig(hlc.NewClock(tc.manualClock.UnixNano, time.Nanosecond))
cfg := TestStoreConfigWithRandomizedClusterSeparatedIntentsMigration(
hlc.NewClock(tc.manualClock.UnixNano, time.Nanosecond))
// testContext tests like to move the manual clock around and assume that they can write at past
// timestamps.
cfg.TestingKnobs.DontCloseTimestamps = true
Expand Down Expand Up @@ -233,12 +234,17 @@ func (tc *testContext) StartWithStoreConfigAndVersion(
tc.gossip = gossip.NewTest(1, rpcContext, server, stopper, metric.NewRegistry(), cfg.DefaultZoneConfig)
}
if tc.engine == nil {
disableSeparatedIntents :=
!cfg.Settings.Version.ActiveVersionOrEmpty(context.Background()).IsActive(
clusterversion.PostSeparatedIntentsMigration)
log.Infof(context.Background(), "engine creation is randomly setting disableSeparatedIntents: %t",
disableSeparatedIntents)
var err error
tc.engine, err = storage.Open(context.Background(),
storage.InMemory(),
storage.Attributes(roachpb.Attributes{Attrs: []string{"dc1", "mem"}}),
storage.MaxSize(1<<20),
storage.ForTesting)
storage.SetSeparatedIntents(disableSeparatedIntents))
if err != nil {
t.Fatal(err)
}
Expand Down
29 changes: 26 additions & 3 deletions pkg/kv/kvserver/replica_write.go
Original file line number Diff line number Diff line change
Expand Up @@ -675,7 +675,7 @@ func (r *Replica) evaluateWriteBatchWrapper(
lul hlc.Timestamp,
latchSpans, lockSpans *spanset.SpanSet,
) (storage.Batch, *roachpb.BatchResponse, result.Result, *roachpb.Error) {
batch, opLogger := r.newBatchedEngine(latchSpans, lockSpans)
batch, opLogger := r.newBatchedEngine(ba, latchSpans, lockSpans)
br, res, pErr := evaluateBatch(ctx, idKey, batch, rec, ms, ba, lul, false /* readOnly */)
if pErr == nil {
if opLogger != nil {
Expand All @@ -692,7 +692,7 @@ func (r *Replica) evaluateWriteBatchWrapper(
// OpLogger is attached to the returned engine.Batch, recording all operations.
// Its recording should be attached to the Result of request evaluation.
func (r *Replica) newBatchedEngine(
latchSpans, lockSpans *spanset.SpanSet,
ba *roachpb.BatchRequest, latchSpans, lockSpans *spanset.SpanSet,
) (storage.Batch, *storage.OpLoggerBatch) {
batch := r.store.Engine().NewBatch()
if !batch.ConsistentIterators() {
Expand Down Expand Up @@ -741,14 +741,37 @@ func (r *Replica) newBatchedEngine(
// To account for separated intent accesses, we translate the lock spans
// to lock table spans.
spans := latchSpans.Copy()
lockSpans.Iterate(func(sa spanset.SpanAccess, _ spanset.SpanScope, span spanset.Span) {
addLockTableSpan := func(sa spanset.SpanAccess, span spanset.Span) {
ltKey, _ := keys.LockTableSingleKey(span.Key, nil)
var ltEndKey roachpb.Key
if span.EndKey != nil {
ltEndKey, _ = keys.LockTableSingleKey(span.EndKey, nil)
}
spans.AddNonMVCC(sa, roachpb.Span{Key: ltKey, EndKey: ltEndKey})
}
lockSpans.Iterate(func(sa spanset.SpanAccess, _ spanset.SpanScope, span spanset.Span) {
addLockTableSpan(sa, span)
})
// The lock spans are insufficient for ranged intent resolution, which
// does not declare lock spans and directly calls
// spanSetBatch.NewEngineIterator.
//
// TODO(sumeer): we can't keep adding additional cases here -- come up
// with something cleaner.
for _, union := range ba.Requests {
inner := union.GetInner()
switch req := inner.(type) {
case *roachpb.ResolveIntentRangeRequest:
span := req.Span()
addLockTableSpan(spanset.SpanReadWrite, spanset.Span{Span: span})
case *roachpb.EndTxnRequest:
// EndTxnRequest does local intent resolution. We don't know the
// spans up front so we just allow everything.
spans.AddNonMVCC(spanset.SpanReadWrite, roachpb.Span{
Key: keys.LockTableSingleKeyStart, EndKey: keys.LockTableSingleKeyEnd})
}
}

// During writes we may encounter a versioned value newer than the request
// timestamp, and may have to retry at a higher timestamp. This is still
// safe as we're only ever writing at timestamps higher than the timestamp
Expand Down
20 changes: 19 additions & 1 deletion pkg/kv/kvserver/store.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ import (
"context"
"fmt"
"math"
"math/rand"
"path/filepath"
"runtime"
"sort"
Expand Down Expand Up @@ -195,10 +196,14 @@ var ExportRequestsLimit = settings.RegisterIntSetting(

// TestStoreConfig has some fields initialized with values relevant in tests.
func TestStoreConfig(clock *hlc.Clock) StoreConfig {
return testStoreConfig(clock, clusterversion.TestingBinaryVersion)
}

func testStoreConfig(clock *hlc.Clock, version roachpb.Version) StoreConfig {
if clock == nil {
clock = hlc.NewClock(hlc.UnixNano, time.Nanosecond)
}
st := cluster.MakeTestingClusterSettings()
st := cluster.MakeTestingClusterSettingsWithVersions(version, version, true)
sc := StoreConfig{
DefaultZoneConfig: zonepb.DefaultZoneConfigRef(),
DefaultSystemZoneConfig: zonepb.DefaultSystemZoneConfigRef(),
Expand All @@ -221,6 +226,19 @@ func TestStoreConfig(clock *hlc.Clock) StoreConfig {
return sc
}

// TestStoreConfigWithRandomizedClusterSeparatedIntentsMigration randomizes
// the StoreConfig to be before or after completion of the separated intents
// migration.
func TestStoreConfigWithRandomizedClusterSeparatedIntentsMigration(clock *hlc.Clock) StoreConfig {
version := clusterversion.TestingBinaryVersion
if rand.Intn(2) == 0 {
// This is before SeparatedIntentsMigration, so we may have interleaved
// intents.
version = clusterversion.ByKey(clusterversion.SeparatedIntentsMigration - 1)
}
return testStoreConfig(clock, version)
}

func newRaftConfig(
strg raft.Storage, id uint64, appliedIndex uint64, storeCfg StoreConfig, logger raft.Logger,
) *raft.Config {
Expand Down
3 changes: 2 additions & 1 deletion pkg/server/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -522,7 +522,8 @@ func (cfg *Config) CreateEngines(ctx context.Context) (Engines, error) {
storage.CacheSize(cfg.CacheSize),
storage.MaxSize(sizeInBytes),
storage.EncryptionAtRest(spec.EncryptionOptions),
storage.Settings(cfg.Settings))
storage.Settings(cfg.Settings),
storage.SetSeparatedIntents(disableSeparatedIntents))
if err != nil {
return Engines{}, err
}
Expand Down
Loading

0 comments on commit f2ee987

Please sign in to comment.