We enable the following ability (the first half of cockroachdb#82896):
Pick a stmt fingerprint, declare a sampling probability which controls
when verbose tracing is enabled for it, and a latency threshold above
which a trace is persisted.
With a given stmt rate R (say 1000/s) and a given percentile we're trying
to capture (say p99.9), we have 0.001R stmt/s in the 99.9th percentile
(1/s in our example). We should be able to set a sampling probability P
such that with high likelihood (>95%) we capture at least one trace over
the next S seconds. The sampling probability lets you control the
overhead you're introducing for those statements in aggregate; dialing
it up lets you lower S, which you might want to do for infrequently
executed statements.

We do all this using the existing statement diagnostics machinery. It
looks roughly as follows:
> SELECT crdb_internal.request_statement_bundle(
    'INSERT INTO ...', -- fingerprint
    0.01,              -- sampling probability
    '120ms'::INTERVAL, -- latency target
    '15m'::INTERVAL    -- request expiration
  );
$ cockroach statement-diag list --insecure
No statement diagnostics bundles available.
Outstanding activation requests:
  ID                  Statement        Sampling probability  Min latency
  770367610894417921  INSERT INTO ...  0.0100                90ms
# wait for an eventual capture..
$ cockroach statement-diag list --insecure
Statement diagnostics bundles:
  ID                  Collection time          Statement
  770367073624621057  2022-06-14 00:49:33 UTC  INSERT INTO ...
$ cockroach statement-diag download 770367073624621057 --insecure
Bundle saved to "stmt-bundle-770367073624621057.zip"
$ unzip stmt-bundle-770367073624621057.zip -d stmt
$ cat stmt/trace.txt
...
0.846ms 0.017ms event:kv/kvserver/spanlatch/manager.go:532
[n1,s1,r76/1:/Table/10{7/1-8}] waiting to
acquire read latch /Table/107/1/41/7/0@0,0,
held by write latch
/Table/107/1/41/7/0@1655167773.362147000,0
98.776ms 97.930ms event:kv/kvserver/concurrency/concurrency_manager.go:301
[n1,s1,r76/1:/Table/10{7/1-8}] scanning
lock table for conflicting locks
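
To make the sampling arithmetic above concrete, here is a rough
back-of-the-envelope sketch. It is not part of this change and the
function names are hypothetical. It assumes each execution is sampled
independently with probability P, and it treats the number of tail
executions in S seconds as its expected value (tail rate times S), which
is only an approximation for bursty workloads.

```python
import math


def capture_probability(stmt_rate, tail_fraction, sampling_prob, seconds):
    """Chance that at least one tail execution is sampled within `seconds`.

    Assumes independent sampling and a steady tail rate of
    stmt_rate * tail_fraction executions per second.
    """
    tail_execs = stmt_rate * tail_fraction * seconds
    return 1.0 - (1.0 - sampling_prob) ** tail_execs


def min_sampling_probability(stmt_rate, tail_fraction, seconds, confidence=0.95):
    """Smallest P that reaches `confidence` within `seconds`."""
    tail_execs = stmt_rate * tail_fraction * seconds
    return 1.0 - (1.0 - confidence) ** (1.0 / tail_execs)


def min_wait_seconds(stmt_rate, tail_fraction, sampling_prob, confidence=0.95):
    """Smallest S that reaches `confidence` for a given P (0 < P < 1)."""
    if not 0.0 < sampling_prob < 1.0:
        raise ValueError("sampling_prob must be strictly between 0 and 1")
    tail_rate = stmt_rate * tail_fraction
    return math.log(1.0 - confidence) / (tail_rate * math.log(1.0 - sampling_prob))


if __name__ == "__main__":
    # Numbers from the description: 1000 stmt/s, p99.9 tail, P = 0.01.
    print(min_wait_seconds(1000, 0.001, 0.01))
    print(min_sampling_probability(1000, 0.001, 300))
    print(capture_probability(1000, 0.001, 0.01, 15 * 60))
```

Under these assumptions, the example values (1000 stmt/s, p99.9,
P = 0.01) need roughly 298 seconds to reach a 95% chance of sampling at
least one tail execution, comfortably inside the 15m request expiration.
Raising P shortens that wait at the cost of tracing more executions.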
We leave wiring this up to the UI and continuous capture (second half of
cockroachdb#82896) to future PRs.
Release note: None