-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tracing: perf regression due to trace redactability #71694
Comments
70792: server: expect duplicate session cookies on HTTP requests r=catj-cockroach,knz a=dhartunian Previously, the golang HTTP library was used to retrieve a cookie with the specific name we use to store our authentication token. This API method retrieves the *first* matching cookie in the header and ignores the rest. This can cause issues with requests that contain multiple matching cookies with the same name (if other applications are potentially running on the same domain). This change adjusts the HTTP API we use to retrieve cookies (we now prefer to call `req.Cookies()` and iterate through the entire list ourselves, instead of relying on the API to give us the matching cookie since it only supports returning a single result. Resolves #70376. Release note (security update): Authenticated HTTP requests to nodes can now contain additional cookies with the same name as the one we use ("session"). Duplicates like this are permitted by the HTTP spec and our code will now attempt to parse all cookies with a matching name before giving up. This can resolve issues with customers who run other services on the same domain as their CRDB nodes. 71503: server: Tenant pods fan out to cancel SQL queries r=matthewtodd a=matthewtodd Partially addresses #70832. Previously, the tenant `/_status/cancel_query` endpoint was not available over HTTP and could not cancel a query not running locally. This change opens it up, so that we can call it from the cloud console, and makes sure that that we can cancel queries when a tenant has more than one SQL pod running. Release note (api change): The tenant `/_status/cancel_query` endpoint was made available over HTTP and augmented to fan out requests in a multi-pod world. 71677: row: pass in account instead of monitor for KV fetcher r=yuzefovich a=yuzefovich This commit switches the KV fetcher to expect a memory account rather than a memory monitor. It additionally creates a separate memory account for the KV fetcher when it is used by the cFetcher (previously, we were reusing the monitor from the allocator). Note that we don't need to update multiple places where we call methods on possibly nil memory account because all methods have been taught to be noops when the receiver is `nil`. Fixes: #55481. Release note: None 71698: roachtest: add kv95/enc=false/nodes=3/tracing r=[dhartunian,stevendanna] a=tbg We don't have a good grasp on what it means for performance to turn on tracing globally. A kv95 (read-mostly) workload is a good baseline here as we expect it to be among the most severely impacted. Touches #71694. Touches #58610. Release note: None Co-authored-by: David Hartunian <davidh@cockroachlabs.com> Co-authored-by: Matthew Todd <todd@cockroachlabs.com> Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
After discussing with @knz and @andreimatei we've decided on the following two-part solution to this problem:
|
After discussing this issue with more folks, it seems that the original need for #71606 is not as strong as initially assumed. We do not want to keep a "wall" between tenants and KV when tracing and customers can expect to learn and interpret information from the KV layer to debug query execution on multi-tenant clusters. In that spirit, I will be simplifying the immediate work on the Separately, we will continue to pursue trace redaction on |
I imagine that the original reason we panicked about these recording coming from tenants was about RPCs that internally cross range and tenant boundaries. Like I'm thinking a scan that internally does a PushTxn on a system transaction, or even on a transaction from the same tenant but where we have to scan the meta ranges to figure out where the respective txn record is. Is all this a ruse? |
This reverts commit fc95ad8. This PR restores the ability for tenant traces to see unstructured recordings from the KV layer. Please see note in cockroachdb#71694: cockroachdb#71694 (comment) The need to enforce such strict redaction of traces for tenant queries is not necessary. Customer views into the KV layer will be necessary in order to debug query execution in multi-tenant settings. Redaction is certainly needed at this boundary but must be configured more carefully. Further PRs as necessitated by cockroachdb#58610 and cockroachdb#71694 will tackle these issues on master (where trace redaction is still available) and will be backported to the release branch as needed. This PR restores tracing into KV for tenants and unblocks the ability for serverless to use the release-21.2 branch. Release note (ops change): Tracing queries from tenants will now include unstructured log messages from the kv layer.
How viable is it to audit all |
We need redaction on that boundary for sure (if the proposal is to leak unredacted KV traces to tenants then I think we need to talk about this again, certainly our security team would want to sign off on that and I'd be surprised if they did). But we do not need to prevent redacted KV trace recordings from reaching the tenant. This isn't thought to be useful for users directly but it is important for our ability to support users, as we don't currently have infrastructure to capture these traces internally. |
Prior commits added the ability to set a flag on the Tracer to determine whether Span's should be marked redactable or not. This commit adds logic to selectively force redactability enabled on Span's for requests that come from a non-system tenant. The desired outcome is to preserve redactability for tenant traces while preserving a non-redactable code path for requests using the system tenant. This preserves existing functionality in production while keeping the current performance hit of redaction from propagating to non-tenant deployments. Resolves cockroachdb#71694 Release note: None
Prior commits added the ability to set a flag on the Tracer to determine whether Span's should be marked redactable or not. This commit adds logic to selectively force redactability enabled on Span's for requests that come from a non-system tenant. The desired outcome is to preserve redactability for tenant traces while preserving a non-redactable code path for requests using the system tenant. This preserves existing functionality in production while keeping the current performance hit of redaction from propagating to non-tenant deployments. Resolves cockroachdb#71694 Release note: None
Prior commits added the ability to set a flag on the Tracer to determine whether Span's should be marked redactable or not. This commit adds logic to selectively force redactability enabled on Span's for requests that come from a non-system tenant. The desired outcome is to preserve redactability for tenant traces while preserving a non-redactable code path for requests using the system tenant. This preserves existing functionality in production while keeping the current performance hit of redaction from propagating to non-tenant deployments. Resolves cockroachdb#71694 Release note: None
There’s been a lot of back and forth on how to address this ticket in light of Serverless needs on the Here’s a summary of conflicting needs:
What are we doing? The way forward is contained in #73117 which:
Consequences of the above PR:
Alternative paths considered
|
Prior commits added the ability to set a flag on the Tracer to determine whether Span's should be marked redactable or not. This commit adds logic to selectively force redactability enabled on Span's for requests that come from a non-system tenant. The desired outcome is to preserve redactability for tenant traces while preserving a non-redactable code path for requests using the system tenant. This preserves existing functionality in production while keeping the current performance hit of redaction from propagating to non-tenant deployments. Resolves cockroachdb#71694 Release note: None
Prior commits added the ability to set a flag on the Tracer to determine whether Span's should be marked redactable or not. This commit adds logic to selectively force redactability enabled on Span's for requests that come from a non-system tenant. The desired outcome is to preserve redactability for tenant traces while preserving a non-redactable code path for requests using the system tenant. This preserves existing functionality in production while keeping the current performance hit of redaction from propagating to non-tenant deployments. Resolves cockroachdb#71694 Release note: None
Prior commits added the ability to set a flag on the Tracer to determine whether Span's should be marked redactable or not. This commit adds logic to selectively force redactability enabled on Span's for requests that come from a non-system tenant. The desired outcome is to preserve redactability for tenant traces while preserving a non-redactable code path for requests using the system tenant. This preserves existing functionality in production while keeping the current performance hit of redaction from propagating to non-tenant deployments. Resolves cockroachdb#71694 Release note: None
Prior commits added the ability to set a flag on the Tracer to determine whether Span's should be marked redactable or not. This commit adds logic to selectively force redactability enabled on Spans for requests that come from a non-system tenant. The desired outcome is to preserve redactability for tenant traces while preserving a non-redactable code path for requests using the system tenant. This preserves existing functionality in production while keeping the current performance hit of fine grained redaction from propagating to non-tenant deployments. Resolves cockroachdb#71694 Release note: None
Prior commits added the ability to set a flag on the Tracer to determine whether Span's should be marked redactable or not. This commit adds logic to selectively force redactability enabled on Spans for requests that come from a non-system tenant. The desired outcome is to preserve redactability for tenant traces while preserving a non-redactable code path for requests using the system tenant. This preserves existing functionality in production while keeping the current performance hit of fine grained redaction from propagating to non-tenant deployments. Resolves cockroachdb#71694 Release note: None
Prior commits added the ability to set a flag on the Tracer to determine whether Span's should be marked redactable or not. This commit adds logic to selectively force redactability enabled on Spans for requests that come from a non-system tenant. The desired outcome is to preserve redactability for tenant traces while preserving a non-redactable code path for requests using the system tenant. This preserves existing functionality in production while keeping the current performance hit of fine grained redaction from propagating to non-tenant deployments. Benchmarking this against master results in the following improvements. Verbose redaction is enabled in the test via SET tracing = on. * verbose-redaction-always.txt refers to master @ a64af2c which has redaction turned on always * verbose-redaction-conditional-off.txt refers to the top of this PR @ 51a8c72 which implements conditional redaction and turns it off for the test We gain back a significant portion of the performance hit. ``` 10:39:06 ❯ benchstat verbose-redaction-always.txt verbose-redaction-conditional-off.txt name old time/op new time/op delta Tracing/1node/scan/trace=100%/redactable=on/verbose=on-8 890µs ± 8% 795µs ±10% -10.68% (p=0.000 n=10+10) Tracing/1node/insert/trace=100%/redactable=on/verbose=on-8 1.05ms ± 4% 0.95ms ± 1% -9.17% (p=0.000 n=9+8) Tracing/3node/scan/trace=100%/redactable=on/verbose=on-8 921µs ± 4% 875µs ± 9% -5.09% (p=0.006 n=9+9) Tracing/3node/insert/trace=100%/redactable=on/verbose=on-8 1.12ms ±21% 1.02ms ± 5% -9.35% (p=0.010 n=10+9) name old alloc/op new alloc/op delta Tracing/1node/scan/trace=100%/redactable=on/verbose=on-8 133kB ± 0% 135kB ± 0% +1.76% (p=0.000 n=9+9) Tracing/1node/insert/trace=100%/redactable=on/verbose=on-8 165kB ± 0% 164kB ± 0% -0.47% (p=0.000 n=10+9) Tracing/3node/scan/trace=100%/redactable=on/verbose=on-8 170kB ± 0% 170kB ± 1% ~ (p=0.796 n=10+10) Tracing/3node/insert/trace=100%/redactable=on/verbose=on-8 154kB ± 2% 155kB ± 1% ~ (p=0.796 n=10+10) name old allocs/op new allocs/op delta Tracing/1node/scan/trace=100%/redactable=on/verbose=on-8 2.00k ± 0% 1.76k ± 0% -12.02% (p=0.000 n=9+8) Tracing/1node/insert/trace=100%/redactable=on/verbose=on-8 2.36k ± 0% 2.01k ± 0% -14.64% (p=0.000 n=10+10) Tracing/3node/scan/trace=100%/redactable=on/verbose=on-8 1.92k ± 0% 1.81k ± 1% -5.69% (p=0.000 n=10+10) Tracing/3node/insert/trace=100%/redactable=on/verbose=on-8 1.68k ± 1% 1.56k ± 1% -7.45% (p=0.000 n=10+10) ``` Resolves cockroachdb#71694 Release note: None
Prior commits added the ability to set a flag on the Tracer to determine whether Span's should be marked redactable or not. This commit adds logic to selectively force redactability enabled on Spans for requests that come from a non-system tenant. The desired outcome is to preserve redactability for tenant traces while preserving a non-redactable code path for requests using the system tenant. This preserves existing functionality in production while keeping the current performance hit of fine grained redaction from propagating to non-tenant deployments. Benchmarking this against master results in the following improvements. Verbose redaction is enabled in the test via SET tracing = on. * verbose-redaction-always.txt refers to master @ a64af2c which has redaction turned on always * verbose-redaction-conditional-off.txt refers to the top of this PR @ 51a8c72 which implements conditional redaction and turns it off for the test We gain back a significant portion of the performance hit. ``` 10:39:06 ❯ benchstat verbose-redaction-always.txt verbose-redaction-conditional-off.txt name old time/op new time/op delta Tracing/1node/scan/trace=100%/redactable=on/verbose=on-8 890µs ± 8% 795µs ±10% -10.68% (p=0.000 n=10+10) Tracing/1node/insert/trace=100%/redactable=on/verbose=on-8 1.05ms ± 4% 0.95ms ± 1% -9.17% (p=0.000 n=9+8) Tracing/3node/scan/trace=100%/redactable=on/verbose=on-8 921µs ± 4% 875µs ± 9% -5.09% (p=0.006 n=9+9) Tracing/3node/insert/trace=100%/redactable=on/verbose=on-8 1.12ms ±21% 1.02ms ± 5% -9.35% (p=0.010 n=10+9) name old alloc/op new alloc/op delta Tracing/1node/scan/trace=100%/redactable=on/verbose=on-8 133kB ± 0% 135kB ± 0% +1.76% (p=0.000 n=9+9) Tracing/1node/insert/trace=100%/redactable=on/verbose=on-8 165kB ± 0% 164kB ± 0% -0.47% (p=0.000 n=10+9) Tracing/3node/scan/trace=100%/redactable=on/verbose=on-8 170kB ± 0% 170kB ± 1% ~ (p=0.796 n=10+10) Tracing/3node/insert/trace=100%/redactable=on/verbose=on-8 154kB ± 2% 155kB ± 1% ~ (p=0.796 n=10+10) name old allocs/op new allocs/op delta Tracing/1node/scan/trace=100%/redactable=on/verbose=on-8 2.00k ± 0% 1.76k ± 0% -12.02% (p=0.000 n=9+8) Tracing/1node/insert/trace=100%/redactable=on/verbose=on-8 2.36k ± 0% 2.01k ± 0% -14.64% (p=0.000 n=10+10) Tracing/3node/scan/trace=100%/redactable=on/verbose=on-8 1.92k ± 0% 1.81k ± 1% -5.69% (p=0.000 n=10+10) Tracing/3node/insert/trace=100%/redactable=on/verbose=on-8 1.68k ± 1% 1.56k ± 1% -7.45% (p=0.000 n=10+10) ``` Resolves cockroachdb#71694 Release note: None
Prior commits added the ability to set a flag on the Tracer to determine whether Span's should be marked redactable or not. This commit adds a cluster setting that will enable fine-grained trace redactability. The intended use is for multi-tenant deployments where KV traces should be redacted prior to being returned to tenants. Enabling the flag will enable fine-grained redaction so that tenant operators can continue to debug their queries via redacted kv trace information instead of having that information stripped out. The desired outcome is to preserve redactability for tenant traces while preserving a non-redactable code path for requests using the system tenant. This preserves existing functionality in production while keeping the current performance hit of fine grained redaction from propagating to non-tenant deployments. Benchmarking this against master results in the following improvements. Verbose redaction is enabled in the test via SET tracing = on. * verbose-redaction-always.txt refers to master @ a64af2c which has redaction turned on always * verbose-redaction-conditional-off.txt refers to the top of this PR @ 51a8c72 which implements conditional redaction and turns it off for the test We gain back a significant portion of the performance hit. ``` 10:39:06 ❯ benchstat verbose-redaction-always.txt verbose-redaction-conditional-off.txt name old time/op new time/op delta Tracing/1node/scan/trace=100%/redactable=on/verbose=on-8 890µs ± 8% 795µs ±10% -10.68% (p=0.000 n=10+10) Tracing/1node/insert/trace=100%/redactable=on/verbose=on-8 1.05ms ± 4% 0.95ms ± 1% -9.17% (p=0.000 n=9+8) Tracing/3node/scan/trace=100%/redactable=on/verbose=on-8 921µs ± 4% 875µs ± 9% -5.09% (p=0.006 n=9+9) Tracing/3node/insert/trace=100%/redactable=on/verbose=on-8 1.12ms ±21% 1.02ms ± 5% -9.35% (p=0.010 n=10+9) name old alloc/op new alloc/op delta Tracing/1node/scan/trace=100%/redactable=on/verbose=on-8 133kB ± 0% 135kB ± 0% +1.76% (p=0.000 n=9+9) Tracing/1node/insert/trace=100%/redactable=on/verbose=on-8 165kB ± 0% 164kB ± 0% -0.47% (p=0.000 n=10+9) Tracing/3node/scan/trace=100%/redactable=on/verbose=on-8 170kB ± 0% 170kB ± 1% ~ (p=0.796 n=10+10) Tracing/3node/insert/trace=100%/redactable=on/verbose=on-8 154kB ± 2% 155kB ± 1% ~ (p=0.796 n=10+10) name old allocs/op new allocs/op delta Tracing/1node/scan/trace=100%/redactable=on/verbose=on-8 2.00k ± 0% 1.76k ± 0% -12.02% (p=0.000 n=9+8) Tracing/1node/insert/trace=100%/redactable=on/verbose=on-8 2.36k ± 0% 2.01k ± 0% -14.64% (p=0.000 n=10+10) Tracing/3node/scan/trace=100%/redactable=on/verbose=on-8 1.92k ± 0% 1.81k ± 1% -5.69% (p=0.000 n=10+10) Tracing/3node/insert/trace=100%/redactable=on/verbose=on-8 1.68k ± 1% 1.56k ± 1% -7.45% (p=0.000 n=10+10) ``` Resolves cockroachdb#71694 Release note: None
73405: server, tracing: Make traces redactable and default to "off" r=andreimatei a=dhartunian Multi-tenant deployments require tracing support in order to help customers and engineers debug query performance issues. This requires that trace information recorded on the KV layer be redacted prior to being transmitted to the tenant in order to guard against data leakage through the multi-tenant abstractions. This change makes tracing redactability configurable and adds a cluster setting to enable fine-grained redactability that we can use for multi-tenant deployments. The desired outcome is to preserve redactability for tenant traces while preserving a non-redactable code path for requests using the system tenant. This preserves existing functionality in production while keeping the current performance hit of redaction from propagating to non-tenant deployments. This PR brings `master` in parity with changes in `release-21.2` via #73117. Benchmarking this against master results in the following improvements. old.txt is using `master` @ 65920cf new.txt is using this commit @ 7c768ac Benchmark was run using `gceworker`. We gain back a significant portion of the performance hit: ``` david@davidh:~/go/src/github.com/cockroachdb/cockroach$ benchstat old.txt new.txt name old time/op new time/op delta Tracing/1node/scan/trace=100%/threshold=1ms-24 1.20ms ± 4% 1.15ms ± 2% -4.83% (p=0.000 n=10+9) Tracing/1node/insert/trace=100%/threshold=1ms-24 1.51ms ± 3% 1.40ms ± 2% -6.71% (p=0.000 n=10+10) Tracing/3node/scan/trace=100%/threshold=1ms-24 2.17ms ± 3% 2.10ms ± 4% -3.26% (p=0.003 n=10+10) Tracing/3node/insert/trace=100%/threshold=1ms-24 2.41ms ± 3% 2.30ms ± 3% -4.45% (p=0.000 n=8+9) name old alloc/op new alloc/op delta Tracing/1node/scan/trace=100%/threshold=1ms-24 115kB ± 1% 120kB ± 1% +4.39% (p=0.000 n=10+10) Tracing/1node/insert/trace=100%/threshold=1ms-24 149kB ± 1% 152kB ± 0% +1.92% (p=0.000 n=10+9) Tracing/3node/scan/trace=100%/threshold=1ms-24 276kB ± 3% 282kB ± 3% +2.18% (p=0.035 n=10+10) Tracing/3node/insert/trace=100%/threshold=1ms-24 265kB ± 1% 264kB ± 2% ~ (p=0.963 n=8+9) name old allocs/op new allocs/op delta Tracing/1node/scan/trace=100%/threshold=1ms-24 1.68k ± 0% 1.47k ± 0% -12.16% (p=0.000 n=9+9) Tracing/1node/insert/trace=100%/threshold=1ms-24 2.05k ± 0% 1.76k ± 0% -14.46% (p=0.000 n=10+8) Tracing/3node/scan/trace=100%/threshold=1ms-24 3.10k ± 0% 2.85k ± 0% -8.21% (p=0.000 n=10+10) Tracing/3node/insert/trace=100%/threshold=1ms-24 3.00k ± 0% 2.66k ± 0% -11.30% (p=0.000 n=9+10) ``` Resolves #71694 Release note: None Resolves #71694 Release note: None Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com> Co-authored-by: David Hartunian <davidh@cockroachlabs.com>
Prior commits added the ability to set a flag on the Tracer to determine whether Span's should be marked redactable or not. This commit adds a cluster setting that will enable fine-grained trace redactability. The intended use is for multi-tenant deployments where KV traces should be redacted prior to being returned to tenants. Enabling the flag will enable fine-grained redaction so that tenant operators can continue to debug their queries via redacted kv trace information instead of having that information stripped out. The desired outcome is to preserve redactability for tenant traces while preserving a non-redactable code path for requests using the system tenant. This preserves existing functionality in production while keeping the current performance hit of fine grained redaction from propagating to non-tenant deployments. Benchmarking this against master results in the following improvements. old.txt is using `master` @ 65920cf new.txt is using this commit @ 7c768ac Benchmark was run using `gceworker`. We gain back a significant portion of the performance hit: ``` david@davidh:~/go/src/github.com/cockroachdb/cockroach$ benchstat old.txt new.txt name old time/op new time/op delta Tracing/1node/scan/trace=100%/threshold=1ms-24 1.20ms ± 4% 1.15ms ± 2% -4.83% (p=0.000 n=10+9) Tracing/1node/insert/trace=100%/threshold=1ms-24 1.51ms ± 3% 1.40ms ± 2% -6.71% (p=0.000 n=10+10) Tracing/3node/scan/trace=100%/threshold=1ms-24 2.17ms ± 3% 2.10ms ± 4% -3.26% (p=0.003 n=10+10) Tracing/3node/insert/trace=100%/threshold=1ms-24 2.41ms ± 3% 2.30ms ± 3% -4.45% (p=0.000 n=8+9) name old alloc/op new alloc/op delta Tracing/1node/scan/trace=100%/threshold=1ms-24 115kB ± 1% 120kB ± 1% +4.39% (p=0.000 n=10+10) Tracing/1node/insert/trace=100%/threshold=1ms-24 149kB ± 1% 152kB ± 0% +1.92% (p=0.000 n=10+9) Tracing/3node/scan/trace=100%/threshold=1ms-24 276kB ± 3% 282kB ± 3% +2.18% (p=0.035 n=10+10) Tracing/3node/insert/trace=100%/threshold=1ms-24 265kB ± 1% 264kB ± 2% ~ (p=0.963 n=8+9) name old allocs/op new allocs/op delta Tracing/1node/scan/trace=100%/threshold=1ms-24 1.68k ± 0% 1.47k ± 0% -12.16% (p=0.000 n=9+9) Tracing/1node/insert/trace=100%/threshold=1ms-24 2.05k ± 0% 1.76k ± 0% -14.46% (p=0.000 n=10+8) Tracing/3node/scan/trace=100%/threshold=1ms-24 3.10k ± 0% 2.85k ± 0% -8.21% (p=0.000 n=10+10) Tracing/3node/insert/trace=100%/threshold=1ms-24 3.00k ± 0% 2.66k ± 0% -11.30% (p=0.000 n=9+10) ``` Resolves cockroachdb#71694 Release note: None
Prior commits added the ability to set a flag on the Tracer to determine whether Span's should be marked redactable or not. This commit adds a cluster setting that will enable fine-grained trace redactability. The intended use is for multi-tenant deployments where KV traces should be redacted prior to being returned to tenants. Enabling the flag will enable fine-grained redaction so that tenant operators can continue to debug their queries via redacted kv trace information instead of having that information stripped out. The desired outcome is to preserve redactability for tenant traces while preserving a non-redactable code path for requests using the system tenant. This preserves existing functionality in production while keeping the current performance hit of fine grained redaction from propagating to non-tenant deployments. Benchmarking this against master results in the following improvements. old.txt is using `master` @ 65920cf new.txt is using this commit @ 7c768ac Benchmark was run using `gceworker`. We gain back a significant portion of the performance hit: ``` david@davidh:~/go/src/github.com/cockroachdb/cockroach$ benchstat old.txt new.txt name old time/op new time/op delta Tracing/1node/scan/trace=100%/threshold=1ms-24 1.20ms ± 4% 1.15ms ± 2% -4.83% (p=0.000 n=10+9) Tracing/1node/insert/trace=100%/threshold=1ms-24 1.51ms ± 3% 1.40ms ± 2% -6.71% (p=0.000 n=10+10) Tracing/3node/scan/trace=100%/threshold=1ms-24 2.17ms ± 3% 2.10ms ± 4% -3.26% (p=0.003 n=10+10) Tracing/3node/insert/trace=100%/threshold=1ms-24 2.41ms ± 3% 2.30ms ± 3% -4.45% (p=0.000 n=8+9) name old alloc/op new alloc/op delta Tracing/1node/scan/trace=100%/threshold=1ms-24 115kB ± 1% 120kB ± 1% +4.39% (p=0.000 n=10+10) Tracing/1node/insert/trace=100%/threshold=1ms-24 149kB ± 1% 152kB ± 0% +1.92% (p=0.000 n=10+9) Tracing/3node/scan/trace=100%/threshold=1ms-24 276kB ± 3% 282kB ± 3% +2.18% (p=0.035 n=10+10) Tracing/3node/insert/trace=100%/threshold=1ms-24 265kB ± 1% 264kB ± 2% ~ (p=0.963 n=8+9) name old allocs/op new allocs/op delta Tracing/1node/scan/trace=100%/threshold=1ms-24 1.68k ± 0% 1.47k ± 0% -12.16% (p=0.000 n=9+9) Tracing/1node/insert/trace=100%/threshold=1ms-24 2.05k ± 0% 1.76k ± 0% -14.46% (p=0.000 n=10+8) Tracing/3node/scan/trace=100%/threshold=1ms-24 3.10k ± 0% 2.85k ± 0% -8.21% (p=0.000 n=10+10) Tracing/3node/insert/trace=100%/threshold=1ms-24 3.00k ± 0% 2.66k ± 0% -11.30% (p=0.000 n=9+10) ``` Resolves cockroachdb#71694 Release note: None
Describe the problem
In #58610 we discussed trace data sent to tenants from the KV layer. This data
was not redactable and there was a concern that tenants could learn privileged
information (such as keys belonging to another tenant, for example).
We reacted to this by adding redactability for traces (#70562). Unfortunately,
this introduced inacceptably high overhead in the case in which tracing is
enabled across the board (as will be the case with the
sql.trace.*.enable_threshold
cluster setting, which is known to see someusage) and, on
release-21.2
, we switched to a blunter approach with degradedobservability, #71606. This degraded observability is tracked in #58610 and is
out of scope here as it does not affect master (at the time of writing).
Details on the regression can be found in
#71610. Concretely, in a simple
"select single row" end-to-end local benchmark, we see approximately a 20%
regression in average request latency.
We have to either mitigate the regression or revert the work but are then
forced to add the redactability again in some way, as for better or worse
traces as observed by tenants are an invaluable tool in CockroachDB support
today.
Epic CRDB-2574
The text was updated successfully, but these errors were encountered: