release-22.2: sql: output RU estimate for EXPLAIN ANALYZE on tenants #93179

Merged

Commits on Dec 12, 2022

  1. **sql: surface query request units consumed by network egress**

    This commit adds a top-level field to the output of `EXPLAIN ANALYZE`
    that shows the estimated number of RUs that would be consumed due to
    network egress to the client. The estimate is obtained by buffering
    each value from the query result in text format and then measuring the
    size of the buffer before resetting it. The result is used to get the
    RU consumption with the tenant cost config's `PGWireEgressCost` method.
    
    **sql: surface query request units consumed due to cpu usage**
    
    This commit adds the ability for clients to estimate the number of RUs
    consumed by a query due to CPU usage. The estimate keeps a moving average
    of the CPU usage for the entire tenant process and uses it to predict what
    the CPU usage *would* be if the query weren't running. That prediction is
    then compared against the actual measured CPU usage during the query's
    execution to get the estimate. For local flows this is
    done at the `connExecutor` level; for remote flows this is handled by the
    last outbox on the node (which gathers and sends the flow's metadata).
    The resulting RU estimate is added to the existing estimate from network
    egress and displayed in the output of `EXPLAIN ANALYZE`.
    
    **sql: surface query request units consumed by IO**
    
    This commit adds tracking for request units consumed by IO operations
    for all execution operators that perform KV operations. The corresponding
    RU count is recorded in the span and later aggregated with the RU consumption
    due to network egress and CPU usage. The resulting query RU consumption
    estimate is visible in the output of `EXPLAIN ANALYZE`.
    
    **multitenantccl: add sanity testing for ru estimation**
    
    This commit adds a sanity test for the RU estimates produced by running
    queries with `EXPLAIN ANALYZE` on a tenant. The test runs each test query
    several times with `EXPLAIN ANALYZE`, then runs all test queries without
    `EXPLAIN ANALYZE` and compares the resulting actual RU measurement to the
    aggregated estimates. For now, this test is disabled during builds because
    it is flaky in the presence of background activity. For this reason it
    should only be used as a manual sanity test.
    
    Informs cockroachdb#74441
    
    Release note (sql change): Added an estimate for the number of request units
    consumed by a query to the output of `EXPLAIN ANALYZE` for tenant sessions.
    DrewKimball committed Dec 12, 2022
    fcd049d
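
The three components described in the commit above (network egress, CPU, and IO) can be illustrated with a small standalone sketch. This is not the code from the PR: `costConfig`, `egressMeter`, `cpuBaseline`, and `estimateRU` are hypothetical names, the per-unit prices are made up, and the choice of an exponentially weighted moving average and of a clamped subtraction are assumptions, since the commit message only says "moving average" and "compared against". The real conversion from egress bytes to RUs goes through the tenant cost config's `PGWireEgressCost` method, which this sketch does not call.

```go
package main

import (
	"bytes"
	"fmt"
)

// costConfig holds hypothetical per-unit RU prices. The real values live in
// the tenant cost config and are not given in the commit message.
type costConfig struct {
	ruPerEgressByte float64
	ruPerCPUSecond  float64
}

// egressMeter buffers each result value in text form, records the buffer
// size, and resets the buffer, so only a running byte count is retained.
type egressMeter struct {
	buf   bytes.Buffer
	total int64
}

func (m *egressMeter) add(textValue string) {
	m.buf.WriteString(textValue)
	m.total += int64(m.buf.Len())
	m.buf.Reset()
}

// cpuBaseline tracks an exponentially weighted moving average of the
// process-wide CPU usage, in CPU-seconds per wall-clock second.
// (Assumption: the commit does not say which kind of moving average is used.)
type cpuBaseline struct {
	alpha  float64 // weight of the newest sample, in (0, 1]
	avg    float64
	seeded bool
}

func (b *cpuBaseline) observe(sample float64) {
	if !b.seeded {
		b.avg, b.seeded = sample, true
		return
	}
	b.avg = b.alpha*sample + (1-b.alpha)*b.avg
}

// queryCPUSeconds subtracts the CPU the process was expected to use anyway
// (baseline * wall time) from the CPU measured while the query ran.
func (b *cpuBaseline) queryCPUSeconds(measuredCPU, wallSeconds float64) float64 {
	extra := measuredCPU - b.avg*wallSeconds
	if extra < 0 {
		return 0 // clamp: measured usage fell below the expected baseline
	}
	return extra
}

// estimateRU sums the egress, CPU, and IO components into one estimate.
func estimateRU(cfg costConfig, egressBytes int64, cpuSeconds, ioRU float64) float64 {
	return float64(egressBytes)*cfg.ruPerEgressByte + cpuSeconds*cfg.ruPerCPUSecond + ioRU
}

func main() {
	var m egressMeter
	for _, v := range []string{"1", "hello"} {
		m.add(v)
	}

	b := cpuBaseline{alpha: 0.5}
	b.observe(0.25)
	b.observe(0.75)
	cpu := b.queryCPUSeconds(3.0, 2.0)

	cfg := costConfig{ruPerEgressByte: 1.0 / 1024, ruPerCPUSecond: 100}
	fmt.Printf("egress bytes=%d cpu seconds=%.2f RU=%.4f\n",
		m.total, cpu, estimateRU(cfg, m.total, cpu, 5))
}
```

Clamping at zero is a design choice of the sketch: when background load drops while the query runs, the measured usage can fall below the baseline, and a negative RU contribution would be meaningless. The same sensitivity to background activity is what the commit cites as the reason the sanity test is flaky.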
  2. **sql: add cluster setting to disable RU estimation**

    This patch adds a cluster setting, `sql.tenant_ru_estimation.enabled`,
    which is used to determine whether tenants collect an RU estimate for
    queries run with `EXPLAIN ANALYZE`. This is an escape hatch so that the
    RU estimation logic can be more safely backported.
    
    Informs cockroachdb#74441
    
    Release note: None
    DrewKimball committed Dec 12, 2022
    8d8941e
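
The escape hatch added by the second commit amounts to a boolean gate in front of the estimation work. The sketch below shows that shape only; `settings`, `explainAnalyzeOutput`, and `runExplainAnalyze` are hypothetical names that do not appear in the PR, and the plain struct field stands in for the real cluster setting `sql.tenant_ru_estimation.enabled`.

```go
package main

import "fmt"

// settings is a hypothetical stand-in for the cluster settings a tenant
// reads; the field mirrors sql.tenant_ru_estimation.enabled.
type settings struct {
	ruEstimationEnabled bool
}

// explainAnalyzeOutput models only the part of the output relevant here.
// estimatedRU is nil when estimation is disabled.
type explainAnalyzeOutput struct {
	estimatedRU *float64
}

// runExplainAnalyze invokes computeRU only when the setting is on, so a
// disabled setting skips the estimation logic entirely.
func runExplainAnalyze(s settings, computeRU func() float64) explainAnalyzeOutput {
	var out explainAnalyzeOutput
	if s.ruEstimationEnabled {
		ru := computeRU()
		out.estimatedRU = &ru
	}
	return out
}

func main() {
	compute := func() float64 { return 12.5 }
	for _, enabled := range []bool{false, true} {
		out := runExplainAnalyze(settings{ruEstimationEnabled: enabled}, compute)
		if out.estimatedRU == nil {
			fmt.Printf("enabled=%v: no RU estimate collected\n", enabled)
			continue
		}
		fmt.Printf("enabled=%v: estimated RU=%.1f\n", enabled, *out.estimatedRU)
	}
}
```

On a cluster, the setting itself would presumably be toggled with the usual statement form, `SET CLUSTER SETTING sql.tenant_ru_estimation.enabled = false;`. Placing the check before any estimation work runs matches the stated goal of making the backport safer: with the setting off, none of the new code paths in the sketch execute.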