Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: output RU estimate for EXPLAIN ANALYZE on tenants #89256

Merged
merged 1 commit into from
Nov 18, 2022

Conversation

DrewKimball
Copy link
Collaborator

@DrewKimball DrewKimball commented Oct 4, 2022

sql: surface query request units consumed by network egress
This commit adds a top-level field to the output of EXPLAIN ANALYZE
that shows the estimated number of RUs that would be consumed due to
network egress to the client. The estimate is obtained by measuring
the in-memory size of the query result, and passing that to the
tenant cost config's PGWireEgressCost method.

sql: surface query request units consumed due to cpu usage
This commit adds the ability for clients to estimate the number of RUs
consumed by a query due to CPU usage. This is accomplished by keeping a
moving average of the CPU usage for the entire tenant process, then using
that to obtain an estimate for what the CPU usage would be if the query
wasn't running. This is then compared against the actual measured CPU usage
during the query's execution to get the estimate. For local flows this is
done at the connExecutor level; for remote flows this is handled by the
last outbox on the node (which gathers and sends the flow's metadata).
The resulting RU estimate is added to the existing estimate from network
egress and displayed in the output of EXPLAIN ANALYZE.

sql: surface query request units consumed by IO
This commit adds tracking for request units consumed by IO operations
for all execution operators that perform KV operations. The corresponding
RU count is recorded in the span and later aggregated with the RU consumption
due to network egress and CPU usage. The resulting query RU consumption
estimate is visible in the output of EXPLAIN ANALYZE.

multitenantccl: add sanity testing for ru estimation
This commit adds a sanity test for the RU estimates produced by running
queries with EXPLAIN ANALYZE on a tenant. The test runs each test query
several times with EXPLAIN ANALYZE, then runs all test queries without
EXPLAIN ANALYZE and compares the resulting actual RU measurement to the
aggregated estimates.

Informs #74441

Release note (sql change): Added an estimate for the number of request units
consumed by a query to the output of EXPLAIN ANALYZE for tenant sessions.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@DrewKimball DrewKimball force-pushed the estimate-ru branch 5 times, most recently from 0bc21b1 to 238ae9d Compare October 18, 2022 04:58
@DrewKimball DrewKimball force-pushed the estimate-ru branch 2 times, most recently from c4cad32 to 29969e8 Compare October 25, 2022 08:18
@DrewKimball DrewKimball changed the title [wip] output RU estimate for IO and network egress sql: output RU estimate for EXPLAIN ANALYZE on tenants Oct 25, 2022
@DrewKimball DrewKimball marked this pull request as ready for review October 25, 2022 08:25
@DrewKimball DrewKimball requested a review from a team as a code owner October 25, 2022 08:25
@DrewKimball DrewKimball requested a review from a team October 25, 2022 08:25
@DrewKimball DrewKimball requested review from a team as code owners October 25, 2022 08:25
@DrewKimball DrewKimball requested a review from a team October 25, 2022 08:25
@DrewKimball DrewKimball requested a review from a team as a code owner October 25, 2022 08:25
@DrewKimball
Copy link
Collaborator Author

TODO: squash commits before merging.

@dhartunian dhartunian removed the request for review from a team October 25, 2022 17:20
@DrewKimball DrewKimball removed request for a team October 25, 2022 22:24
@DrewKimball DrewKimball force-pushed the estimate-ru branch 2 times, most recently from 22c2ab2 to 26b0658 Compare October 27, 2022 09:40
@DrewKimball DrewKimball requested review from a team and removed request for a team October 27, 2022 09:40
@DrewKimball

This comment was marked as resolved.

@DrewKimball
Copy link
Collaborator Author

Alright, the only test failures I see now are this:

    lint_test.go:91: 
        pkg/server/stop_trigger.go:46:3: github.com/cockroachdb/cockroach/pkg/util/log.Infof format %s has arg r of wrong type github.com/cockroachdb/cockroach/pkg/server.ShutdownRequest
    --- FAIL: TestLint/TestRoachVet (144.88s)

and this:

[17:46:43][Run unit tests] FAIL: //pkg/kv/kvclient/kvcoord:kvcoord_disallowed_imports_test (see /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/testlogs/pkg/kv/kvclient/kvcoord/kvcoord_disallowed_imports_test/test.log)
[17:46:43][Run unit tests] INFO: From Testing //pkg/kv/kvclient/kvcoord:kvcoord_disallowed_imports_test:
[17:46:43][Run unit tests] ==================== Test output for //pkg/kv/kvclient/kvcoord:kvcoord_disallowed_imports_test:
[17:46:43][Run unit tests] ERROR: //pkg/kv/kvclient/kvcoord:kvcoord imports //pkg/storage:storage
[17:46:43][Run unit tests] 	check: bazel query 'somepath(//pkg/kv/kvclient/kvcoord:kvcoord, //pkg/storage:storage)'
[17:46:43][Run unit tests] ================================================================================

both of which seem unrelated.

@DrewKimball DrewKimball force-pushed the estimate-ru branch 2 times, most recently from 071137b to 5c788b9 Compare October 28, 2022 20:07
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

Reviewed 1 of 13 files at r1, 1 of 16 files at r2, 5 of 27 files at r6, 24 of 24 files at r13, 18 of 18 files at r14, 16 of 16 files at r15, 3 of 3 files at r16, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball and @DrewKimball)


pkg/ccl/multitenantccl/tenantcostclient/tenant_side.go line 878 at r14 (raw file):

	c.mu.Lock()
	cpuAvg = c.mu.avgCPUPerSec
	c.mu.Unlock()

nit: I believe in one of the recent Go releases the price of the defer has gone done significantly, so we switched to mostly Unlocking in a defer in situations like this, even in performance sensitive code. But up to you.


pkg/multitenant/multitenantcpu/cpu_usage.go line 9 at r14 (raw file):

//     https://github.com/cockroachdb/cockroach/blob/master/licenses/CCL.txt

package multitenantcpu

Should this be somewhere in pkg/ccl subtree? Alternatively, should this use BSL license (similar to multitenantio)?


pkg/multitenant/multitenantcpu/cpu_usage.go line 63 at r14 (raw file):

// It is used for measuring tenant RU consumption.
func GetCPUSeconds(ctx context.Context) (cpuSecs float64) {
	userTimeMillis, sysTimeMillis, err := status.GetCPUTime(ctx)

GetCPUTime doesn't actually use ctx argument. We could remove it which will reduce the amount of plumbing needed here.


pkg/sql/distsql_running.go line 617 at r13 (raw file):

			// Only collect the network egress estimate for a tenant that is running
			// EXPLAIN ANALYZE, since the overhead is non-negligible.
			recv.isTenantExplainAnalyze = instrumentation.collectExecStats &&

nit: I think the second part of the AND is sufficient, no?


pkg/sql/colflow/stats.go line 234 at r15 (raw file):

		execstats.PopulateKVMVCCStats(&s.KV, &scanStats)
		if vsc.isTenant {
			// Only set the RUs consumed if this is a tenant.

nit: would it be ok to set it for single tenant deployments too (since RuConsumed would be zero)? This would then be ignored in instrumentation.go.


pkg/sql/colflow/vectorized_flow.go line 506 at r14 (raw file):

					MaxMemUsage:  optional.MakeUint(uint64(flowCtx.Mon.MaximumBytes())),
					MaxDiskUsage: optional.MakeUint(uint64(flowCtx.DiskMonitor.MaximumBytes())),
					ConsumedRU:   optional.MakeUint(uint64(cpuStatsCollector.EndCollection(ctx))),

We also need to make a similar change to flowinfra/outbox.go for the row-based flows.


pkg/sql/colflow/vectorized_flow.go line 750 at r14 (raw file):

	stream *execinfrapb.StreamEndpointSpec,
	factory coldata.ColumnFactory,
	getStats func(ctx2 context.Context) []*execinfrapb.ComponentStats,

nit: can omit the variable name altogether.


pkg/sql/execstats/stats.go line 77 at r15 (raw file):

	// RuConsumed is the number of RUs that were consumed during the course of a
	// scan.
	RuConsumed uint64

nit: maybe s/RuConsumed/ConsumedRU/ for symmetry with the proto?


pkg/ccl/multitenantccl/tenantcostclient/query_ru_estimate_test.go line 67 at r16 (raw file):

	}

	params.DisableDefaultTestTenant = true

nit: this is already set above.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! I'm not so familiar with this code so someone else should approve, but I just had a couple of small questions.

Reviewed 1 of 13 files at r1, 1 of 16 files at r2, 5 of 27 files at r6, 24 of 24 files at r13, 18 of 18 files at r14, 16 of 16 files at r15, 3 of 3 files at r16, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @DrewKimball, and @yuzefovich)


pkg/multitenant/multitenantcpu/cpu_usage.go line 52 at r16 (raw file):

	cpuDelta := GetCPUSeconds(ctx) - h.startCPU
	timeElapsed := timeutil.Since(h.startTime)
	expectedCPUDelta := timeElapsed.Seconds() * h.avgCPUPerSecond

I guess this won't work well when there are many concurrent queries?


pkg/sql/rowexec/inverted_joiner.go line 782 at r15 (raw file):

	}
	if ij.FlowCtx.Cfg.TenantCostController != nil {
		// Don't populate the consumed RU unless this is a tenant.

Would it be bad to collect this for non-tenants? I know someone was asking if it would be possible to surface RU estimates even for non-tenants so users could know what their workload might cost on CRDB Serverless. Then you could also avoid a lot of plumbing.

(maybe the same question @yuzefovich had)


pkg/ccl/multitenantccl/tenantcostclient/query_ru_estimate_test.go line 43 at r16 (raw file):

// consumption is within reasonable distance from the actual measured RUs for
// the tenant.
func TestEstimateQueryRUConsumption(t *testing.T) {

Nice test!


pkg/ccl/multitenantccl/tenantcostclient/query_ru_estimate_test.go line 136 at r16 (raw file):

				if strings.Contains(val, "estimated RUs consumed") {
					substr := strings.Split(val, " ")
					if len(substr) == 4 {

if this is false should you return an error?

@DrewKimball DrewKimball force-pushed the estimate-ru branch 2 times, most recently from d1ba637 to 783dcb7 Compare November 8, 2022 23:23
Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @rytaft, and @yuzefovich)


pkg/ccl/multitenantccl/tenantcostclient/tenant_side.go line 796 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

This should go above the lock, since we want to release the lock as quickly as possible.

Done.


pkg/ccl/multitenantccl/tenantcostclient/tenant_side.go line 878 at r14 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I believe in one of the recent Go releases the price of the defer has gone done significantly, so we switched to mostly Unlocking in a defer in situations like this, even in performance sensitive code. But up to you.

Done. That's good to know


pkg/multitenant/multitenantcpu/cpu_usage.go line 9 at r14 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Should this be somewhere in pkg/ccl subtree? Alternatively, should this use BSL license (similar to multitenantio)?

Oops, just copied that license blindly. I'm not really sure of where things are supposed to be located WRT multitenant vs multitenantccl but for now I've just changed the license. Done.


pkg/multitenant/multitenantcpu/cpu_usage.go line 63 at r14 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

GetCPUTime doesn't actually use ctx argument. We could remove it which will reduce the amount of plumbing needed here.

That's a good point. How would you handle the logging here without the context though? Returning the error would just run into the same problem further up the stack. I guess an option would be to just return 0 on error, but that seems problematic.


pkg/multitenant/multitenantcpu/cpu_usage.go line 52 at r16 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I guess this won't work well when there are many concurrent queries?

Yes, I'll add a comment. Done.


pkg/sql/distsql_running.go line 617 at r13 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I think the second part of the AND is sufficient, no?

Done.


pkg/sql/colflow/stats.go line 234 at r15 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: would it be ok to set it for single tenant deployments too (since RuConsumed would be zero)? This would then be ignored in instrumentation.go.

Good point, that cleans up some plumbing I had to do. Done.


pkg/sql/colflow/vectorized_flow.go line 506 at r14 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

We also need to make a similar change to flowinfra/outbox.go for the row-based flows.

Done. I've moved the cpu usage collector for remote flows to the FlowCtx since I didn't see a good way to propagate it to the row-based outbox otherwise.


pkg/sql/colflow/vectorized_flow.go line 750 at r14 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: can omit the variable name altogether.

Done.


pkg/sql/execstats/stats.go line 77 at r15 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: maybe s/RuConsumed/ConsumedRU/ for symmetry with the proto?

Done.


pkg/sql/rowexec/inverted_joiner.go line 782 at r15 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Would it be bad to collect this for non-tenants? I know someone was asking if it would be possible to surface RU estimates even for non-tenants so users could know what their workload might cost on CRDB Serverless. Then you could also avoid a lot of plumbing.

(maybe the same question @yuzefovich had)

Good point, Done.


pkg/ccl/multitenantccl/tenantcostclient/query_ru_estimate_test.go line 43 at r16 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Nice test!

Thanks! Took some doing to figure out how to make it run.


pkg/ccl/multitenantccl/tenantcostclient/query_ru_estimate_test.go line 67 at r16 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: this is already set above.

Done.


pkg/ccl/multitenantccl/tenantcostclient/query_ru_estimate_test.go line 136 at r16 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

if this is false should you return an error?

Done.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 16 files at r2, 31 of 31 files at r17, 21 of 21 files at r18, 14 of 14 files at r19, 5 of 5 files at r20, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @DrewKimball, and @rytaft)


pkg/ccl/multitenantccl/tenantcostclient/tenant_side.go line 878 at r18 (raw file):

	c.mu.Lock()
	defer c.mu.Unlock()
	cpuAvg = c.mu.avgCPUPerSec

nit: could remove the named return argument and just return the mutex-protected thing.


pkg/multitenant/multitenantcpu/cpu_usage.go line 63 at r14 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

That's a good point. How would you handle the logging here without the context though? Returning the error would just run into the same problem further up the stack. I guess an option would be to just return 0 on error, but that seems problematic.

Right, I didn't realize that we also use ctx in the current function, nvm.


pkg/sql/conn_executor.go line 16 at r18 (raw file):

	"context"
	"fmt"
	"github.com/cockroachdb/cockroach/pkg/multitenant/multitenantcpu"

nit: the ordering is off in the second commit.


pkg/sql/execinfra/flow_context.go line 17 at r18 (raw file):

import (
	"context"
	"github.com/cockroachdb/cockroach/pkg/multitenant/multitenantcpu"

nit: ditto for the second commit.

Copy link
Contributor

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball and @rytaft)


pkg/ccl/multitenantccl/tenantcostclient/tenant_side.go line 406 at r12 (raw file):

		// Keep track of an exponential moving average of CPU usage.
		avg := deltaCPU / deltaTime.Seconds()
		c.mu.Lock()

I'd move this code down into the lock that we're already taking, so that we don't have to acquire/release twice.


pkg/ccl/multitenantccl/tenantcostclient/tenant_side.go line 871 at r12 (raw file):

	}

	c.mu.Lock()

Don't we need to add RU here as well? Could it show up in an EXPLAIN ANALYZE of an EXPORT or some other statement?


pkg/multitenant/multitenantcpu/cpu_usage.go line 60 at r20 (raw file):

	expectedCPUDelta := timeElapsed.Seconds() * h.avgCPUPerSecond
	cpuUsageSeconds := cpuDelta - expectedCPUDelta
	if cpuUsageSeconds < 0 {

I think we should also bound the consumed CPU on the upper side by checking if cpuUsageSeconds exceeds timeElapsed.Seconds(). I don't believe that we do much parallel CPU work for a single query, and so if it's greater, it's almost always going to be due to other queries interfering with the CPU measurement. I'd rather bound it to reduce that possibility of gross error.


pkg/sql/rowexec/inverted_joiner.go line 782 at r15 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Good point, Done.

Note that we don't calculate RU for non-tenants today, so ConsumedRU should be zero in those cases.

Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @rytaft, and @yuzefovich)


pkg/ccl/multitenantccl/tenantcostclient/tenant_side.go line 406 at r12 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

I'd move this code down into the lock that we're already taking, so that we don't have to acquire/release twice.

Done.


pkg/ccl/multitenantccl/tenantcostclient/tenant_side.go line 871 at r12 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

Don't we need to add RU here as well? Could it show up in an EXPLAIN ANALYZE of an EXPORT or some other statement?

I added a TODO for that above, but you make a good point that we probably want to handle such cases now. It's not obvious to me how to propagate this information for a query. Do you think it would be ok to just disable RU consumption estimation for "non-simple" statements?


pkg/ccl/multitenantccl/tenantcostclient/tenant_side.go line 878 at r18 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: could remove the named return argument and just return the mutex-protected thing.

Done.


pkg/multitenant/multitenantcpu/cpu_usage.go line 63 at r14 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Right, I didn't realize that we also use ctx in the current function, nvm.

Done.


pkg/multitenant/multitenantcpu/cpu_usage.go line 60 at r20 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

I think we should also bound the consumed CPU on the upper side by checking if cpuUsageSeconds exceeds timeElapsed.Seconds(). I don't believe that we do much parallel CPU work for a single query, and so if it's greater, it's almost always going to be due to other queries interfering with the CPU measurement. I'd rather bound it to reduce that possibility of gross error.

Done.


pkg/sql/conn_executor.go line 16 at r18 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: the ordering is off in the second commit.

Ah, forgot I haven't set up crlfmt for goland on this laptop yet. Done.


pkg/sql/execinfra/flow_context.go line 17 at r18 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: ditto for the second commit.

Done.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 31 of 31 files at r17, 12 of 21 files at r18, 16 of 16 files at r21, 10 of 11 files at r22, 2 of 2 files at r23, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball, @DrewKimball, and @yuzefovich)

@DrewKimball
Copy link
Collaborator Author

Thanks for the reviews so far! I think the network RU estimation is broken for the row-wise path, since the types don't get set (the batch route directly uses the types from the batch). I'm also going to look into how complicated it'll be to account for external IOs as well, since it's not obvious how to prevent RU estimation for queries that perform external IO.

@DrewKimball
Copy link
Collaborator Author

Alright, I made a few changes that should hopefully tie up the last few loose ends for this PR.

  1. The network RU consumption estimation is no longer handled by the query's DistSQLReceiver; instead there is a pair of functions that hook into pgwire code that buffers the data, then measures its size and discards it.
  2. The planNodeToRowSource processor now propagates RU consumption for IOs. This allows RUs to be measured for mutations, e.g. inserts and deletes.
  3. The sanity test is now skipped to ensure it is only run manually. This is necessary because the test is very vulnerable to background activity, and will eventually flake if run as part of the test builds even if generous allowances are used.

I'm leaving two items for a future PR(s) for the sake of decreasing the burden of reviewing:

  1. RU consumption estimation for external IOs / non-vectorized plans.
  2. Adding a cluster setting to disable the RU estimation logic to give an escape hatch for the backport.

@DrewKimball
Copy link
Collaborator Author

Could someone take a look at the most recent changes? Sorry for all the updates, hopefully this is the end of it.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some nits, but nothing blocking.

Curious though what prompted to make the change number 1?

Reviewed 33 of 33 files at r24, 21 of 21 files at r25, 16 of 16 files at r26, 3 of 3 files at r27, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andy-kimball and @DrewKimball)


pkg/sql/plan_node_to_row_source.go line 160 at r26 (raw file):

	ctx = p.StartInternal(ctx, nodeName(p.node))
	p.params.ctx = ctx
	if execstats.ShouldCollectStats(ctx, p.FlowCtx.CollectStats) {

nit: let's move this to occur earlier. All other processors do this in their constructor, and I think we could do it in newPlanNodeToRowSource too. If not, after InitWithEvalCtx in Init should work.


pkg/sql/plan_node_to_row_source.go line 258 at r26 (raw file):

	scanStats := execstats.GetScanStats(p.Ctx, p.ExecStatsTrace)
	if scanStats.ConsumedRU == 0 {
		return nil

nit: what would happen if we returned a non-nil object with ConsumedRU of 0? I'd guess it'd be ignored later down the line. This would allow us to not add a non-nil check in the columnarizer. Up to you.


pkg/sql/pgwire/conn.go line 1418 at r24 (raw file):

	var conv sessiondatapb.DataConversionConfig
	buf := newWriteBuffer(nil /* byteCount */)

nit: we could reuse this writeBuffer between different invocations to save on some of the allocations. I'm thinking about introducing a struct that would hold on the write buffer and would export two methods (i.e. implement an interface defined in sql package). We'd probably then need to inject a constructor method into sql package. However, the write buffer already has some scratch space that the allocations might not happen, and this might not be a big deal after all. Up to you.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 33 of 33 files at r24, 21 of 21 files at r25, 16 of 16 files at r26, 3 of 3 files at r27, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andy-kimball and @DrewKimball)

This commit adds a top-level field to the output of `EXPLAIN ANALYZE`
that shows the estimated number of RUs that would be consumed due to
network egress to the client. The estimate is obtained by buffering
each value from the query result in text format and then measuring the
size of the buffer before resetting it. The result is used to get the
RU consumption with the tenant cost config's `PGWireEgressCost` method.

**sql: surface query request units consumed due to cpu usage**

This commit adds the ability for clients to estimate the number of RUs
consumed by a query due to CPU usage. This is accomplished by keeping a
moving average of the CPU usage for the entire tenant process, then using
that to obtain an estimate for what the CPU usage *would* be if the query
wasn't running. This is then compared against the actual measured CPU usage
during the query's execution to get the estimate. For local flows this is
done at the `connExecutor` level; for remote flows this is handled by the
last outbox on the node (which gathers and sends the flow's metadata).
The resulting RU estimate is added to the existing estimate from network
egress and displayed in the output of `EXPLAIN ANALYZE`.

**sql: surface query request units consumed by IO**

This commit adds tracking for request units consumed by IO operations
for all execution operators that perform KV operations. The corresponding
RU count is recorded in the span and later aggregated with the RU consumption
due to network egress and CPU usage. The resulting query RU consumption
estimate is visible in the output of `EXPLAIN ANALYZE`.

**multitenantccl: add sanity testing for ru estimation**

This commit adds a sanity test for the RU estimates produced by running
queries with `EXPLAIN ANALYZE` on a tenant. The test runs each test query
several times with `EXPLAIN ANALYZE`, then runs all test queries without
`EXPLAIN ANALYZE` and compares the resulting actual RU measurement to the
aggregated estimates. For now, this test is disabled during builds because
it is flaky in the presence of background activity. For this reason it
should only be used as a manual sanity test.

Informs cockroachdb#74441

Release note (sql change): Added an estimate for the number of request units
consumed by a query to the output of `EXPLAIN ANALYZE` for tenant sessions.
Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that a NPE was possible when the row implementation was used because the DistSQLReceiver isn't set up with the column types of the queried rows for EXPLAIN ANALYZE. I could have changed the function interface to also pass a types slice, but it seemed safer to explicitly set everything up for the purpose of buffering and counting bytes to ensure no more nasty surprises.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andy-kimball and @yuzefovich)


pkg/sql/plan_node_to_row_source.go line 160 at r26 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: let's move this to occur earlier. All other processors do this in their constructor, and I think we could do it in newPlanNodeToRowSource too. If not, after InitWithEvalCtx in Init should work.

Moved it to InitWithOutput, since we need the FlowCtx. Done.


pkg/sql/plan_node_to_row_source.go line 258 at r26 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: what would happen if we returned a non-nil object with ConsumedRU of 0? I'd guess it'd be ignored later down the line. This would allow us to not add a non-nil check in the columnarizer. Up to you.

Yeah that would be fine, but the comment for ProcessorBaseNoHelper.ExecStatsForTrace says it can return nil, so it seems like we should do this anyway. Is there something special about the columnarizer so that it doesn't have to worry about that?


pkg/sql/pgwire/conn.go line 1418 at r24 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: we could reuse this writeBuffer between different invocations to save on some of the allocations. I'm thinking about introducing a struct that would hold on the write buffer and would export two methods (i.e. implement an interface defined in sql package). We'd probably then need to inject a constructor method into sql package. However, the write buffer already has some scratch space that the allocations might not happen, and this might not be a big deal after all. Up to you.

I like that idea, won't be significantly more complex and will save allocations. Done.


pkg/sql/rowexec/inverted_joiner.go line 782 at r15 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

Note that we don't calculate RU for non-tenants today, so ConsumedRU should be zero in those cases.

Done.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 7 files at r28, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andy-kimball and @DrewKimball)


pkg/sql/plan_node_to_row_source.go line 258 at r26 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Yeah that would be fine, but the comment for ProcessorBaseNoHelper.ExecStatsForTrace says it can return nil, so it seems like we should do this anyway. Is there something special about the columnarizer so that it doesn't have to worry about that?

Oh yeah, you're right, let's keep the nil check in. For some reason I assumed that the meaning of "Can return nil" was equivalent to "This field can be nil".

@DrewKimball
Copy link
Collaborator Author

TFTRs!

@DrewKimball
Copy link
Collaborator Author

bors r+

@craig
Copy link
Contributor

craig bot commented Nov 18, 2022

Build succeeded:

@DrewKimball
Copy link
Collaborator Author

Here's an example of what the output looks like:

root@localhost:26269/defaultdb> CREATE TABLE xy (x INT, y INT);

CREATE TABLE


Time: 133ms total (execution 132ms / network 1ms)

root@localhost:26269/defaultdb> EXPLAIN ANALYZE INSERT INTO xy (SELECT t, t%27 FROM generate_series(1, 10000) g(t));

                 info
--------------------------------------
  planning time: 119ms
  execution time: 256ms
  distribution: local
  vectorized: true
  maximum memory usage: 100 KiB
  network usage: 0 B (0 messages)
  estimated RUs consumed: 20,553

  • insert
  │ nodes: n1
  │ actual row count: 1
  │ into: xy(x, y, rowid)
  │ auto commit
  │
  └── • render
      │
      └── • project set
          │ nodes: n1
          │ actual row count: 10,000
          │ estimated row count: 10
          │
          └── • emptyrow
                nodes: n1
                actual row count: 1
(24 rows)


Time: 376ms total (execution 376ms / network 0ms)

root@localhost:26269/defaultdb> EXPLAIN ANALYZE SELECT * FROM xy;

                        info
-----------------------------------------------------
  planning time: 293µs
  execution time: 110ms
  distribution: full
  vectorized: true
  rows read from KV: 10,000 (412 KiB, 1 gRPC calls)
  cumulative time spent in KV: 107ms
  maximum memory usage: 460 KiB
  network usage: 0 B (0 messages)
  estimated RUs consumed: 187

  • scan
    nodes: n1
    actual row count: 10,000
    KV time: 107ms
    KV contention time: 0µs
    KV rows read: 10,000
    KV bytes read: 412 KiB
    KV gRPC calls: 1
    estimated max memory allocated: 450 KiB
    missing stats
    table: xy@xy_pkey
    spans: FULL SCAN
(22 rows)


Time: 111ms total (execution 111ms / network 1ms)

root@localhost:26269/defaultdb> EXPLAIN ANALYZE SELECT count(*) FROM xy;

                        info
-----------------------------------------------------
  planning time: 295µs
  execution time: 4ms
  distribution: full
  vectorized: true
  rows read from KV: 10,000 (412 KiB, 1 gRPC calls)
  cumulative time spent in KV: 4ms
  maximum memory usage: 440 KiB
  network usage: 0 B (0 messages)
  estimated RUs consumed: 7

  • group (scalar)
  │ nodes: n1
  │ actual row count: 1
  │
  └── • scan
        nodes: n1
        actual row count: 10,000
        KV time: 4ms
        KV contention time: 0µs
        KV rows read: 10,000
        KV bytes read: 412 KiB
        KV gRPC calls: 1
        estimated max memory allocated: 430 KiB
        missing stats
        table: xy@xy_pkey
        spans: FULL SCAN
(26 rows)


Time: 5ms total (execution 5ms / network 0ms)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants