sql: measure CPU time spent during SQL execution #93952
Conversation
Nice! Can you share some testing results (manual, or maybe that single test that is disabled for tenants)?
Reviewed 12 of 12 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball)
-- commits
line 6 at r1:
nit: it'd be good to explicitly mention the goroutines that SQL execution uses but which are currently not tracked by this work (since those should be relatively cheap). You mentioned on slack the async streamer goroutines, but I think the main goroutine of the hash router (`colflow.HashRouter.Run`) as well as the outbox (`colrpc.Outbox.Run`) are not tracked.
pkg/sql/colflow/stats.go
line 111 at r1 (raw file):
bic.ctx = ctx
bic.stopwatch.Start()
bic.cpuStopwatch.Start()
nit: do you think it'd be worth to extract an abstraction in this package that would combine two of the stop watches?
pkg/sql/execinfrapb/component_stats.proto
line 156 at r1 (raw file):
optional util.optional.Uint consumed_r_u = 4 [(gogoproto.nullable) = false];
// CPU time spent executing the component.
optional util.optional.Duration c_p_u_time = 5 [(gogoproto.nullable) = false];
nit: you could use something like `optional util.optional.Duration cpu_time = 5 [(gogoproto.nullable) = false, (gogoproto.customname) = "CPUTime"]` to get the same name. This seems a bit cleaner, but up to you (ditto for `consumed_ru`).
pkg/util/timeutil/cpustopwatch.go
line 21 at r1 (raw file):
// CPUStopWatch is a utility stop watch for measuring CPU time spent by a
// component. It can be safely started and stopped multiple times, but is
// not safe to use concurrently. If CPUStopWatch is nil, all operations are
// no-ops.
Hm, I think we needed the concurrency safety at some point, and I can't recall a PR that changed that. Do you think this is no longer needed?
Here's `EXPLAIN ANALYZE` output for a simple join query:
[output table omitted from the page capture]
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: it'd be good to explicitly mention the goroutines that SQL execution uses but which are currently not tracked by this work (since those should be relatively cheap). You mentioned on slack the async streamer goroutines, but I think the main goroutine of the hash router (`colflow.HashRouter.Run`) as well as the outbox (`colrpc.Outbox.Run`) are not tracked.
Done.
pkg/sql/colflow/stats.go
line 111 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: do you think it'd be worth to extract an abstraction in this package that would combine two of the stop watches?
I embedded `cpuStopWatch` in `StopWatch`, let me know what you think.
pkg/sql/execinfrapb/component_stats.proto
line 156 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: you could use something like `optional util.optional.Duration cpu_time = 5 [(gogoproto.nullable) = false, (gogoproto.customname) = "CPUTime"]` to get the same name. This seems a bit cleaner, but up to you (ditto for `consumed_ru`).
Agreed, done.
pkg/util/timeutil/cpustopwatch.go
line 21 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Hm, I think we needed the concurrency safety at some point, and I can't recall a PR that changed that. Do you think this is no longer needed?
I didn't think it would be necessary for the vectorized stats collector, but didn't think it through very deeply. Either way, it's behind a mutex now.
It's nice to see that the execution time (the wall time) is slightly above the CPU time.
Reviewed 9 of 9 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)
pkg/sql/colflow/stats.go
line 111 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
I embedded `cpuStopWatch` in `StopWatch`, let me know what you think.
👍
Reviewed all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)
pkg/sql/opt/exec/explain/output.go
line 366 at r2 (raw file):
// AddCPUTime adds a top-level field for the cumulative CPU time spent by SQL
// execution.
func (ob *OutputBuilder) AddCPUTime(cpuTime time.Duration) {
Please add at least an e2e test verifying the output contains a valid CPU time.
Reviewed 9 of 21 files at r3, 14 of 14 files at r4, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @j82w)
-- commits
line 20 at r4:
I'm confused about the mention of `kvBatchFetcherHelper` here - there are no modifications around that struct in this PR, and the struct isn't concerned with the time measurement on master either.
pkg/sql/instrumentation.go
line 567 at r4 (raw file):
ob.AddNetworkStats(queryStats.NetworkMessages, queryStats.NetworkBytesSent)
ob.AddMaxDiskUsage(queryStats.MaxDiskUsage)
if !ih.containsMutation && ih.vectorized && grunning.Supported() {
Remind me please why we're only doing this for vectorized plans? Due to not measuring the total execution time of a row-by-row processor?
pkg/sql/colflow/stats.go
line 276 at r4 (raw file):
// In order to account for SQL CPU time, we have to subtract the CPU time
// spent while serving KV requests on a SQL goroutine.
cpuTime -= vsc.kvReader.GetKVCPUTime()
I wonder whether we should also expose this new `KVCPUTime` measurement.
Currently, on master we track a single total "execution" time of an operator, and then for the `TableReader` processor we show it as "KV time" whereas for all other readers as "execution time". Now that we also precisely track the actual "KV time", we could be more precise with the terminology. As is, with the current change, the "KV time" for ColBatchScans/TableReaders includes the "sql cpu time". I think it'd be better to hide the total execution time and, instead, show the KV time and the SQL CPU time (which, in theory, should add up to the total "execution time"). I'm under the impression that we can assume the `grunning` library is supported, right? WDYT? (There is some context in a comment in the code above.)
pkg/sql/colflow/stats.go
line 280 at r4 (raw file):
s.Exec.ExecTime.Set(time)
}
if cpuTime > 0 && grunning.Supported() {
nit: should we add an assertion that `cpuTime` is non-negative?
pkg/sql/execstats/traceanalyzer.go
line 382 at r4 (raw file):
}
for _, cpuTime := range a.nodeLevelStats.CPUTimeGroupedByNode {
nit: it might be good to modify `TestQueryLevelStatsAccumulate` to also test `CPUTime`.
pkg/sql/opt/exec/explain/output.go
line 366 at r2 (raw file):
Previously, j82w (Jake) wrote…
Please add at least an e2e test verifying the output contains a valid CPU time.
+1 we should have at least some sanity check that a positive CPU time value is measured.
Also, can the test introduced as skipped in #89256 be now unskipped? My understanding is that now the RU estimate should be more precise, or perhaps we could incorporate an adjustment to the error margin depending on the measured CPU time?
pkg/util/timeutil/cpustopwatch.go
line 21 at r4 (raw file):
// CPUStopWatch is a wrapper around cpuStopWatch that is safe to use
// concurrently. If CpuStopWatch is nil, all operations are no-ops and no
// CPU time is recorded.
nit: `s/CpuStopWatch/CPUStopWatch/`.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @j82w and @yuzefovich)
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I'm confused about the mention of `kvBatchFetcherHelper` here - there are no modifications around that struct in this PR, and the struct isn't concerned with the time measurement on master either.
Sorry, that's a remnant from an earlier iteration. Done.
pkg/sql/instrumentation.go
line 567 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Remind me please why we're only doing this for vectorized plans? Due to not measuring the total execution time of a row-by-row processor?
It's because we measure the CPU time using `vectorizedStatsCollector`s. We can still measure it for row-wise processors; they just need to be wrapped in a columnarizer + statsCollector. I've expanded the comment.
pkg/sql/colflow/stats.go
line 276 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I wonder whether we should also expose this new `KVCPUTime` measurement.
Currently, on master we track a single total "execution" time of an operator, and then for the `TableReader` processor we show it as "KV time" whereas for all other readers as "execution time". Now that we also precisely track the actual "KV time", we could be more precise with the terminology. As is, with the current change, the "KV time" for ColBatchScans/TableReaders includes the "sql cpu time". I think it'd be better to hide the total execution time and, instead, show the KV time and the SQL CPU time (which, in theory, should add up to the total "execution time"). I'm under the impression that we can assume the `grunning` library is supported, right? WDYT? (There is some context in a comment in the code above.)
`KVCPUTime` only tracks the local KV CPU time - any CPU time spent on a remote node / separate goroutine is omitted. It seems like it would be confusing to display that as-is, what do you think? Do you think I should change the name of the field to make that more clear?
pkg/sql/colflow/stats.go
line 280 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: should we add an assertion that `cpuTime` is non-negative?
Done.
pkg/sql/execstats/traceanalyzer.go
line 382 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: it might be good to modify `TestQueryLevelStatsAccumulate` to also test `CPUTime`.
Good catch, done.
pkg/sql/opt/exec/explain/output.go
line 366 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
+1 we should have at least some sanity check that a positive CPU time value is measured.
Also, can the test introduced as skipped in #89256 be now unskipped? My understanding is that now the RU estimate should be more precise, or perhaps we could incorporate an adjustment to the error margin depending on the measured CPU time?
Added an end-to-end test that verifies positive values for queries that output CPU time, and no output for queries with mutations.
This PR doesn't refactor the RU estimation to use `grunning` yet, and I think that should wait until I get it working with mutations.
pkg/util/timeutil/cpustopwatch.go
line 21 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: `s/CpuStopWatch/CPUStopWatch/`.
Done.
Reviewed 9 of 9 files at r5, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @j82w)
pkg/sql/instrumentation.go
line 567 at r4 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
It's because we measure the CPU time using `vectorizedStatsCollector`s. We can still measure it for row-wise processors; they just need to be wrapped in a columnarizer + statsCollector. I've expanded the comment.
Ok, so effectively we haven't added the necessary execution statistics collection for purely row-by-row plans since it's not important enough. Consider leaving a TODO still even if we don't plan to add it immediately.
pkg/sql/colflow/stats.go
line 276 at r4 (raw file):
It seems like it would be confusing to display that as-is, what do you think?
Yeah, fair point.
IIUC if we were to extend the CPU stop watches (that we are introducing in this PR to measure the KV on CPU time on the current goroutine) to also track the wall time, that wall time would be a proper KV time of an operator, right?
What I'm driving at is that `KV time` of `TableReader` on the DistSQL diagram now includes `sql cpu time`, which can be confusing. It'd be nice to clarify this. Before this change, we only measured the wall "execution" time of the operator (which would be shown as `execution time` for non-readers and `KV time` for readers). This change introduces `sql cpu time`, which measures on-core CPU time at the SQL level of an operator, excluding on-core KV time if any. This `sql cpu time` is included in the "execution time".
I think as the next step it'd be nice to explicitly measure the KV wall time a reader spent waiting for a KV request to evaluate (regardless of whether it was done remotely or locally, in the same goroutine or not). We would then show this new "KV time" statistic as `KV time` on the diagram while showing "execution time" as `execution time` for all operators (i.e. we'd change it for KV readers), and it'll be such that `KV (wall) time + sql cpu time <= execution (wall) time`. Does this make sense?
It doesn't have to be in this PR but seems like a nice improvement with very little work after this change is done.
pkg/sql/opt/exec/explain/output_test.go
line 179 at r5 (raw file):
serverutils.InitTestServerFactory(server.TestServerFactory)
tc := testcluster.StartTestCluster(t, numNodes, testClusterArgs)
nit: I think whenever we start a test cluster, we add
defer leaktest.AfterTest(t)()
defer log.Scope(t).Close(t)
to the top of the test.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @j82w and @yuzefovich)
pkg/sql/instrumentation.go
line 567 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Ok, so effectively we haven't added the necessary execution statistics collection for purely row-by-row plans since it's not important enough. Consider leaving a TODO still even if we don't plan to add it immediately.
Added a TODO to lift the restrictions on pure row-by-row plans and plans with mutations.
pkg/sql/colflow/stats.go
line 276 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
It seems like it would be confusing to display that as-is, what do you think?

Yeah, fair point.
IIUC if we were to extend the CPU stop watches (that we are introducing in this PR to measure the KV on-CPU time on the current goroutine) to also track the wall time, that wall time would be a proper KV time of an operator, right?
What I'm driving at is that `KV time` of `TableReader` on the DistSQL diagram now includes `sql cpu time`, which can be confusing. It'd be nice to clarify this. Before this change, we only measured the wall "execution" time of the operator (which would be shown as `execution time` for non-readers and `KV time` for readers). This change introduces `sql cpu time`, which measures on-core CPU time at the SQL level of an operator, excluding on-core KV time if any. This `sql cpu time` is included in the "execution time".
I think as the next step it'd be nice to explicitly measure the KV wall time a reader spent waiting for a KV request to evaluate (regardless of whether it was done remotely or locally, in the same goroutine or not). We would then show this new "KV time" statistic as `KV time` on the diagram while showing "execution time" as `execution time` for all operators (i.e. we'd change it for KV readers), and it'll be such that `KV (wall) time + sql cpu time <= execution (wall) time`. Does this make sense?
It doesn't have to be in this PR but seems like a nice improvement with very little work after this change is done.
That sounds like a good idea. I'll add a TODO.
pkg/sql/opt/exec/explain/output_test.go
line 179 at r5 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: I think whenever we start a test cluster, we add
defer leaktest.AfterTest(t)()
defer log.Scope(t).Close(t)
to the top of the test.
Done.
This commit adds tracking for CPU time spent during SQL execution. The CPU time is tracked at the operator granularity when statistics collection is enabled, similar to execution time. For now, the CPU time is only surfaced in the output of `EXPLAIN ANALYZE` variants. A future PR will add support for logging this value in the statement statistics.

Note that the main goroutines of the streamer, hash router, and outbox are not currently tracked by this work. However, it is expected that these should be relatively cheap, and shouldn't significantly impact the measurement.

Additionally, KV work is performed on a SQL goroutine in some cases (e.g. when there is a single-range request for a local range). This makes it necessary to track CPU time spent fulfilling KV requests on a SQL goroutine so it can be subtracted from the total measured CPU time. This logic is handled by the `cFetcher` and `rowFetcherStatCollector` for the operators that only perform reads.

Finally, because mutations do not record stats, they currently have no way to differentiate KV CPU time from SQL CPU time. For this reason, a plan that contains mutations will not output CPU time.

Informs: cockroachdb#87213

Release note (sql change): CPU time spent during SQL execution is now visible in the output of queries run with `EXPLAIN ANALYZE`. This measure does not include CPU time spent while serving KV requests, and CPU time is not shown for queries that perform mutations or for plans that aren't vectorized. This can be useful for diagnosing performance issues and optimizing SQL queries.
Reviewed 6 of 6 files at r6, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @j82w)
TFTR! bors r+
Build failed (retrying...)
Build succeeded
This commit adds tracking for CPU time spent during SQL execution. The CPU time is tracked at the operator granularity when statistics collection is enabled, similar to execution time.
For now, the CPU time is only surfaced in the output of `EXPLAIN ANALYZE` variants. A future PR will add support for logging this value in the statement statistics.

Informs: #87213
Release note (sql change): CPU time spent during SQL execution is now visible in the output of queries run with `EXPLAIN ANALYZE`. This measure does not include CPU time spent while serving KV requests. This can be useful for diagnosing performance issues and optimizing SQL queries.