*: conclusively attribute CPU usage to SQL queries and sessions #60508
Oh I can't wait to see this.

— Andy Woods, Group Product Manager, Cockroach Labs
tbg added a commit to tbg/cockroach that referenced this pull request on Feb 19, 2021:
Background profiling is an experimental patch that was floated as a way to get granular CPU usage information in golang/go#23458. I decided to dust off the Golang patch that implements this (https://go-review.googlesource.com/c/go/+/102755) and see for myself what you actually get. The output is something like this:

```
bgprof_test.go:51: ts: 12h37m15.540022251s labels: map[foo:bar] fn: hash/crc32.ieeeCLMUL
bgprof_test.go:51: ts: 12h37m15.54801909s  labels: map[foo:bar] fn: hash/crc32.ieeeCLMUL
bgprof_test.go:51: ts: 12h37m15.560015269s labels: map[foo:bar] fn: hash/crc32.ieeeCLMUL
bgprof_test.go:51: ts: 12h37m15.568026639s labels: map[foo:bar] fn: hash/crc32.ieeeCLMUL
bgprof_test.go:51: ts: 12h37m15.580029608s labels: map[foo:bar] fn: hash/crc32.ieeeCLMUL
...
```

If we used profiler labels to identify queries (similar to what's shown in cockroachdb#60508), I can see how you could build a profile from this stream and then use the tag breakdown of the profile to reason about the CPU allocated to each query. It seems wildly experimental and unergonomic, though. We would likely get roughly the same result, without as much bending over backwards on bleeding-edge patches, by relying on periodic foreground sampling at a lower frequency (`runtime.SetCPUProfileRate`) than the default 100Hz.

To run this PR (which you shouldn't need to do for anything, but still), clone `cockroachdb/go`, check out the [bgprof] branch, run `./make.bash` in `src`, and change your PATH so that `$(which go)` is `bin/go`.

Release note: None
When running `experiment.sh` with three nodes, there's always some missing-client and missing-server; I think this must be some DistSQL behavior breaking the context (since I do not see this issue with a single node and have disabled the local internal client optimization). TODO: return an error from `node.batch` in that case to see which SQL ops have it bubble up. Release note: None
What needs to be done to productionize the prototype:
@kevin-v-ngo this seems like something really useful. I wonder what you think of it?
This prototype demonstrates how, with a modest technical investment, we can make CPU utilization observable on a granular level (SQL sessions, statements, etc.) using Goroutine labels and the Go CPU profiler.
Before we dive into what the prototype does, here's the status quo. We already support profiler labels to some degree, though in an unpolished form that is hard to capitalize on. Concretely, we set profiler labels in `cockroach/pkg/sql/conn_executor_exec.go`, lines 105 to 121 at 401e9d4.
This snippet also shows that we have infrastructure that tells us when to enable Goroutine labels, so that we only do so if the process is actively being traced (with labels requested). This is relevant because Goroutine labels are far from free. The implementation in the runtime is not very allocation-efficient: you need to wrap a map in a Context, and the runtime makes an additional copy of this map internally. The internal copy cannot be avoided, but many of the surrounding allocations could be, with appropriate pooling and perhaps a custom implementation of `context.Context` that we use for the sole purpose of passing the map into the runtime. Either way, the take-away is that getting the overhead down will be important if we are to rely on this mechanism more, as it would be a shame to make the system a lot slower when trying to find out why it is so slow in the first place.
To round out the status quo: we also already take label-enabled profiles of all nodes in the cluster in debug.zip, and include a script to show a breakdown by tags. As far as I know, these have never really been used.
The goal: given a cluster in which one or many nodes experience high CPU usage, a TSE/SRE can easily identify:
Stretch goal (not included, but could be worth thinking about):
- it (at low profiling frequency), and
- statement/txn stats (maybe just store it before the hourly reset); it seems pretty powerful.
What this prototype does concretely is focus only on 1) and 2), with no attempt to optimize performance.
if there's any work done on behalf of the dummy marker, which would indicate that we were not properly labelling everywhere.
because I realized that significant chunks of CPU time are spent outside of `execStmt`, and most of the work in the prototype was to arrive at a system that captured enough of the work so that we could comfortably "deduce" that a query is responsible for overload.
Distributed SQL flows: if the profile stopped at the gateway but the CPU was spent on another node, it wouldn't be associated with the query. Note that all nodes need to actively profile (or at least attach goroutine labels) at around the same time, or the links will be lost.
bypasses the interceptors; it would need a more holistic solution before shipping.
Learnings from the prototype:

- We need to keep the `context.Context` and the actual labels in sync manually (the gRPC interceptors can only get the labels from the Context; they can't read them off the goroutine). Labels automatically (via the runtime) get propagated to child goroutines, but we often don't do the same thing for the `Context`, because doing so also propagates the cancellation. The prime example of this is `stopper.RunAsyncTask`, where we often derive from `context.Background` for the task. We need to change the Stopper to automatically put the labels of the parent goroutine into the children it starts, which is not too difficult. Next, this project would lend further weight to banning the `go` keyword; the Stopper should be our authority here.
- If the CPU seconds captured in a profile approach `num_vcpus * profile_duration`, then the system is pretty much overloaded (they should also see that from the runtime metrics). Since we promise that we are putting labels on all moving parts, we can assume that the seconds accrued by those labels are "100% of the productive work" and that this is the whole category to look at. But this needs to be verified experimentally. It could be possible for some SQL query to be low-CPU but extremely alloc-heavy, so that it would drive the garbage collector nuts.