Surface total time and contention time for each plan step in EXPLAIN ANALYZE #64200

Closed
kevin-v-ngo opened this issue Apr 26, 2021 · 0 comments · Fixed by #66157
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

@kevin-v-ngo

We introduced a new view for EXPLAIN ANALYZE and received feedback to surface the total time spent and the contention time for each plan step.

The metrics are already available in the DistSQL plan viewer for each plan node:
[image: DistSQL plan viewer screenshot]

Please add time and contention time to EXPLAIN ANALYZE:
[image: EXPLAIN ANALYZE output screenshot]

@kevin-v-ngo kevin-v-ngo added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-observability labels Apr 26, 2021
craig bot pushed a commit that referenced this issue Jun 8, 2021
65559: tracing,tracingservice: adds a trace service to pull clusterwide trace spans r=irfansharif,abarganier a=adityamaru

Previously, every node in the cluster had a local inflight span
registry that was aware of all the spans that were rooted on that
particular node. Child spans of a given traceID executing on a remote
node would only become visible to the local registry once execution
completes, and the span pushes its recordings over gRPC to the
"client" node.

This change introduces a `tracingservice` package.
Package tracingservice contains a gRPC service to be used for
remote inflight span access.

It is used for pulling inflight spans from all CockroachDB nodes.
Each node will run a trace service, which serves the inflight spans from the
local span registry on that node. Each node will also have a trace client
dialer, which uses the nodedialer to connect to another node's trace service,
and access its inflight spans. The trace client dialer is backed by a remote
trace client or a local trace client, which serves as the point of entry to this
service. Both clients support the `TraceClient` interface, which includes the
following functionalities:
  - GetSpanRecordings

The spans for a traceID are sorted by `StartTime` before they are
returned. The per-node trace dialer has yet to be hooked up to an
appropriate location depending on where we intend to use it.

Resolves: #60999
Informs: #64992

Release note: None

66149: cloud: fix gcs to resuming reader r=dt a=adityamaru

This change does a few things:

1. gcs_storage was not returning a resuming reader, so the Read
method of the resuming reader, which contains the logic to retry
on certain kinds of errors, was never invoked.

2. Changes the resuming reader to take a storage-specific function
that defines which errors are retryable in the resuming reader.
All storage providers use the same deciding function at the moment
and so behavior is unchanged.

Release note: None

66152: storage: Disable read sampling and read compactions r=sumeerbhola a=itsbilal

Read-triggered compactions are already disabled on 21.1.
As the fixes to address known shortcomings with read-triggered
compactions are a bit involved (see
cockroachdb/pebble#1143), disable
the feature on master until that issue is fixed. That prevents
this known issue from getting in the way of performance
experiments.

Release note: None.

66155: sql: drop "cluster" from EXPLAIN ANALYZE to improve readability r=maryliag a=maryliag

Remove the word "cluster" from "cluster nodes" and "cluster regions"
on EXPLAIN ANALYZE to improve readability.

Release note: None

66157: sql: add time & contention time to EXPLAIN ANALYZE. r=matthewtodd a=matthewtodd

The new fields are labeled `KV time` and `KV contention time`:

```
 > EXPLAIN ANALYZE
-> UPDATE users SET name = 'Bob Loblaw'
-> WHERE id = '32a962b7-8440-4b81-97cd-a7d7757d6eac';
                                            info
--------------------------------------------------------------------------------------------
  planning time: 353µs
  execution time: 3ms
  distribution: local
  vectorized: true
  rows read from KV: 52 (5.8 KiB)
  cumulative time spent in KV: 2ms
  maximum memory usage: 60 KiB
  network usage: 0 B (0 messages)
  cluster regions: us-east1

  • update
  │ cluster nodes: n1
  │ cluster regions: us-east1
  │ actual row count: 1
  │ table: users
  │ set: name
  │ auto commit
  │
  └── • render
      │ cluster nodes: n1
      │ cluster regions: us-east1
      │ actual row count: 1
      │ estimated row count: 0
      │
      └── • filter
          │ cluster nodes: n1
          │ cluster regions: us-east1
          │ actual row count: 1
          │ estimated row count: 0
          │ filter: id = '32a962b7-8440-4b81-97cd-a7d7757d6eac'
          │
          └── • scan
                cluster nodes: n1
                cluster regions: us-east1
                actual row count: 52
                KV time: 2ms
                KV contention time: 0µs
                KV rows read: 52
                KV bytes read: 5.8 KiB
                estimated row count: 50 (100% of the table; stats collected 3 minutes ago)
                table: users@primary
                spans: FULL SCAN
(42 rows)

Time: 4ms total (execution 4ms / network 0ms)
```

Resolves #64200

Release note (sql change): EXPLAIN ANALYZE output now includes, for each plan step, the total time spent waiting for KV requests as well as the total time those KV requests spent contending with other transactions.

Co-authored-by: Aditya Maru <adityamaru@gmail.com>
Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
Co-authored-by: Marylia Gutierrez <marylia@cockroachlabs.com>
Co-authored-by: Matthew Todd <todd@cockroachlabs.com>
@craig craig bot closed this as completed in 522b64c Jun 8, 2021