storage: consider aggregating iterator stats #95790
Comments
Hi @jbowens, please add a C-ategory label to your issue. Check out the label system docs. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
In addition to time series, we should also aggregate per-replica counters (all iterator usage in CRDB is tied to a replica). That would let us find "hot ranges" along more dimensions, such as skipped tombstones. This shouldn't be step one, since it requires annotating every iterator we create with a RangeID, which can be cumbersome.
In https://github.com/cockroachlabs/support/issues/2182#issuecomment-1482809892, we had nodes exhibit high disk read throughput, and I knew there were changefeed restarts, but it was difficult to determine whether the restarts explained the disk read bandwidth. This is related to both this issue and #65414. If operations that can consume significant amounts of disk bandwidth had their own metrics, fed by the stats of the iterators they use, we'd be in a much better position to "explain" observed disk throughput. Capturing the entire IteratorStats for each operation is probably overkill, but some basic numbers, like bytes read from disk, make sense. The "operations" I'm thinking of here are "evaluating a request of type X", "rangefeed catch-up scan", "snapshot streaming", "consistency check", and so on.
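To make the per-operation idea above concrete, here is a minimal Go sketch that attributes bytes read from disk to the kind of operation that owned the iterator. Every name in it (`OpKind`, `DiskReadsByOp`, `Record`, `Ranked`) is hypothetical and not taken from CockroachDB or Pebble. It also assumes that "bytes read from disk" can be approximated as bytes in loaded blocks minus bytes served from the block cache, which ignores the OS page cache.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// OpKind is a hypothetical label for the operation that owns an iterator.
type OpKind string

const (
	OpRangefeedCatchUp OpKind = "rangefeed-catchup-scan"
	OpSnapshotStream   OpKind = "snapshot-streaming"
	OpConsistencyCheck OpKind = "consistency-check"
)

// DiskReadsByOp accumulates approximate disk-read bytes per operation kind.
type DiskReadsByOp struct {
	mu    sync.Mutex
	bytes map[OpKind]uint64
}

func NewDiskReadsByOp() *DiskReadsByOp {
	return &DiskReadsByOp{bytes: make(map[OpKind]uint64)}
}

// Record is meant to be called once when an iterator is closed.
// Assumption: disk bytes = loaded block bytes - bytes found in the block cache.
func (d *DiskReadsByOp) Record(op OpKind, blockBytes, cachedBytes uint64) {
	if cachedBytes > blockBytes {
		cachedBytes = blockBytes // guard against unsigned underflow
	}
	d.mu.Lock()
	defer d.mu.Unlock()
	d.bytes[op] += blockBytes - cachedBytes
}

// OpBytes pairs an operation kind with its accumulated disk-read bytes.
type OpBytes struct {
	Op    OpKind
	Bytes uint64
}

// Ranked returns operations ordered by disk-read bytes, largest first.
func (d *DiskReadsByOp) Ranked() []OpBytes {
	d.mu.Lock()
	defer d.mu.Unlock()
	out := make([]OpBytes, 0, len(d.bytes))
	for op, b := range d.bytes {
		out = append(out, OpBytes{Op: op, Bytes: b})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Bytes != out[j].Bytes {
			return out[i].Bytes > out[j].Bytes
		}
		return out[i].Op < out[j].Op // stable order for ties
	})
	return out
}

func main() {
	reads := NewDiskReadsByOp()
	reads.Record(OpRangefeedCatchUp, 64<<20, 8<<20)
	reads.Record(OpConsistencyCheck, 16<<20, 16<<20)
	reads.Record(OpSnapshotStream, 32<<20, 0)
	for _, r := range reads.Ranked() {
		fmt.Printf("%-24s %d bytes from disk\n", r.Op, r.Bytes)
	}
}
```

Recording only at iterator close keeps the hot path free of shared-state updates. The trade-off is that a long-lived iterator contributes nothing until it is closed.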
Aggregate the iterator stats across all of an engine's iterators. Expose seven new timeseries metrics for visibility into the behavior of storage engine iterators:

- storage.iterator.block-load.bytes
- storage.iterator.block-load.cached-bytes
- storage.iterator.block-load.read-duration
- storage.iterator.external.seeks
- storage.iterator.external.steps
- storage.iterator.internal.seeks
- storage.iterator.internal.steps

Close cockroachdb#95790.

Epic: None

Release note (ops change): Introduces seven new timeseries metrics for better visibility into the behavior of storage engine iterators and their internals.
99726: storage: aggregate iterator stats r=jbowens a=jbowens

Aggregate the iterator stats across all of an engine's iterators. Expose seven new timeseries metrics for visibility into the behavior of storage engine iterators:

- storage.iterator.block-load.bytes
- storage.iterator.block-load.cached-bytes
- storage.iterator.block-load.read-duration
- storage.iterator.external.seeks
- storage.iterator.external.steps
- storage.iterator.internal.seeks
- storage.iterator.internal.steps

Close #95790.

Epic: None

Release note (ops change): Introduces seven new timeseries metrics for better visibility into the behavior of storage engine iterators and their internals.

Co-authored-by: Jackson Owens <jackson@cockroachlabs.com>
Pebble Iterators expose iterator statistics. These statistics are included in SQL traces, and can be helpful in determining where reads are spending time. We can consider aggregating these iterator statistics across all of a node's iterators, to surface time series metrics. These metrics might be helpful in diagnosing problems where slow/expensive reads are substantial enough to affect node health.
Jira issue: CRDB-23742