metrics: Increase the resolution of histogram metrics #7335

alexggh · 2023-06-05T18:09:17Z

These metrics are using the default histogram buckets:

```
pub const DEFAULT_BUCKETS: &[f64; 11] = &[
    0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0,
];
```

This gives us a resolution of 5ms, which is good, but some subsystems, such as approval-voting and approval-distribution, process hundreds or even a few thousand messages per second, so it makes sense to increase the resolution of the buckets to better understand whether processing time is in the microsecond range.

The new bucket ranges will be:

```
[0.0001, 0.0004, 0.0016, 0.0064, 0.0256, 0.1024, 0.4096, 1.6384, 6.5536]
```
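The new ranges form a geometric series: each bound is 4 times the previous one, starting at 0.0001 s (100 µs). As a minimal sketch, the list can be reproduced with a small stand-alone function. The name `exp_buckets` is hypothetical and is not the helper used in the PR; the `prometheus` crate ships a similar `exponential_buckets` function that returns a `Result`.

```rust
/// Hypothetical stand-alone helper (not the PR's code): returns `count`
/// bucket upper bounds, starting at `start` and multiplying by `factor`.
fn exp_buckets(start: f64, factor: f64, count: usize) -> Vec<f64> {
    assert!(start > 0.0 && factor > 1.0 && count > 0);
    let mut buckets = Vec::with_capacity(count);
    let mut next = start;
    for _ in 0..count {
        buckets.push(next);
        next *= factor;
    }
    buckets
}

fn main() {
    // start = 100 µs, factor = 4, 9 buckets, matching the list above.
    for bound in exp_buckets(0.0001, 4.0, 9) {
        println!("{bound:.4}");
    }
}
```

Compared with the default buckets, the smallest bound drops from 5 ms to 100 µs, while the largest drops from 10 s to about 6.55 s.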


sandreim

Thanks @alexggh, this is indeed a good idea. It might also be worthwhile, but up to you, to have 2 additional buckets between the last 2: [0.0001, 0.0004, 0.0016, 0.0064, 0.0256, 0.1024, 0.4096, 1.6384, <here>, 6.5536], just to give more granular info for any very high timings.

alexggh · 2023-06-06T08:22:49Z

> Thanks @alexggh, this is indeed a good idea. It might also be worthwhile, but up to you, to have 2 additional buckets between the last 2: [0.0001, 0.0004, 0.0016, 0.0064, 0.0256, 0.1024, 0.4096, 1.6384, <here>, 6.5536], just to give more granular info for any very high timings.

Yes, makes sense. I won't be able to use the `exponential_buckets` helper function for that, so I'll define a static vec instead.
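A rough sketch of what such a static list could look like follows. The two extra bounds (3.2768 and 4.9152) are illustrative assumptions that split the last interval into thirds; the thread does not state which values were actually chosen. The constant name and the `is_strictly_increasing` check are hypothetical as well.

```rust
/// Illustrative static bucket list. The two bounds between 1.6384 and 6.5536
/// are ASSUMED values (thirds of the last interval), not taken from the PR.
const HYPOTHETICAL_BUCKETS: &[f64] = &[
    0.0001, 0.0004, 0.0016, 0.0064, 0.0256, 0.1024, 0.4096, 1.6384,
    3.2768, 4.9152, // assumed extra buckets
    6.5536,
];

/// Hand-written lists lose the ordering guarantee a generator gives,
/// so it is worth checking that the bounds are strictly increasing.
fn is_strictly_increasing(bounds: &[f64]) -> bool {
    bounds.windows(2).all(|pair| pair[0] < pair[1])
}

fn main() {
    assert!(is_strictly_increasing(HYPOTHETICAL_BUCKETS));
    println!("{} buckets", HYPOTHETICAL_BUCKETS.len());
}
```

Because the list is no longer a pure geometric series, a single exponential helper call cannot produce it, which is why a hand-written list is needed.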

Done!

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>

alexggh · 2023-06-06T14:30:18Z

Unfortunately, I ran this PR on a subset of nodes in versi, and it seems that Grafana gets confused when some nodes output one range of buckets and the others output a different range.

The graph for a 6h period on versi looks as if some data is missing.

The data is there, and the graph is displayed correctly if you select a time period in which only one flavor of the buckets was in use, but this will affect the dashboards during the transition period.

@sandreim any idea if this is something to be concerned about?

sandreim · 2023-06-06T15:31:31Z

This happens when you aggregate metrics from nodes with different bucket configurations. I expect this to be fine. You should try to query the subset of nodes running your changes and select only the time period when only that bucket configuration is present. That should work.
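One way to picture the effect, assuming the dashboard sums cumulative bucket counters per `le` bound across nodes (the thread does not spell out the query), is the sketch below. All names and counts are made up for illustration. A bound that exists on only some nodes receives only those nodes' counts, so the merged cumulative series is no longer monotonic.

```rust
/// One node's cumulative histogram: (upper bound `le`, cumulative count).
type Histogram = Vec<(f64, u64)>;

/// Sum cumulative counts per `le` bound across nodes (an assumed, simplified
/// model of a per-`le` sum; not Grafana's or Prometheus' actual code).
fn sum_by_le(nodes: &[Histogram]) -> Histogram {
    let mut merged: Histogram = Vec::new();
    for node in nodes {
        for &(le, count) in node {
            match merged.iter_mut().find(|entry| entry.0 == le) {
                Some(entry) => entry.1 += count,
                None => merged.push((le, count)),
            }
        }
    }
    merged.sort_by(|a, b| a.0.partial_cmp(&b.0).expect("bounds are not NaN"));
    merged
}

/// A well-formed cumulative histogram never decreases as `le` grows.
fn is_monotonic(hist: &[(f64, u64)]) -> bool {
    hist.windows(2).all(|pair| pair[0].1 <= pair[1].1)
}

fn main() {
    // Made-up counts: one node on the old bounds, one on the new bounds.
    let old_node: Histogram = vec![(0.005, 100), (0.01, 100)];
    let new_node: Histogram = vec![(0.0064, 10), (0.0256, 10)];
    let merged = sum_by_le(&[old_node, new_node]);
    println!("{merged:?}");
    println!("monotonic: {}", is_monotonic(&merged));
}
```

Restricting the query to nodes that share one bucket configuration, as suggested above, avoids mixing the two sets of bounds.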

alexggh · 2023-06-06T16:16:02Z

> This happens when you aggregate metrics from nodes with different bucket configurations. I expect this to be fine.

Yes, that's exactly what is happening.

> You should try to query the subset of nodes running your changes and select only the time period when only that bucket configuration is present. That should work.

Yes, that works.

sandreim · 2023-06-08T09:15:58Z

bot merge

paritytech-processbot · 2023-06-08T09:16:05Z

Error: Statuses failed for 9efcf3d

sandreim · 2023-06-08T09:26:28Z

bot merge

alexggh self-assigned this Jun 5, 2023

sandreim approved these changes Jun 6, 2023


Use buckets with higher resolution

9efcf3d

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>

alexggh force-pushed the feature/increase_resolution branch from 3b754ce to 9efcf3d Compare June 6, 2023 08:46

bredamatt approved these changes Jun 8, 2023


vstakhov approved these changes Jun 8, 2023


paritytech-processbot bot merged commit 6debcdb into paritytech:master Jun 8, 2023

crystalin mentioned this pull request Oct 20, 2023

Update substrate/polkadot/cumulus from v0.9.43 to v1.1.0 moonbeam-foundation/moonbeam#2535

Closed

JesseAbram mentioned this pull request Oct 26, 2023

Update substrate and subxt entropyxyz/entropy-core#435

Merged
