Watcher System Metrics #8338
@fspmarshall @andrejtokarcik Can you two review?
LGTM once the watcherEventsEmitted vec is reined in a bit.
				Count:      int64(bucket.GetCumulativeCount()),
				UpperBound: bucket.GetUpperBound(),
			})
		}
	}
	return out
}
It's not a big deal, but your new additions reveal that getComponentHistogram and getHistogram share most of their code, which makes their respective purposes and differences unnecessarily unclear. I suppose they could be nicely unified by means of a third function with the signature func(metric *dto.MetricFamily, hist *dto.Histogram) Histogram.
Refactored to getHistogramWithFilter(metric *dto.MetricFamily, histogramFilter func(metrics []*dto.Metric) *dto.Histogram) Histogram to unite the shared code in one place and allow callers to provide filtering.
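For readers following along, here is a rough sketch of what a filter-based helper of this shape might look like. It is not the code from the PR: the types below are hypothetical stand-ins for the Prometheus dto types so the example builds without dependencies, the outer Histogram fields and the "component" label name are assumptions, and only the bucket fields Count and UpperBound come from the diff above.

```go
package main

import "fmt"

// Hypothetical stand-ins for the dto types, trimmed to what the sketch needs.
type Bucket struct {
	CumulativeCount uint64
	UpperBound      float64
}

type DtoHistogram struct {
	SampleCount uint64
	Buckets     []Bucket
}

type Metric struct {
	Labels    map[string]string
	Histogram *DtoHistogram
}

type MetricFamily struct {
	Metrics []*Metric
}

// Output types. The bucket fields mirror the diff; the rest is assumed.
type HistogramBucket struct {
	Count      int64
	UpperBound float64
}

type Histogram struct {
	Count   int64
	Buckets []HistogramBucket
}

// getHistogramWithFilter holds the shared conversion code. The caller
// supplies the filter that picks which histogram in the family to convert.
func getHistogramWithFilter(family *MetricFamily, filter func([]*Metric) *DtoHistogram) Histogram {
	if family == nil {
		return Histogram{}
	}
	hist := filter(family.Metrics)
	if hist == nil {
		return Histogram{}
	}
	out := Histogram{Count: int64(hist.SampleCount)}
	for _, bucket := range hist.Buckets {
		out.Buckets = append(out.Buckets, HistogramBucket{
			Count:      int64(bucket.CumulativeCount),
			UpperBound: bucket.UpperBound,
		})
	}
	return out
}

// firstHistogram selects the histogram of the first metric, if any.
func firstHistogram(metrics []*Metric) *DtoHistogram {
	if len(metrics) == 0 {
		return nil
	}
	return metrics[0].Histogram
}

// byComponent returns a filter that selects the histogram whose
// "component" label (an assumed label name) matches the given value.
func byComponent(component string) func([]*Metric) *DtoHistogram {
	return func(metrics []*Metric) *DtoHistogram {
		for _, m := range metrics {
			if m.Labels["component"] == component {
				return m.Histogram
			}
		}
		return nil
	}
}

func main() {
	family := &MetricFamily{Metrics: []*Metric{
		{Labels: map[string]string{"component": "auth"}, Histogram: &DtoHistogram{
			SampleCount: 3,
			Buckets:     []Bucket{{CumulativeCount: 2, UpperBound: 0.5}, {CumulativeCount: 3, UpperBound: 1}},
		}},
		{Labels: map[string]string{"component": "proxy"}, Histogram: &DtoHistogram{
			SampleCount: 5,
			Buckets:     []Bucket{{CumulativeCount: 5, UpperBound: 2}},
		}},
	}}
	fmt.Printf("%+v\n", getHistogramWithFilter(family, firstHistogram))
	fmt.Printf("%+v\n", getHistogramWithFilter(family, byComponent("proxy")))
}
```

The trade-off raised in this thread is visible here: the shared loop lives in one place, but each caller now has to pass a closure, which may or may not read better than two small functions.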
Hm, this solution doesn't really seem to make the code more readable/succinct. Feel free to roll back to the original variant, as you prefer.
ee32bb6 to 582a3b7
582a3b7 to 2951df0
2951df0 to a58817b
Bot.
* add event watcher prometheus metrics and a new tctl top tab to visualize them
* add event watcher prometheus metrics and a new tctl top tab to visualize them (cherry picked from commit fb0ab2b)
Purpose
Currently we have no insight into which events are being emitted, how often they are being emitted, and what the size of an event is. This makes it difficult to determine the source of high network utilization issues. By exposing metrics for the watcher system and consuming them on a new pane in tctl top, we can get a real-time view of the events being emitted.
Implementation
- record events in the WatchEvents streaming gRPC method before we Send on the stream
- use Event.Size() to estimate the amount of data being sent
tctl top
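The two Implementation steps above (record the event, then Send it, using Size() as the byte estimate) could be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: the PR exposes Prometheus metrics, whereas this sketch keeps counts in an in-memory map so it has no dependencies, and the Event fields, the Stream interface, and all helper names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical stand-in for a watcher event.
type Event struct {
	Kind    string
	Payload []byte
}

// Size estimates how many bytes the event would put on the wire.
// Here it is simply the length of the fields; the real Size() may differ.
func (e Event) Size() int { return len(e.Kind) + len(e.Payload) }

// Stream is the minimal part of a streaming sender the sketch needs.
type Stream interface {
	Send(Event) error
}

// kindStats accumulates how many events of a kind were emitted and their estimated size.
type kindStats struct {
	Count int64
	Bytes int64
}

// eventMetrics is an in-memory substitute for the Prometheus collectors.
type eventMetrics struct {
	mu     sync.Mutex
	byKind map[string]kindStats
}

func newEventMetrics() *eventMetrics {
	return &eventMetrics{byKind: make(map[string]kindStats)}
}

// record updates the per-kind event count and estimated byte total.
func (m *eventMetrics) record(e Event) {
	m.mu.Lock()
	defer m.mu.Unlock()
	s := m.byKind[e.Kind]
	s.Count++
	s.Bytes += int64(e.Size())
	m.byKind[e.Kind] = s
}

// snapshot returns the current stats for one kind.
func (m *eventMetrics) snapshot(kind string) kindStats {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.byKind[kind]
}

// sendWithMetrics records the event first and then sends it on the stream.
func sendWithMetrics(s Stream, m *eventMetrics, e Event) error {
	m.record(e)
	return s.Send(e)
}

// discardStream counts sends and drops the events; used only for the demo.
type discardStream struct{ sent int }

func (d *discardStream) Send(Event) error {
	d.sent++
	return nil
}

func main() {
	metrics := newEventMetrics()
	stream := &discardStream{}
	events := []Event{
		{Kind: "node", Payload: []byte("abcd")},
		{Kind: "node", Payload: []byte("ab")},
		{Kind: "cert_authority", Payload: []byte("x")},
	}
	for _, e := range events {
		if err := sendWithMetrics(stream, metrics, e); err != nil {
			fmt.Println("send failed:", err)
		}
	}
	fmt.Printf("node: %+v\n", metrics.snapshot("node"))
}
```

Recording before Send means an event is counted even if the send later fails, so the numbers describe what the server attempted to emit rather than what clients received.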