-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Metrics based on telemetry events generated by sparrow #175
Conversation
fa2d95f
to
182f40f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! It's good to have metrics showing how much time was spent in the HTTP/2 worker. Metrics showing worker pool statistics are also valuable.
On a side note, I think it'd be worth checking if we use the newset versions of the Telemetry
releated libraries.
2. Send a request. | ||
3. Get a response from push notifications provider. | ||
|
||
###### HTTP/2 requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this metric show time spent only on sending the request and waiting for the response?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This metric shows only the time it takes to handle and send request. It only opens the stream and sends the request. It does not measure response time. In general, when this timer is small it it possible that we are getting errors with opening the stream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for checking this. I think it'd be good to add this info to the doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will add that info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michalwski, this part is already added to the doc..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I wonder how did you come to the conclusion that it measures only the time it takes to handle and send the request. If I'm correct this metric measures the time of this function in sparrow: https://github.com/esl/sparrow/blob/master/lib/sparrow/h2_worker.ex#L352-L430
Inside the function, we are waiting for the response and emit additional events depending on the response.
Also, I think the name of the metric starts with sparrow_h2_worker
instead of sparrow_h_worker
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All right, we discussed this offline with @janciesla8818. I was misled by the post_result
var name and the case checking if this is a successful response or not. In fact, this var gives us information if the request was sent successfully or not. The response to the request arrives asynchronously (HTTP/2). Thanks to that we can send many requests using the same h2_worker
and connection.
When it comes to the metric name, the Prometheus.Core
lib removes all numbers from the metric name.
5aa77a1
to
d09265d
Compare
d09265d
to
f864311
Compare
This PR adds new metrics based on sparrow telemetry events. The changes include:
Last three metrics are gathered using
:wpool.stats
to get the information from worker pool.Telemetry poller
is added to the project to collect periodic metrics. Counters of pools and workers are collected as events generated during telemetry poller events. The period for gathering metrics is 5 seconds which is the default value for telemetry poller.