-
Notifications
You must be signed in to change notification settings - Fork 378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Event Loop Lag Metric #370
Comments
Hi, appreciate the feedback, and reading through your comments in detail is on my list. Any interest in making a PR? Also, pending me looking more closely, do you think some of this should be reported directly to nodejs as issues with our core APIs? |
Hey @ChristianBoehlke I came across this issue. Maybe you can chime in in this other related issue I just opened: nodejs/node#34661 Not exactly the same thing, but I thought it might interest you |
@ChristianBoehlke I observed the same behavior and would be interested in your changes. Did you ever had the chance to open this PR? |
@ChristianBoehlke: Hey, I hope that you're well. Word on the block is that you have an awesome fix for event loop lag metrics. Is this true? |
Hello! Sorry I totally forgot about this. I have some time this weekend to open up a PR with our implementation. |
Good evening, @ChristianBoehlke ! Are you able to check my PR and share your thoughts regarding it? My solution is much simpler than proposed by you (one line added actually) and I am interested to know, why more complex solution can be needed, if it is actually needed. |
Seems that @ChristianBoehlke is too busy to respond on this. From our side i would like to share experience of using #459 fork We have now exactly that information, that we want to see from beginning:
I still have no idea, why not just use that libuv's histogram, provided by |
@yarsky-tgz's PR #459 looks reasonable and seems to address the very valid issue raised here. The discussion in the Node.js issued linked in the OP are useful. In particular, it looks like we should expose an Event Loop Utilization default metric to help with interpretation (added in v12.19, 14.10, https://nodejs.org/dist/latest-v14.x/docs/api/perf_hooks.html#perf_hooks_performance_eventlooputilization_utilization1_utilization2). |
Fixed in 0f444cd (v14 coming out tonight). |
Is it a resolved issue (as of #370 (comment))? |
Hello,
Thank you for all the work with this library!
We are using our own implementation of the event loop lag metric because the default metric did not work well for us and I was wondering whether this is something we could contribute back? Here are the two things we have changed:
Reset libuv-Histogram
The
mean
/max
/percentiles
that libuv provides (throughperf_hooks.monitorEventLoopDelay
) are never reset. After a couple of days all numbers are pretty much set and do not change anymore. There is no way to even distinguish between high load times during the day and quiet times during the night. I think it would be better to expose a moving average/min/max instead of the total-average/min/max.By default the
resolution
that libuv uses is 10ms which means that libuv generates ~100 measurements per second. It would be possible to reset libuv's histogram every second (or so) and record itsmean
andmax
in circular buffers. The length of the buffers should be a bit longer than Prometheus' scape interval and get averaged when collected.Instead of the circular buffer, a histogram could be used as well (proposed in #309).
There is some discussion about this in #278.
Subtract
resolution
from valueslibuv uses a timer and records the time that has passed since the last invocation. This means that all values that libuv provides have the
resolution
value "added" to them. I think it would be better if this library could subtract this number before exposing the metrics. This could also be done in Prometheus but this is a rather specific implementation detail and could vary from process to process because it is configurable.There is some discussion about changing how libuv measures the lag but this didn't get merged: nodejs/node#32018 and nodejs/node#32102. There is also some future work in libuv/libuv#2725.
I am happy to open a PR for this so that you can have a look.
The text was updated successfully, but these errors were encountered: