Proposed new strategic initiative: revamping tracing/metrics collection #853
Comments
I'm definitely supportive of an effort on the Diagnostics side. Do you want to submit a PR to add it to the strategic initiatives list? It would be great to have a top-level issue that can be used to hold references to the ongoing/completed work. One other thing to consider is how current metrics feed into reporting through Prometheus, and whether there is anything we should make available that can be exposed through modules like prom-client.
Would make a good collab summit topic, too.
Since this proposal has been open for a year, I'd like to ask whether there has been any further discussion on it. Also, I'm wondering whether the diagnostics team could get involved in, or lead, the discussion and follow-up actions on this topic, since most of the areas it covers are related to diagnostic tools.
This has been open for almost a year since the last comment. I think we should likely close it unless we can find a champion for the initiative. Otherwise, related discussion can take place in the diagnostics WG. @jasnell, unless you are still planning to work on this as announced in the original post, is it OK if I close this issue?
Add HTTP, HTTPS, and HTTP/2 performance metrics. 😄
@jasnell I think this was meant as an FYI, and since it's been almost a year and a half since the FYI, it can be closed. Please let me know if you think that was not the right thing to do.
Node.js currently uses a number of different mechanisms for tracking performance metrics internally.
and so on.
These use a number of divergent mechanisms internally with very little consistency, making it complicated and cumbersome for someone to take a complete system-wide view of the metrics.
Further, Worker Threads make it even more difficult because some metrics become thread specific (e.g. process.memoryUsage()) while others are process wide.
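To illustrate the thread-specific versus process-wide split, here is a minimal sketch that reads `process.memoryUsage()` from both the main thread and a short-lived worker. The helper names `readWorkerMemoryUsage` and `compareMemoryUsage` are hypothetical and exist only for this example. Per the Node.js documentation, `rss` describes the whole process, while fields such as `heapUsed` refer only to the calling thread.

```javascript
const { Worker } = require('node:worker_threads');

// Hypothetical helper: spawn a short-lived worker that reports the result of
// process.memoryUsage() as seen from inside the worker thread.
function readWorkerMemoryUsage() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(
      `const { parentPort } = require('node:worker_threads');
       parentPort.postMessage(process.memoryUsage());`,
      { eval: true }
    );
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Hypothetical helper: take one reading on the main thread and one in a worker.
async function compareMemoryUsage() {
  const main = process.memoryUsage();
  const worker = await readWorkerMemoryUsage();
  return { main, worker };
}

compareMemoryUsage().then(({ main, worker }) => {
  // rss is process-wide, so both readings describe the same process
  // (the numbers can still differ because they are taken at different times).
  console.log('rss:', main.rss, worker.rss);
  // heapUsed is per-thread, so each reading describes a different V8 heap.
  console.log('heapUsed:', main.heapUsed, worker.heapUsed);
});
```

A tool that simply sums these objects across threads would count `rss` once per thread, which is one example of why a system-wide view is cumbersome to assemble.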
Lastly, some of the mechanisms (DTrace, ETW and the trace events implementation) are under-supported and problematic. The trace events implementation, for instance, will often abort under load when running worker threads because it has not yet been made fully thread-safe. The team at Google that had been working on the implementation is no longer engaged and has moved on to other things, so the code has largely sat unfinished.
I have started investigating a top-down overhaul of the metrics collection mechanisms in Node with the intent of providing a single, clear, coherent subsystem for per-process and per-isolate metrics tracking and reporting that will support multiple targets and use cases with a much cleaner implementation. A key goal will be to make it easier and more reliable to attach various analytics tools on top of Node.js (e.g. clinic.js, N|Solid, APMs, etc.) without having to rely on hacks or building custom versions of the runtime. I also want to increase the visibility/observability of various key components of the platform and modernize metrics collection and reporting for tools such as Prometheus.
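As a rough illustration of what "a single subsystem feeding multiple targets" could look like from user land today, the sketch below gathers a few existing built-in readings into one list and renders them in the Prometheus text exposition format. This is not the proposed design: `collectSnapshot`, `toPrometheusText`, and the metric names are hypothetical and chosen only for this example. The only Node.js APIs used are `process.memoryUsage()`, `performance.eventLoopUtilization()`, and `threadId`.

```javascript
const { performance } = require('node:perf_hooks');
const { threadId } = require('node:worker_threads');

// Hypothetical helper: collect a few built-in readings into one flat list.
// The metric names are made up for this sketch and are not an official schema.
function collectSnapshot() {
  const mem = process.memoryUsage();
  const elu = performance.eventLoopUtilization();
  const perThread = { thread_id: String(threadId) };
  return [
    // rss is process-wide, so it carries no thread label.
    { name: 'nodejs_rss_bytes', help: 'Resident set size of the process.', value: mem.rss, labels: {} },
    // heapUsed and event loop utilization are specific to the calling thread.
    { name: 'nodejs_heap_used_bytes', help: 'V8 heap used by this thread.', value: mem.heapUsed, labels: perThread },
    { name: 'nodejs_event_loop_utilization', help: 'Event loop utilization of this thread.', value: elu.utilization, labels: perThread },
  ];
}

// Hypothetical helper: render samples as Prometheus text-format gauges.
function toPrometheusText(samples) {
  return samples
    .map(({ name, help, value, labels }) => {
      const pairs = Object.entries(labels)
        .map(([key, val]) => `${key}="${val}"`)
        .join(',');
      const labelText = pairs ? `{${pairs}}` : '';
      return `# HELP ${name} ${help}\n# TYPE ${name} gauge\n${name}${labelText} ${value}\n`;
    })
    .join('');
}

console.log(toPrometheusText(collectSnapshot()));
```

Separating collection (`collectSnapshot`) from rendering (`toPrometheusText`) is the point of the sketch: a different target would only need a different renderer over the same samples.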
This will be a large effort that will take some time to get right and will require input from a number of folks. I'm still working through some work plan details now but I wanted to at least provide some notification that I was starting this effort.
/cc @nodejs/diagnostics @mmarchini @addaleax @sam-github @mcollina