fix: improve runtime metrics #6344

BridgeAR · 2025-08-26T00:38:05Z

The current implementation for runtime.node.event_loop.utilization always calculated a diff from the former entry and replaced the former entry with that diff. That in itself worked for the first two calls, but not afterwards, since we now receive a diff of the diff and that is incorrect.
It is now replaced by directly calculating the utilization instead.

The overall implementation is also a tad faster by removing some calls to native side as well as using process.hrtime.bigint() instead of process.hrtime().

Code for unsupported Node.js versions was also removed as well as adding a comment about the overall metric types and where to find those.

The current implementation for runtime.node.event_loop.utilization always calculated a diff from the former entry and replaced the former entry with that diff. That in itself worked for the first two calls, but not afterwards, since we now receive a diff of the diff and that is incorrect. It is now replaced by directly calculating the utilization instead. The overall implementation is also a tad faster by removing some calls to native side as well as using simpler logic. Code for unsupported Node.js versions was also removed as well as adding a comment about the overall metric types and where to find those.

github-actions · 2025-08-26T00:38:40Z

Overall package size

Self size: 11.92 MB
Deduped: 111.53 MB
No deduping: 111.87 MB

Dependency sizes

| name | version | self size | total size | |------|---------|-----------|------------| | @datadog/libdatadog | 0.7.0 | 35.02 MB | 35.02 MB | | @datadog/native-appsec | 10.1.0 | 20.37 MB | 20.37 MB | | @datadog/native-iast-taint-tracking | 4.0.0 | 11.72 MB | 11.73 MB | | @datadog/pprof | 5.9.0 | 9.77 MB | 10.1 MB | | @opentelemetry/core | 1.30.1 | 908.66 kB | 7.16 MB | | protobufjs | 7.5.4 | 2.95 MB | 5.6 MB | | @datadog/wasm-js-rewriter | 4.0.1 | 2.85 MB | 3.58 MB | | @datadog/native-metrics | 3.1.1 | 1.02 MB | 1.43 MB | | @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB | | jsonpath-plus | 10.3.0 | 617.18 kB | 1.08 MB | | import-in-the-middle | 1.14.2 | 122.36 kB | 850.93 kB | | lru-cache | 10.4.3 | 804.3 kB | 804.3 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | source-map | 0.7.6 | 185.63 kB | 185.63 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | @datadog/sketches-js | 2.1.1 | 109.9 kB | 109.9 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | ignore | 7.0.5 | 63.38 kB | 63.38 kB | | istanbul-lib-coverage | 3.2.2 | 34.37 kB | 34.37 kB | | rfdc | 1.4.1 | 27.15 kB | 27.15 kB | | dc-polyfill | 0.1.10 | 26.73 kB | 26.73 kB | | @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | shell-quote | 1.8.3 | 23.74 kB | 23.74 kB | | limiter | 1.1.5 | 23.17 kB | 23.17 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | semifies | 1.0.0 | 15.84 kB | 15.84 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | ttl-set | 1.0.0 | 4.61 kB | 9.69 kB | | mutexify | 1.4.0 | 5.71 kB | 8.74 kB | | path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB | | koalas | 1.0.2 | 6.47 kB | 6.47 kB | | module-details-from-path | 1.0.4 | 3.96 kB | 3.96 kB |

_{🤖 This report was automatically generated by heaviest-objects-in-the-universe}

codecov · 2025-08-26T00:38:53Z

Codecov Report

❌ Patch coverage is 95.53571% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.99%. Comparing base (907d8e8) to head (94eaf7f).
⚠️ Report is 1 commits behind head on master.

Files with missing lines	Patch %	Lines
...es/dd-trace/src/runtime_metrics/runtime_metrics.js	95.04%	5 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6344      +/-   ##
==========================================
+ Coverage   83.76%   83.99%   +0.22%     
==========================================
  Files         477      477              
  Lines       20060    20071      +11     
==========================================
+ Hits        16804    16858      +54     
+ Misses       3256     3213      -43

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

datadog-datadog-prod-us1 · 2025-08-26T00:46:41Z

⚠️ Tests

⚠️ Warnings

❄️ 1 New flaky test detected

cypress@14.5.4 esm intelligent test runner can skip tests received by the intelligent test runner API and still reports code coverage from integration-tests/cypress/cypress.spec.js (Datadog)
Timeout

ℹ️ Info

🧪 All tests passed

_{This comment will be updated automatically if new data arrives.

🔗 Commit SHA: eb5a4aa | Docs | Was this helpful? Give us feedback!}

pr-commenter · 2025-08-26T00:50:21Z

Benchmarks

Benchmark execution time: 2025-08-29 12:29:43

Comparing candidate commit eb5a4aa in PR branch BridgeAR/2025-08-25-fix-eventloop-utilization-metric with baseline commit 7830469 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1268 metrics, 55 unstable metrics.

Some conditions were not correctly handled before. For example did we still send out metrics in case options deactivated the gc or eventloop metrics. They were just always zero in that case. On top of that did we check for old Node.js support. That is not needed and was removed. The time was miscalulcated for the CPU metrics in the former commit. The divisor had to become smaller, not bigger. This is also fixed.

This also improves the code a bit

The config was using filters when they weren't needed. This is now a simple loop instead. The received config had to be truthy, otherwise it would have failed in the code afterwards.

This enables event loop runtime metrics in case the native module is not available. The metrics are aligned with the non-native ones besides having a standard deviation instead of a median. We could consider aligning this later.

szegedi

Mostly looks okay; some improvement ideas in the individual comments.

packages/dd-trace/src/runtime_metrics/runtime_metrics.js

In addition it does not propagate confusing Node.js event loop delay metrics (zero values).

…ion-metric

Things fixed / improved: * runtime.node.event_loop.utilization is now working as expected (the metric was formerly calculating the diff on each iteration on top of the diff instead of on the actual former value) * runtime.node.gc.pause* is now only sending data, if the metric contains a count > 0 * Stop tracking gc metrics on native side when it is deactivated to track those while event loop tracking is also deactivated * Slightly faster implementation for a couple of metrics * Implemented a best effort fallback mode for the event loop delay in case the native library can’t be loaded * Cleaning up code that was there for old unsupported Node.js versions * Created lots of tests that are now checking that values are in a correct range (almost all lines are now fully covered and the tests are significantly more robust about actually finding issues) * Refactored dogstatsd to use private properties instead of underscored * Improved client config in dogstatsd to only use a single loop instead of multiple filters * Made sure calling runtimeMetrics.start() multiple times without calling stop in-between will automatically clean up former tracking

BridgeAR added the semver-patch label Aug 26, 2025

BridgeAR added 4 commits August 27, 2025 01:49

test: make sure the runtime metrics are properly tested

89fa8a1

This also improves the code a bit

refactor: make dogstatd properties private and increase performance

1a74890

The config was using filters when they weren't needed. This is now a simple loop instead. The received config had to be truthy, otherwise it would have failed in the code afterwards.

BridgeAR changed the title ~~fix: runtime.node.event_loop.utilization is now handled correct~~ fix: improve runtime metrics Aug 27, 2025

BridgeAR added 2 commits August 28, 2025 16:02

Improve fallbacks and don't track metrics if not requested

6f585ea

Fix lint issues

a3f3eba

BridgeAR marked this pull request as ready for review August 28, 2025 14:11

BridgeAR requested a review from a team as a code owner August 28, 2025 14:11

BridgeAR requested a review from szegedi August 28, 2025 14:12

szegedi requested changes Aug 28, 2025

View reviewed changes

BridgeAR added 3 commits August 28, 2025 21:49

Use performance.now() and address other comments

0a7e6fc

Only measure the event loop delay by Node.js every 5 ms

9e7ac36

test: run tests with native and without native module

5fa89ca

szegedi previously approved these changes Aug 29, 2025

View reviewed changes

Make tests more robust and always run native and non-native tests

3984ca5

In addition it does not propagate confusing Node.js event loop delay metrics (zero values).

BridgeAR dismissed szegedi’s stale review via 3984ca5 August 29, 2025 10:22

BridgeAR and others added 6 commits August 29, 2025 06:23

Merge branch 'master' into BridgeAR/2025-08-25-fix-eventloop-utilizat…

39ae682

…ion-metric

test: make tests more robust on mac and linux

71f9961

test: make tests more robust on Linux and Mac

83da723

fixup!

0688136

fixup!

94eaf7f

Merge branch 'master' into BridgeAR/2025-08-25-fix-eventloop-utilizat…

eb5a4aa

…ion-metric

szegedi approved these changes Aug 29, 2025

View reviewed changes

BridgeAR merged commit 156f35a into master Aug 29, 2025
695 checks passed

BridgeAR deleted the BridgeAR/2025-08-25-fix-eventloop-utilization-metric branch August 29, 2025 16:29

dd-octo-sts bot mentioned this pull request Aug 30, 2025

v5.65.0 proposal #6356

Merged

mveroone mentioned this pull request Sep 27, 2025

fix(instrumentation-runtime-node): use absolute results in eventLoopUtilization computation open-telemetry/opentelemetry-js-contrib#3118

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: improve runtime metrics #6344

fix: improve runtime metrics #6344

Uh oh!

BridgeAR commented Aug 26, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Aug 26, 2025 •

edited

Loading

Uh oh!

codecov bot commented Aug 26, 2025 •

edited

Loading

Uh oh!

datadog-datadog-prod-us1 bot commented Aug 26, 2025 •

edited by datadog-official bot

Loading

Uh oh!

pr-commenter bot commented Aug 26, 2025 •

edited

Loading

Uh oh!

szegedi left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

fix: improve runtime metrics #6344

fix: improve runtime metrics #6344

Uh oh!

Conversation

BridgeAR commented Aug 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Aug 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overall package size

Uh oh!

codecov bot commented Aug 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

datadog-datadog-prod-us1 bot commented Aug 26, 2025 • edited by datadog-official bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ Warnings

ℹ️ Info

Uh oh!

pr-commenter bot commented Aug 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmarks

Uh oh!

szegedi left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

BridgeAR commented Aug 26, 2025 •

edited

Loading

github-actions bot commented Aug 26, 2025 •

edited

Loading

codecov bot commented Aug 26, 2025 •

edited

Loading

datadog-datadog-prod-us1 bot commented Aug 26, 2025 •

edited by datadog-official bot

Loading

pr-commenter bot commented Aug 26, 2025 •

edited

Loading