fix: ensure metrics collection is actually disabled with metricsInterval=0s #2330
💚 Build Succeeded
…val=0s

Also guard against:
- `metricsInterval:null` resulting in metrics being enabled without any log message, which is perhaps slightly surprising.
- `metricsInterval:-1s` (any negative value) resulting in metrics being collected *as fast as possible* via `setInterval(collectAllTheMetrics, -1000)`.

An invalid value for this config var, and the other "POSTIVE_TIME_OPTS", will result in (a) a log.warning and (b) falling back to the default value.

A slight semantic change on the internal Metrics object means that the `getOrCreate...()` metric functions now return undefined when metrics are disabled via `metricsInterval`. This doesn't affect any public API.
8fa97e4 to
a6f562a
astorm
left a comment
👍 Overall looks like a positive incremental improvement of the agent. A few questions/suggestions below but I'm comfortable with a merge as is. Approving.
logger: new NoopLogger()
}
})
if (enabled) {
👍 All methods in this class have a guard clause that should prevent Cannot read property .. of undefined errors, and the symbol effectively makes this a private property.
normalizeNumbers(opts)
normalizeBytes(opts)
normalizeArrays(opts)
normalizePositiveTime(opts, logger)
👍 I ran some tests and this function appears to behave as expected -- non-positive and null/undefined values use the values from DEFAULTS.
The failure in GH Action CI is due to the inherent race in disable-send.test.js with possibly-slow CI:

// Test captureError speed as a proxy for testing that it avoids
// stacktrace collection when disableSend=true. It isn't a perfect way
// to test that.
const durationMs = duration[0] / 1e3 + duration[1] / 1e6
const THRESHOLD_MS = 3 // Is this long enough for slow CI?
t.ok(durationMs < THRESHOLD_MS, `captureError is fast (<${THRESHOLD_MS}ms): ${durationMs}ms`)

Just retrying for now.
I'm taking that as a +1.
Also guard against:
- `metricsInterval:null` resulting in metrics being enabled without any log message, which is perhaps slightly surprising.
- `metricsInterval:-1s` (any negative value) resulting in metrics being collected as fast as possible via `setInterval(collectAllTheMetrics, -1000)`.

An invalid value for this config var, and the other "POSTIVE_TIME_OPTS", will result in (a) a log.warning and (b) falling back to the default value.

A slight semantic change on the internal Metrics object means that the `getOrCreate...()` metric functions now return undefined when metrics are disabled via `metricsInterval`. This doesn't affect any public API.

Repro
Run the above script, then do at least one request via `curl -i localhost:3000/`.

The agent's internal `Metrics` object will have `enabled === false` and calls:

The `enabled: false` there is just for our `MetricsRegistry` class. Internally the `Reporter` from measured-reporting gets the `defaultReportingIntervalInSeconds: 0` and does this:

hence defaulting to its internal 10s interval. Then the first time `<reporter>.reportMetricOnInterval(metricKey)` is called (which happens for breakdown metrics on that `curl ...` call above), that `defaultReportingIntervalInSeconds: 10` is used in a `setInterval` to call metrics collectors.

We don't report any metrics to APM server because of this in MetricsRegistry:

However, collection can still be happening, which is work that shouldn't be done.
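
For illustration only: the reporter code referenced above is not reproduced here, so this is a minimal sketch of how an interval of `0` can end up replaced by a default, assuming the fallback is a truthiness-based `||` default. Both function names are hypothetical and are not part of measured-reporting or the agent.

```javascript
// Hypothetical helper, NOT the measured-reporting source. It only shows
// how `||` defaulting treats a configured 0 as "not set".
function pickIntervalWithOr (configuredSeconds, defaultSeconds) {
  return configuredSeconds || defaultSeconds
}

// An explicit null/undefined check keeps 0 distinct from "not set".
function pickIntervalExplicit (configuredSeconds, defaultSeconds) {
  if (configuredSeconds === undefined || configuredSeconds === null) {
    return defaultSeconds
  }
  return configuredSeconds
}

console.log(pickIntervalWithOr(0, 10)) // 10: the 0 is silently lost
console.log(pickIntervalExplicit(0, 10)) // 0: the 0 is preserved
```

With the first form, a caller that passes `0` to mean "disabled" gets the 10s default instead, which matches the behaviour described in this repro.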
Repro metricsInterval=-1s
With a negative value, CPU usage approaches 100% because system and runtime metrics are collected in a `setInterval` with a negative delay, which means run as fast as possible (subject to a JS engine-defined minimum delay).
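
A rough sketch of the kind of guard that avoids this, not the agent's actual config code: reject an unparseable, `null`/`undefined`, or negative duration, log a warning, and fall back to a default, while keeping `0` as the "disabled" value. The names `parseSeconds` and `normalizeNonNegativeTime`, the accepted unit suffixes, and the `'30s'` default used below are all assumptions made for the example.

```javascript
// Hypothetical sketch; names and parsing rules are assumptions.
const UNIT_TO_SECONDS = { ms: 0.001, s: 1, m: 60 }

// Parse a number or a string like '10s', '500ms', '-1s' into seconds.
// Returns NaN for anything that cannot be parsed (including null/undefined).
function parseSeconds (value) {
  if (typeof value === 'number') return value
  const match = /^(-?\d+(?:\.\d+)?)(ms|s|m)?$/.exec(String(value).trim())
  if (!match) return NaN
  return Number(match[1]) * UNIT_TO_SECONDS[match[2] || 's']
}

// Replace opts[key] with its value in seconds. Invalid or negative values
// trigger a warning and fall back to the default; 0 is kept as-is.
function normalizeNonNegativeTime (opts, key, defaultValue, warn) {
  const seconds = parseSeconds(opts[key])
  if (Number.isNaN(seconds) || seconds < 0) {
    warn(`invalid "${key}" value ${JSON.stringify(opts[key])}: using default ${JSON.stringify(defaultValue)}`)
    opts[key] = parseSeconds(defaultValue)
  } else {
    opts[key] = seconds
  }
  return opts
}

const warnings = []
const opts = normalizeNonNegativeTime(
  { metricsInterval: '-1s' }, 'metricsInterval', '30s', (msg) => warnings.push(msg)
)
console.log(opts.metricsInterval) // 30
console.log(warnings.length) // 1
```

Normalizing once at config time means the code that later calls `setInterval` never sees a negative delay.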
Checklist