-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Add additional logging to make spotting stats issues easier #133972
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add additional logging to make spotting stats issues easier #133972
Conversation
Pinging @elastic/es-data-management (Team:Data Management) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds debug and info level logging to the Probes
class to improve diagnostics for statistics collection issues. The logging helps track method invocations and failures during OS monitoring probes.
- Added logger initialization using Log4j
- Added debug logging at method entry and when method parameter is null
- Added info level logging for exception handling with method invocation details
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
onError = (String) valueMethod.invoke(onErrorObject); | ||
} catch (Exception ignored) {} | ||
} catch (Exception e) { | ||
logger.info("Error getting JVM info from MX Bean", e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to decide whether I agree with doing this at info
level.
Pro: I guess that it's only going to happen once on startup. Or maybe half a dozen times since it seems likely that if one of these fails then they all will. But it's all going to be out of the way on startup, anyway.
Con: These exceptions aren't going to be actionable by the typical user, and if they're okay with getting unknown values because of their JVM limitations then getting a bunch of stack traces logged every time they start up could be pretty noisy.
I think I'd err on the side of making all these debug
. If we get an SDH, we know how to turn the debug logging on, and we can also get a KB article added with the info. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This applies to these changes, of course.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I back and forthed on this too. Looking at the code, it does seem that failure is normal and expected based on how the code swallows exceptions so I think you're right, debug is the right level for this. I'll get it switched.
server/src/main/java/org/elasticsearch/monitor/process/ProcessProbe.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for this and doing the extra work to future-proof it.
* upstream/main: Add additional logging to make spotting stats issues easier (elastic#133972) [ESQL] Clean up ESQL enrich landing page (elastic#134820) ES|QL: Make kibana docs for Query settings more consistent (elastic#134881) Add file extension metadata to cache miss counter from SharedBlobCacheService (elastic#134374) Add IT for num_reduced_phases with batched query execution (elastic#134312) Remove `SizeValue` (elastic#134871)
…133972) * Add additional logging to make spotting stats issues easier * Refactor logging in JVM monitoring classes for improved error handling * Switch to debug level logging
This came up after trying to diagnose a live issue. Additional logging in this class will make it easier to determine the root cause of stats issues going forward.