Skip to content

Conversation

lukewhiting
Copy link
Contributor

@lukewhiting lukewhiting commented Sep 2, 2025

This came up after trying to diagnose a live issue. Additional logging in this class will make it easier to determine the root cause of stats issues going forward.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds debug and info level logging to the Probes class to improve diagnostics for statistics collection issues. The logging helps track method invocations and failures during OS monitoring probes.

  • Added logger initialization using Log4j
  • Added debug logging at method entry and when method parameter is null
  • Added info level logging for exception handling with method invocation details

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

onError = (String) valueMethod.invoke(onErrorObject);
} catch (Exception ignored) {}
} catch (Exception e) {
logger.info("Error getting JVM info from MX Bean", e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to decide whether I agree with doing this at info level.

Pro: I guess that it's only going to happen once on startup. Or maybe half a dozen times since it seems likely that if one of these fails then they all will. But it's all going to be out of the way on startup, anyway.

Con: These exceptions aren't going to be actionable by the typical user, and if they're okay with getting unknown values because of their JVM limitations then getting a bunch of stack traces logged every time they start up could be pretty noisy.

I think I'd err on the side of making all these debug. If we get an SDH, we know how to turn the debug logging on, and we can also get a KB article added with the info. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This applies to these changes, of course.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I back and forthed on this too. Looking at the code, it does seem that failure is normal and expected based on how the code swallows exceptions so I think you're right, debug is the right level for this. I'll get it switched.

Copy link
Member

@PeteGillinElastic PeteGillinElastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for this and doing the extra work to future-proof it.

@lukewhiting lukewhiting enabled auto-merge (squash) September 17, 2025 10:54
@lukewhiting lukewhiting merged commit f6967fe into elastic:main Sep 17, 2025
34 checks passed
szybia added a commit to szybia/elasticsearch that referenced this pull request Sep 17, 2025
* upstream/main:
  Add additional logging to make spotting stats issues easier (elastic#133972)
  [ESQL] Clean up ESQL enrich landing page (elastic#134820)
  ES|QL: Make kibana docs for Query settings more consistent (elastic#134881)
  Add file extension metadata to cache miss counter from SharedBlobCacheService (elastic#134374)
  Add IT for num_reduced_phases with batched query execution (elastic#134312)
  Remove `SizeValue` (elastic#134871)
gmjehovich pushed a commit to gmjehovich/elasticsearch that referenced this pull request Sep 18, 2025
…133972)

* Add additional logging to make spotting stats issues easier

* Refactor logging in JVM monitoring classes for improved error handling

* Switch to debug level logging
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Data Management/Stats Statistics tracking and retrieval APIs >non-issue Team:Data Management Meta label for data/management team v8.18.8 v8.19.5 v9.0.8 v9.1.5 v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants