-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor & merge JMX gatherer & insight implementations #1362
Comments
Additionally, there are similar agents which does the same thing.
Both of which seems to be more mature than the current OTeL JMX gatherer and insight agents.
|
Thanks @shivanshuraj1333 , I agree with you for Prometheus JMX exporter and Jolokia, those might be more mature but from what I know about them they fit a "pull-based metrics" model as they expose metrics to HTTP/prometheus, here the goal is to fit the "push-based metrics" model where the metrics are sent to the collector, not pulled by it. Also, one of the features of Jolokia is to expose the whole (or parts of) JMX interface, which has lots of security implications, so having a dedicated implementation limited to capturing metrics makes sense. Those two suggestions could really be worth adding as follow-up improvements:
For the wildcards part, do you know if (and how) they are documented in Prometheus ? While it's just a plain mapping of the JMX tree, building a reference of the metrics for building reports might be tricky without knowledge on the implementation. |
I see,
But I think today it's a mixture in OTeL, if I use OTeL metrics gatherer as a receiver in collector it becomes
Though it's not documented but looking at the code the flow (on a very high level), looks like this, - pattern : kafka.server<type=(.+), name=(.+), clientId=(.+), topic=(.+), partition=(.*)><>Value
name: kafka_server_$1_$2
type: GAUGE
labels:
clientId: "$3"
topic: "$4"
partition: "$5" Now when the JMXCollector.java initiates, it captures the rules from pattern, parses the rules into regex and puts it into a rule cache. When the scraper initiates, it queries MBeans names and puts that into jmxMBeanPropertyCache, ref (this basically contains all the MBeans that are collected), now when the scraper processes the recorded bean values, it checks the recorded beans against the rules cache it created above, ref, and puts them in matchedRules cache, which is then used for creating the final metric snapshot. cc'ing @dhoard who added some recent commits, and may have some ideas to share... |
I have one more suggestion: should we consider making the JMX receiver in the collector compatible with any supported JMX agent? This would involve significant work, as the configuration would need to be parsed for each supported agent (e.g., Prometheus, Jolokia, OTeL metrics gatherer). Additionally, the receiver would need to internally convert collected metrics into OTLP format, such as converting Prometheus metrics into Prometheus format. The advantage is that OTeL users could utilize any supported JMX collector. However, the downside is the high maintenance required for the JMX receiver. |
@shivanshuraj1333 that sounds correct. FYI - The next Prometheus JMX Exporter release will support OTel push (along with the current Prometheus pull-based model.) |
Thanks @dhoard for chiming in.
|
High level...
|
I see, that makes sense, probably the merger of OTeL insight and metrics gatherer would have to face the similar challenges, if we plan to support multiple JMX endpoint.
So, if you look at JMX receiver in the collector, it only supports JMX_metrics_gatherer agent from OTeL. If we tweak the config of the collector to support any generic agent, the JMX receiver can be extended to support any other agent (like prometheus agent). This would give some luxury to the end user if they want to use newly supported vendor specific agents (like JMX exporter from prometheus). |
Thanks for the clarification! For an application that has a proprietary metrics API that doesn't implement the Prometheus/OpenMetrics exposition format or OTel metrics push, I could see where it could be used, but I feel writing an application-specific collector is the best approach. |
So if I'm following correctly, we have:
So once prometheus/jmx_exporter#981 is merged we have a potential overlap between OTel JMX gatherer and Prometheus exporter as both can be used to export to OTel, and adding support for Prometheus JMX exporter in OTel collector seems a good addition. The goal here is to merge 1) and 2) implementations so they behave the same way and can capture identical metrics, those two address the need to be able to capture JMX metrics from within and from outside of the JVM. For me, the fact that JMX Gatherer could potentially be replaced by Prometheus JMX exporter is nice beyond the scope of this refactoring. Also, when collecting JMX metrics from outside of the JVM (with OTel Gatherer or Prometheus exporter), it seems that trying to connect to multiple JVMs provides some extra complexity if we have to deal with the RMI security/authentication. Maybe it's worth discussing this in a separate dedicated issue, but is there a very large impact to having a small JVM process for the gatherer/exporter run by the collector per monitored remote JVM ? |
Yes, you are correct!
On the other hand, the JMX receiver on the collector helps in facilitating running the gatherer jar in the collector itself, but should allow running other JARs as well (like JMX_exporter from prometheus). On the collector side, we can create a unified config translation for respective JARs. (and that is what actually it is doing today, but somewhat hardcoded for gatherer).
+1 to this, if we compare today, in all the cases JMX gatherer can be replaced by Prometheus JMX exporter, but not in collector. As a user if I'm running things on scale, I'd prefer using Prometheus JMX exporter, based on current implementation of gatherer. AFAIU, the Prometheus JMX exporter is single threaded, IMO to support metrics collections from multiple JMX servers, if we make it multithreaded where each thread is responsible for corresponding RMI belonging to that particular JMX server, I think it can work without the security concerned raised above. |
Particular to this point, @dhoard WDYT? This is mostly relevant when running as a standalone server and not as agent. |
The Prometheus JMX Exporter...
As a data point, in the past, Maven Central statistics show that the Prometheus JMX Exporter Java agent is primarily being used (by a huge margin) over the standalone exporter. |
As discussed in June 20th Java SIG meeting, we have two distinct ways to capture JMX metrics:
As a consequence, we have:
In addition to the duplication of code, the metrics definition themselves are duplicated in two different incompatible formats (YAML and Groovy scripts), which makes the maintenance even harder and further complicates their evolution. For example trying to change definitions in instrumentation like in opentelemetry-java-instrumentation#11621 becomes more complicated than it should.
JMX gatherer allows to capture JMX metrics from outside of the JVM and is used in the collector-contrib repository. Usages of it can be found by searching for usages of
otel.jmx.service.url
on GH.JMX gatherer is used in the opentelemetry-collector-contrib repository by the
jmxreceiver
Proposal
Removing the user-provided groovy script support is the only breaking change here, the rationale is:
otel.jmx.groovy.script
inopentelemetry-collector-contrib
was removed 2 years ago for security reasons (opentelemetry-collector-contrib#6750 and opentelemetry-collector-contrib#9721)otel.jmx.groovy.script
on GH returns mostly point to old forks of theopentelemetry-collector-contrib
, so we can infer this feature is probably rarely usedAll of this assumes that there are no limitations of the YAML metrics definition to represent the current groovy script definitions. In case of unsolvable incompatibilities we will have to decide on a case-by-case basis.
Implementation steps
The current path to implement this brings a few breaking changes so we decided to create a new "JMX Scraper" component that will provide an alternative to "JMX Gatherer" for an easier migration.
tomcat
with definitions that matchtomcat.groovy
Reuse JMX Insights in Scraper + Tomcat support #1485The text was updated successfully, but these errors were encountered: