@jsuereth I would love to hear your take on this, and your help figuring out how to get a definitive answer and what the next steps are, if any are needed.
I would like to ask whether there are API-level changes that could help with the performance issues directly, to accompany the above answer about optional SDK-implementation features. In early drafts of OTel we discussed "bound" metric instruments, which let the user pre-process a large part of the cost of a metrics operation and avoid repeating that work. Could bound instruments help here? Likewise, could a batch metrics update API help? If the operation took a single attribute set and a batch of instrument values, would that help reduce allocations in your application? These two ideas can be complementary. I could imagine that with the API extensions described above, the SDK could be optimized to avoid the costs you're concerned about.
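To make the "bound" instrument idea concrete, here is a minimal sketch. It is not the OTel Java API: `Counter`, `BoundCounter`, and `bind` are hypothetical names, and a plain `String` stands in for an attribute set. It only illustrates how resolving the series once could take the per-call lookup off the hot path.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: none of these types are part of the OTel Java SDK.
public class BoundCounterSketch {

    /** Unbound counter: every add() must look up the series for the attribute set. */
    static final class Counter {
        // Assumption: a String key stands in for a real attribute set.
        private final Map<String, LongAdder> series = new ConcurrentHashMap<>();

        void add(long value, String attributes) {
            series.computeIfAbsent(attributes, k -> new LongAdder()).add(value);
        }

        /** Hypothetical "bind": resolve the series once and return a handle to it. */
        BoundCounter bind(String attributes) {
            return new BoundCounter(series.computeIfAbsent(attributes, k -> new LongAdder()));
        }

        long sum(String attributes) {
            LongAdder cell = series.get(attributes);
            return cell == null ? 0L : cell.sum();
        }
    }

    /** Bound handle: add() touches the pre-resolved cell directly, with no map lookup. */
    static final class BoundCounter {
        private final LongAdder cell;

        BoundCounter(LongAdder cell) {
            this.cell = cell;
        }

        void add(long value) {
            cell.add(value);
        }
    }

    public static void main(String[] args) {
        Counter messagesIn = new Counter();
        BoundCounter bound = messagesIn.bind("topic=orders"); // lookup happens once, here
        for (int i = 0; i < 1000; i++) {
            bound.add(1); // hot path: no lookup, no allocation
        }
        messagesIn.add(5, "topic=orders"); // the unbound path updates the same series
        System.out.println(messagesIn.sum("topic=orders"));
    }
}
```

Both paths write to the same underlying cell, so a bound handle would be an optimization for hot call sites rather than a separate data model.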
Hi,
I'm working on replacing several metrics libraries with OpenTelemetry Metrics in Apache Pulsar. Since Pulsar is a very latency-sensitive system, a lot of effort has gone into minimizing memory allocations wherever possible. I've read the OTel Metrics Java SDK code, and it seems that under certain conditions (e.g., 100k topics and above) the cost of memory allocations would be severe.
Hence, I've submitted a proposal to enhance the SDK so that it uses a different data structure for relaying metric points: streaming them instead of batching them in an in-memory list, thus keeping memory allocations to a bare minimum. I'll briefly describe the bare-bones suggestion here; the GitHub issue covers it more thoroughly, including the Apache Pulsar context.
We'll add two new interfaces:
We'll use them so the exporter can support streaming as well as batching; you choose the export method at creation time.
The MetricsProducer will be modified to support that:
The specification for MetricsExporter.export(batch) says:
The main idea here is that we use a different data structure to produce and export that batch of metrics.
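To illustrate the difference between the two data structures, here is a minimal sketch. It does not reproduce the interfaces from the proposal: `Point`, `MutablePoint`, `collectBatch`, and `streamPoints` are hypothetical names invented for this example. The batch style allocates one object per point plus a list, while the streaming style pushes every point through a single reused object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: these are not the interfaces from the proposal or the OTel SDK.
public class StreamingVsBatch {

    /** Batch style: one immutable object per metric point. */
    static final class Point {
        final String name;
        final long value;

        Point(String name, long value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Streaming style: one mutable object, overwritten for every point. */
    static final class MutablePoint {
        String name;
        long value;
    }

    /** Builds the whole batch in memory: allocations grow with the number of points. */
    static List<Point> collectBatch(String[] names, long[] values) {
        List<Point> batch = new ArrayList<>(names.length);
        for (int i = 0; i < names.length; i++) {
            batch.add(new Point(names[i], values[i]));
        }
        return batch;
    }

    /** Streams points to a visitor: a single MutablePoint is allocated per collection. */
    static void streamPoints(String[] names, long[] values, Consumer<MutablePoint> visitor) {
        MutablePoint reusable = new MutablePoint();
        for (int i = 0; i < names.length; i++) {
            reusable.name = names[i];
            reusable.value = values[i];
            visitor.accept(reusable); // the visitor must not keep a reference to the point
        }
    }

    public static void main(String[] args) {
        String[] names = {"topic-a.msgIn", "topic-b.msgIn", "topic-c.msgIn"};
        long[] values = {10, 20, 30};

        System.out.println("batch size: " + collectBatch(names, values).size());

        long[] total = {0};
        streamPoints(names, values, p -> total[0] += p.value);
        System.out.println("streamed total: " + total[0]);
    }
}
```

The trade-off in this sketch is that the visitor must finish with each point before the next one arrives, since the same object is overwritten on every iteration.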
The benefit and goal (outlined in detail in the original proposal) is to reduce the CPU and memory overhead that the OTel Java SDK imposes on the host application. I believe this should be the goal of any SDK.
My question is:
Does adding the ability to receive the batch as a stream (a different data structure) comply with the specification, and is it allowed?