Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OTLP Registry] Timestamps are exported without seconds/milliseconds when temporality is DELTA #5041

Closed
joaopgrassi opened this issue May 6, 2024 · 11 comments
Labels
question A user question, probably better suited for StackOverflow

Comments

@joaopgrassi
Copy link

joaopgrassi commented May 6, 2024

Describe the bug

When configuring the OTLP registry with Delta temporality, the exported metrics have timestamps with "full minutes" - no seconds/milliseconds. I believe this is a bug because we are actually losing information when converting OTLP delta metrics.

Example with cumulative:

public static void main(String[] args) {
    OtlpConfig otlpConfig = new OtlpConfig() {
        @Override
        public String get(final String key) {
            return null;
        }
        @Override
        public AggregationTemporality aggregationTemporality() {
            return AggregationTemporality.CUMULATIVE;
        }
    };

    MeterRegistry registry = new OtlpMeterRegistry(otlpConfig, Clock.SYSTEM);
    Metrics.addRegistry(registry);

    Counter counter = registry.counter("page.visitors", "foo", "bar");
    counter.increment(2.0);

    registry.close();
}

Produces this OTLP Data (Collector output)

otel-local-collector-1  | Descriptor:
otel-local-collector-1  |      -> Name: page.visitors
otel-local-collector-1  |      -> Description:
otel-local-collector-1  |      -> Unit:
otel-local-collector-1  |      -> DataType: Sum
otel-local-collector-1  |      -> IsMonotonic: true
otel-local-collector-1  |      -> AggregationTemporality: Cumulative
otel-local-collector-1  | NumberDataPoints #0
otel-local-collector-1  | Data point attributes:
otel-local-collector-1  |      -> foo: Str(bar)
otel-local-collector-1  | StartTimestamp: 2024-05-06 09:40:39.192 +0000 UTC
otel-local-collector-1  | Timestamp: 2024-05-06 09:40:39.213 +0000 UTC
otel-local-collector-1  | Value: 2.000000

Example with Delta:

public static void main(String[] args) {
    OtlpConfig otlpConfig = new OtlpConfig() {
        @Override
        public String get(final String key) {
            return null;
        }
        @Override
        public AggregationTemporality aggregationTemporality() {
            return AggregationTemporality.DELTA;
        }
    };

    MeterRegistry registry = new OtlpMeterRegistry(otlpConfig, Clock.SYSTEM);
    Metrics.addRegistry(registry);

    Counter counter = registry.counter("page.visitors", "foo", "bar");
    counter.increment(2.0);

    registry.close();
}

Produces this OTLP Data (Collector output) - Seconds are all 0s

otel-local-collector-1  | Descriptor:
otel-local-collector-1  |      -> Name: page.visitors
otel-local-collector-1  |      -> Description:
otel-local-collector-1  |      -> Unit:
otel-local-collector-1  |      -> DataType: Sum
otel-local-collector-1  |      -> IsMonotonic: true
otel-local-collector-1  |      -> AggregationTemporality: Delta
otel-local-collector-1  | NumberDataPoints #0
otel-local-collector-1  | Data point attributes:
otel-local-collector-1  |      -> foo: Str(bar)
otel-local-collector-1  | StartTimestamp: 2024-05-06 09:42:00 +0000 UTC
otel-local-collector-1  | Timestamp: 2024-05-06 09:43:00 +0000 UTC
otel-local-collector-1  | Value: 2.000000

Environment

  • Micrometer version: 1.12.5
  • Micrometer registry: OTLP
  • OS: Ubuntu
  • Java version: openjdk 11.0.11 2021-04-20

To Reproduce
Example app code as above

Expected behavior
When configured with Delta timestamps should also contain seconds and milliseconds

Additional context

I looked at the code, and I noticed that in this line https://github.com/micrometer-metrics/micrometer/blob/main/implementations/micrometer-registry-otlp/src/main/java/io/micrometer/registry/otlp/OtlpMetricConverter.java#L67

 this.deltaTimeUnixNano = (clock.wallTime() / step.toMillis()) * step.toNanos();

It divides the wall time by the step (which is 1 min by default I believe), which probably is causing the "truncation" of the timestamps. Was there any reason why we want the timestamp for delta to align with the export interval? I imagine we could just take the wall time and transform it to nanoseconds?

@lenin-jaganathan
Copy link
Contributor

lenin-jaganathan commented May 6, 2024

This is expected and yes it is by design. The OtlpMeterRegistry in delta mode uses StepMeter implementation which collects data for the step duration. (Gauge should be a exception for this but I see that is not the case for which I will file a follow-up issue)
E.g: (time is in format HH:MM:SS)

  • If the step is 1 minute 60s, the data will be collected for 00:00:00 to 00:01:00, 00:01:00-00:02:00
  • If the step is 10 seconds, the data will be collected for 00:00:00 to 00:00:10, 00:00:10-00:00:20
  • If the step is 500ms, the data will be collected for 00:00:00.0 to 00:00:00.5, 00:00:00.5 to 00:00:01.0

Are you saying that when your step is finer and still not getting ms level accuracy?

Was there any reason why we want the timestamp for delta to align with the export interval?

Yes. This is particularly helpful in a distributed environment where 'N' different instances send data to the back-end. When plotting the charts, if all the data is for the same duration(i,e interval), it becomes straightforward to make correlations. As opposed if they are randomized, the data correlation can be off upto 2 step lengths.

@jonatan-ivanov jonatan-ivanov added waiting for feedback We need additional information before we can continue and removed waiting-for-triage labels May 6, 2024
@pirgeo
Copy link
Contributor

pirgeo commented May 7, 2024

As opposed if they are randomized, the data correlation can be off upto 2 step lengths.

Isn't that always the case with every StepMeterRegistry?

@lenin-jaganathan
Copy link
Contributor

Isn't that always the case with every StepMeterRegistry?

No. The micrometer StepMeterRegistries are always aligned with the step.

@pirgeo
Copy link
Contributor

pirgeo commented May 7, 2024

No. The micrometer StepMeterRegistries are always aligned with the step.

I might be missing something very obvious. The reason the timestamps are truncated in the OTLP registry is so that different instances of Micrometer running in different places always align to the same "time buckets". This is not automatically the case, since the Micrometer instances start at different times. Then they export at the same interval, but these intervals are very unlikely to align, especially if you have multiple instances. This is what this calculation addresses, right? Wouldn't that problem be the same if you used any other StepMeterRegistry? Is there a reason this is worse for the OTLP registry?

@lenin-jaganathan
Copy link
Contributor

lenin-jaganathan commented May 7, 2024

Maybe I was a bit unclear with my explanations.

What I meant is - All the step registries exhibit the same behavior i,e they collect data for the same step window (let's say the step is 1 minute, every instance would collect data from the start of any minute to the end of any minute). Of course, the exceptions are app start and end minutes where this will be an incomplete window.

The collected data would be exported randomly over the course of the next step(minute in this case). This is to avoid all the instances flooding the backend with requests exactly at the same moment(typically at the start of the minute). But the data would be exported would be for a full step and the timestamps would give you that information. OTLP has 2 fields for timestamp, start time, and timeUnix which are used to represent this window. Maybe the confusion arose because the the data for 00:00:00-00:00:01 was not sent immediately at 00:00:01. Rather it is sent between 00:00:01-00:00:02, but this is more of when the data is exported.

Related issues,

@pirgeo
Copy link
Contributor

pirgeo commented May 7, 2024

I think there are a couple of things that confuse me here. Let me try to clarify how I understand it works and then correct what I misunderstand:

  1. The app starts. The MeterRegistry is instantiated. The ThreadFactory is started with the step parameter (lets say 1minute for this example). This is not aligned with any timestamp, let's say the app starts at 00:00:15. Lets also say the initial delay is randomly set at 15s, so the first export will happen at 00:00:15 + 15s initial delay + 1m step = 00:01:30. Thereafter, data will be exported every minute, so at 00:02:30, 00:03:30, 00:04:30 and so on.
  2. The user uses the Micrometer API to record some data. Let's say they record every 10s and for simplicity lets say it happens on the dot, so 00:00:10, 00:00:20, 00:00:30, etc.
  3. The first export starts at 00:01:30. It will include all the data recorded in the timeframe [00:00:00, 00:01:30]. It will then put all of that data into an OTLP message with timestamps startTime=00:00:00; time=00:01:00.
  4. Data continues to be recorded (at 00:01:40, 00:01:50, etc.)
  5. The second export starts. It will contain all data from ]00:01:30, 00:02:30]. It will export that data in an OTLP message with timestamps startTime=00:01:00; time=00:02:00.

I think the confusion stems from the fact that to force the alignment to a certain time bucket in the export structure (i.e. the fact that the OTLP message timestamps are always truncated by using the step size) you must force the export to happen on that same schedule. (That would defeat the purpose of spacing them out to reduce load, because all exports for all instances would happen at 00:01:30, 00:02:30 and so on)

What I would expect to happen is the following:

  1. Same as above, exports will happen at 00:01:30, 00:02:30, 00:03:30 and so on.
  2. As above, data will be recorded at 00:00:10, 00:00:20, 00:00:30, and so on.
  3. The first export starts. It will contain data from 00:00:00 to 00:01:30. timestamps: startTime=00:00:00 time=00:01:30
  4. Data continues to be recorded, as above.
  5. The second export starts. It will contain data from 00:01:30 to 00:02:30. timestamps: startTime=00:01:30 time=00:02:30

Now, the timestamps actually align with when the data was collected. This gives the backend a chance to determine how to book that data. Today, this decision is made on the client side, and there is no way for the backend to know that the data that arrives with certain timestamps doesn't actually come from that timeframe.

I understand that this means that the start/end timestamps will differ for each running app instance. However, the timestamps created today don't actually represent when the data was collected unless I am missing the step where the data recorded in Micrometer is actually aligned to the bucket boundaries that are exported in the OTLP message. Recordings are put into the timestamp range, even if they were not recorded in that timestamp range (i.e. the data recorded at 00:02:10 will be put into the 00:01:00 to 00:02:00 bucket).

@lenin-jaganathan
Copy link
Contributor

You are partially right. The publishing part is more or less what you said. But the key differences are as follows (I am assuming that the initial calculated delay is 30 seconds as mentioned in your example),

The first export starts at 00:01:30. It will include all the data recorded in the timeframe [00:00:00, 00:01:30]. It will then put all of that data into an OTLP message with timestamps startTime=00:00:00; time=00:01:00.

The first export will only contain data for (00:00:00, 00:01:00] with timestamps startTime=00:00:00; time=00:01:00. (But the export starts at 00:01:30)

The second export starts. It will contain all data from ]00:01:30, 00:02:30]. It will export that data in an OTLP message with timestamps startTime=00:01:00; time=00:02:00.

The second export will only contain data for (00:01:00, 00:02:00] and it will export with timestamp startTime=00:01:00; time=00:02:00. (But the export starts at 00:02:30)

Hope the above example makes sense. Now this addresses all the problems discussed here,

  • All apps are reporting data for the same window (start of step to end).
  • The data is sent to the backends in a more randomized manner from multiple instances.
  • The back end gets the exact time window of the data since the start and end timestamps are set accordingly.

@pirgeo
Copy link
Contributor

pirgeo commented May 7, 2024

I don't understand how it is ensured that the value that is polled from the Micrometer instrument at 00:01:30 only contains data for 00:00:00 to 00:01:00.

@lenin-jaganathan
Copy link
Contributor

Got it. This is done by a polling service in StepMeterRegistry (for OTLP this is slightly different on how histograms are handled but they follow the same concept).

There might be a skew of a few initial milliseconds based on the number of meters in the registry but this will be very very minimal in the order of a few microseconds per meter.

@pirgeo
Copy link
Contributor

pirgeo commented May 7, 2024

Ah, there is a completely separate background thread and loop here; I missed that. So, at the beginning of each 1-minute bucket (e.g., at 00:01:00 + 1ms), the pollMetersToRollover method is called, which resets the StepMeters. And the export loop gets the values for the last completed interval because it uses poll which returns the value for the last completed interval.

And the StepValue also separately calculates the beginning of the export interval when it is initialized.

Thanks for explaining that; it wasn't straightforward for me.

@joaopgrassi
Copy link
Author

Ok it seems then it is all working as expected. Feel free to close this issue then. Thanks!

@jonatan-ivanov jonatan-ivanov added question A user question, probably better suited for StackOverflow and removed waiting for feedback We need additional information before we can continue labels May 7, 2024
@jonatan-ivanov jonatan-ivanov closed this as not planned Won't fix, can't repro, duplicate, stale May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question A user question, probably better suited for StackOverflow
Projects
None yet
Development

No branches or pull requests

4 participants