opentelemetry: JSON Metrics and Exp Histograms #11456

Merged
edsiper merged 16 commits into master from opentelemetry-metrics-json
Feb 16, 2026

@edsiper edsiper commented Feb 12, 2026

  • Add internal opentelemetry interface to support JSON metrics
  • in_opentelemetry: cleanup and refactor handling of signals
  • CMetrics v2.0.0:
    • Implemented typed scalar storage for samples (double, int64, uint64) to avoid precision loss for large integers.
    • MsgPack ABI remains backward-compatible:
      • legacy value still supported
      • new typed fields added for exact integer preservation.
    • Added native exponential histogram support across cmetrics core paths and encoders/decoders.
    • OTLP exponential histogram coverage improved end-to-end (decode/encode and roundtrip tests).
    • Text/debug output now includes richer exponential histogram details and exemplars.
    • OTLP metadata handling expanded (resource/scope containers and per-point metadata propagation).
    • Fixed MsgPack decoder hardening issues (bounds/structure checks) for malformed summary/histogram arrays.
    • Added standalone MsgPack ABI/security tests plus expanded OTLP roundtrip tests.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Exponential histogram support added across encoders/exporters, decoding, and OTLP paths.
    • OpenTelemetry: improved content-type detection, richer OTLP JSON decoding with metadata, exemplars, and JSON-to-metric conversion utilities.
    • Metrics: explicit value types (double, int64, uint64) and atomic snapshot access.
  • Tests

    • New comprehensive exponential-histogram, OpenTelemetry metrics, and MessagePack ABI test suites and test data.
  • Chores

    • CMetrics version bumped to 2.0.

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
…ogs/traces

  - Added shared content-type/payload helpers in plugins/in_opentelemetry/opentelemetry_utils.{h,c}:
      - opentelemetry_is_json_content_type
      - opentelemetry_is_protobuf_content_type
      - opentelemetry_is_grpc_content_type
      - opentelemetry_payload_starts_with_json_object
  - Switched logs/traces/metrics paths to use shared helpers:
      - plugins/in_opentelemetry/opentelemetry_logs.c
      - plugins/in_opentelemetry/opentelemetry_traces.c
      - plugins/in_opentelemetry/opentelemetry_prot.c
  - Removed duplicated local gRPC content-type checker from opentelemetry_prot.c.
  - Improved coherence across signals:
      - unified JSON/protobuf/gRPC detection
      - accepts content-type params (e.g. application/json; charset=utf-8)
      - consistent JSON payload prefix validation for logs/traces.

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
@edsiper edsiper requested a review from cosmo0920 as a code owner February 12, 2026 14:09
@edsiper edsiper added this to the Fluent Bit v5.0 milestone Feb 12, 2026
@edsiper edsiper changed the title from "opentelemetry: add JSON Metrics support (plus others)" to "opentelemetry: JSON Metrics and Exp Histograms" Feb 12, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 088238d202

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>

coderabbitai bot commented Feb 12, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds Exponential Histogram support across cmetrics (new type, storage, encode/decode, merge/filter), typed metric value APIs, OTLP JSON→CMT and msgpack→CFL conversion utilities, centralized OpenTelemetry content-type predicates, many tests and test-data, and a cmetrics version bump.

Changes

| Cohort | File(s) | Summary |
| Core exp histogram implementation | lib/cmetrics/include/cmetrics/cmt_exp_histogram.h, lib/cmetrics/src/cmt_exp_histogram.c | New public type and implementation for exponential histograms: create, populate (set_default), destroy, and convert-to-explicit-buckets. |
| Public API & core struct updates | lib/cmetrics/include/cmetrics/cmetrics.h, lib/cmetrics/include/cmetrics/cmt_cat.h, lib/cmetrics/include/cmetrics/cmt_metric.h, lib/cmetrics/include/cmetrics/cmt_decode_msgpack.h | Added CMT_EXP_HISTOGRAM, forward decl for cmt_exp_histogram, new cmt fields, value-type enum/fields, and prototype for cmt_cat_exp_histogram. |
| Metric typed-value accessors | lib/cmetrics/src/cmt_metric.c, lib/cmetrics/include/cmetrics/cmt_metric.h | New setters/getters/snapshot for double/int64/uint64; internal value-type tracking; changed metric_exchange signature. |
| MsgPack & OTLP decoders | lib/cmetrics/src/cmt_decode_msgpack.c, lib/cmetrics/src/cmt_decode_opentelemetry.c | Decode support for EXP_HISTOGRAM, typed-value unpacking (int64/uint64), per-point metadata/exemplars, stricter validation and error handling. |
| Encoders (all formats) | lib/cmetrics/src/cmt_encode_msgpack.c, lib/cmetrics/src/cmt_encode_prometheus.c, lib/cmetrics/src/cmt_encode_influx.c, lib/cmetrics/src/cmt_encode_cloudwatch_emf.c, lib/cmetrics/src/cmt_encode_splunk_hec.c, lib/cmetrics/src/cmt_encode_prometheus_remote_write.c, lib/cmetrics/src/cmt_encode_opentelemetry.c, lib/cmetrics/src/cmt_encode_text.c | Emit EXP_HISTOGRAMs (native or converted to explicit histogram), include typed values when present, serialize exemplars/metadata; added cleanup and conversion helpers. |
| Merge/filter/map/utils | lib/cmetrics/src/cmt_cat.c, lib/cmetrics/src/cmt_filter.c, lib/cmetrics/src/cmt_map.c, lib/cmetrics/include/cmetrics/cmt_mpack_utils.h, lib/cmetrics/src/cmt_mpack_utils.c, lib/cmetrics/include/cmetrics/cmt_variant_utils.h | New cmt_cat_exp_histogram and helpers; filter/concat now handle exp_histograms; msgpack int-tag consumer and variant support for uint64/null; memory cleanup updates. |
| cmetrics build & lifecycle | lib/cmetrics/CMakeLists.txt, lib/cmetrics/src/CMakeLists.txt, lib/cmetrics/src/cmetrics.c | CMT version bump, add cmt_exp_histogram.c to build, conditional -lm linkage for non-MSVC, init/destroy of exp_histograms list. |
| OTLP JSON/msgpack helpers | include/fluent-bit/flb_opentelemetry.h, src/opentelemetry/flb_opentelemetry_utils.c, src/CMakeLists.txt | New header include and declarations plus implementation for recursive JSON/msgpack→CFL-variant conversion and flb_opentelemetry_metrics_json_to_cmt + attribute cloning helpers. |
| OpenTelemetry plugin content-type helpers | plugins/in_opentelemetry/opentelemetry_utils.h, plugins/in_opentelemetry/opentelemetry_utils.c, plugins/in_opentelemetry/opentelemetry_prot.c, plugins/in_opentelemetry/opentelemetry_logs.c, plugins/in_opentelemetry/opentelemetry_traces.c | Centralized content-type predicates (JSON/Protobuf/gRPC), payload JSON-object detector, and routing of JSON metrics to new JSON→CMT converter; updated call-sites to pass content_type/session. |
| Tests & test-data | lib/cmetrics/tests/*, tests/internal/*, tests/internal/data/opentelemetry/metrics.json | Large new/updated tests: exp_histogram unit tests, msgpack ABI tests, comprehensive OpenTelemetry metrics tests and JSON test-cases; test list additions and small expected-payload adjustments. |
| Small wiring & headers | include/fluent-bit/flb_opentelemetry.h, plugins/in_opentelemetry/..., src/... | Header include additions and minor signature changes (e.g., process_payload_metrics now takes content_type/session), plus new prototypes for OTLP JSON utilities. |

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client
participant In_OTLP_Plugin as Plugin
participant flb_opentelemetry_utils as flb_utils
participant cmetrics as CMetrics
Client->>Plugin: POST metrics (body + Content-Type)
Plugin->>Plugin: detect content-type (JSON / protobuf / gRPC)
alt JSON
Plugin->>flb_utils: flb_opentelemetry_metrics_json_to_cmt(body, len)
flb_utils->>flb_utils: parse JSON → msgpack/CFL variants, clone attributes
flb_utils->>CMetrics: populate cmt contexts/maps/metrics (incl. exp_histograms, metadata)
else Protobuf/gRPC
Plugin->>CMetrics: cmt_decode_opentelemetry_create(protobuf payload)
end
CMetrics-->>Plugin: success / error
Plugin-->>Client: HTTP response

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested reviewers

  • cosmo0920
  • koleini
  • pwhelan

Poem

🐰 I hopped through buckets, scale, and sum,
I stitched JSON to metrics, one by one,
From plugin to CMT the pipelines hum,
New types, tests, and code — a joyous run,
Carrots for CI, the patch is done! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 4.02% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Merge Conflict Detection ⚠️ Warning ❌ Merge conflicts detected (41 files):

⚔️ cmake/libbacktrace.cmake (content)
⚔️ include/fluent-bit/flb_opentelemetry.h (content)
⚔️ lib/cmetrics/CMakeLists.txt (content)
⚔️ lib/cmetrics/include/cmetrics/cmetrics.h (content)
⚔️ lib/cmetrics/include/cmetrics/cmt_cat.h (content)
⚔️ lib/cmetrics/include/cmetrics/cmt_decode_msgpack.h (content)
⚔️ lib/cmetrics/include/cmetrics/cmt_encode_opentelemetry.h (content)
⚔️ lib/cmetrics/include/cmetrics/cmt_metric.h (content)
⚔️ lib/cmetrics/include/cmetrics/cmt_mpack_utils.h (content)
⚔️ lib/cmetrics/include/cmetrics/cmt_variant_utils.h (content)
⚔️ lib/cmetrics/src/CMakeLists.txt (content)
⚔️ lib/cmetrics/src/cmetrics.c (content)
⚔️ lib/cmetrics/src/cmt_cat.c (content)
⚔️ lib/cmetrics/src/cmt_decode_msgpack.c (content)
⚔️ lib/cmetrics/src/cmt_decode_opentelemetry.c (content)
⚔️ lib/cmetrics/src/cmt_encode_cloudwatch_emf.c (content)
⚔️ lib/cmetrics/src/cmt_encode_influx.c (content)
⚔️ lib/cmetrics/src/cmt_encode_msgpack.c (content)
⚔️ lib/cmetrics/src/cmt_encode_opentelemetry.c (content)
⚔️ lib/cmetrics/src/cmt_encode_prometheus.c (content)
⚔️ lib/cmetrics/src/cmt_encode_prometheus_remote_write.c (content)
⚔️ lib/cmetrics/src/cmt_encode_splunk_hec.c (content)
⚔️ lib/cmetrics/src/cmt_encode_text.c (content)
⚔️ lib/cmetrics/src/cmt_filter.c (content)
⚔️ lib/cmetrics/src/cmt_map.c (content)
⚔️ lib/cmetrics/src/cmt_metric.c (content)
⚔️ lib/cmetrics/src/cmt_mpack_utils.c (content)
⚔️ lib/cmetrics/tests/CMakeLists.txt (content)
⚔️ lib/cmetrics/tests/decoding.c (content)
⚔️ lib/cmetrics/tests/encoding.c (content)
⚔️ plugins/in_opentelemetry/opentelemetry_logs.c (content)
⚔️ plugins/in_opentelemetry/opentelemetry_prot.c (content)
⚔️ plugins/in_opentelemetry/opentelemetry_traces.c (content)
⚔️ plugins/in_opentelemetry/opentelemetry_utils.c (content)
⚔️ plugins/in_opentelemetry/opentelemetry_utils.h (content)
⚔️ src/CMakeLists.txt (content)
⚔️ src/flb_processor.c (content)
⚔️ src/opentelemetry/flb_opentelemetry_utils.c (content)
⚔️ tests/internal/CMakeLists.txt (content)
⚔️ tests/internal/opentelemetry.c (content)
⚔️ tests/runtime_shell/processor_conditional.sh (content)

These conflicts must be resolved before merging into master.
Resolve conflicts locally and push changes to this branch.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main changes: introducing JSON metrics support and exponential histogram capabilities for OpenTelemetry.

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
tests/internal/opentelemetry.c (1)

1023-1026: Redundant double-call of conversion helpers.

Throughout run_metrics_case, test_msgpack_object_to_double (and _to_u64) is called once inside TEST_CHECK and then again in the if guard — e.g., lines 1023–1026, 1082–1085, 1172–1175, etc. The conversion runs twice for every validated field. Save the return value once and reuse it:

Example fix
-                        TEST_CHECK(test_msgpack_object_to_double(field_obj,
-                                                                 &expected_value) == 0);
-                        if (test_msgpack_object_to_double(field_obj,
-                                                          &expected_value) == 0) {
+                        ret = test_msgpack_object_to_double(field_obj,
+                                                            &expected_value);
+                        TEST_CHECK(ret == 0);
+                        if (ret == 0) {

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
…ence

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
lib/cmetrics/src/cmt_encode_splunk_hec.c (1)

20-28: ⚠️ Potential issue | 🟠 Major

Fix exp histogram count conversion before formatting.

metric->hist_count stores a double bit-pattern. Assigning metric->exp_hist_count directly means cmt_metric_hist_get_count_value() will decode garbage (it treats the bits as a double). Convert the count to the expected double bit representation before formatting.

🐛 Proposed fix (convert count to double-bit representation)
@@
-#include <cmetrics/cmt_encode_splunk_hec.h>
+#include <cmetrics/cmt_encode_splunk_hec.h>
+#include <cmetrics/cmt_math.h>
@@
-        metric->hist_count = metric->exp_hist_count;
+        metric->hist_count = cmt_math_d64_to_uint64((double) metric->exp_hist_count);

Also applies to: 569-605

lib/cmetrics/src/cmt_encode_opentelemetry.c (1)

38-57: ⚠️ Potential issue | 🟠 Major

Missing type check on variant before accessing as_kvlist.

The fetch_metadata_kvlist_key in cmt_encode_text.c (line 60) validates entry_variant->type != CFL_VARIANT_KVLIST before accessing the kvlist data. This version skips that check, so a non-kvlist variant will be interpreted as a kvlist pointer — likely causing a crash or silent corruption.

Proposed fix
     entry_variant = cfl_kvlist_fetch(kvlist, key);

-    if (entry_variant != NULL) {
+    if (entry_variant != NULL && entry_variant->type == CFL_VARIANT_KVLIST) {
         entry_kvlist = entry_variant->data.as_kvlist;
     }
     else {
         entry_kvlist = NULL;
     }
🤖 Fix all issues with AI agents
In `@lib/cmetrics/src/cmt_encode_opentelemetry.c`:
- Around line 410-430: The loop populates exemplars[] with possible NULL holes
because skipped entries use continue but *out_count is set to
array->entry_count; change the loop in the function that builds exemplars (where
create_exemplar_from_kvlist is called) to use a separate write index (e.g.,
dest_idx) so you only store created exemplars compactly, increment dest_idx on
success, and after the loop set *out_exemplars = exemplars and *out_count =
dest_idx; also ensure the existing failure cleanup still frees any allocated
exemplars up to dest_idx and frees exemplars on error.
- Around line 521-524: The block that resets number_data_point->value_case and
overwrites number_data_point->as_double should be removed so we don't clobber
the value previously set (particularly the UINT64 > INT64_MAX path which
deliberately set as_double from cmt_metric_get_uint64_value); locate the check
using number_data_point->value_case ==
OPENTELEMETRY__PROTO__METRICS__V1__NUMBER_DATA_POINT__VALUE_AS_DOUBLE and delete
the statements that reassign value_case and number_data_point->as_double (or
otherwise guard them so they don't overwrite an existing as_double set earlier
in the same function such as in the code around cmt_metric_get_uint64_value and
cmt_metric_get_value).
- Around line 3356-3412: The current loop over context->scope_metrics_list
creates identical metric objects for every scope (duplicating metrics across
scopes); instead create each metric once and attach it only to its originating
scope. Modify the code around the scope_index loop in pack_basic_type (the block
that calls initialize_metric, apply_metric_metadata_from_otlp_context,
append_sample_to_metric, and append_metric_to_scope_metrics) so that you
determine the correct target scope (from map or sample metadata) and either: (a)
build the metric once outside the scope loop and call
append_metric_to_scope_metrics only for the matching scope, or (b) guard the
per-scope creation with a check that the current scope_index equals the metric’s
source scope before creating/adding the metric; use initialize_metric,
apply_metric_metadata_from_otlp_context, append_sample_to_metric, and
append_metric_to_scope_metrics to locate and update the exact spots to implement
this filter.

In `@lib/cmetrics/src/cmt_exp_histogram.c`:
- Around line 125-155: The current logic frees metric->exp_hist_positive_buckets
and metric->exp_hist_negative_buckets early and may leave metric with new
positive buckets but no negative buckets on allocation failure; change to
allocate temporary buffers first (e.g., temp_pos =
calloc(positive_bucket_count,...), temp_neg =
calloc(negative_bucket_count,...)), check both allocations succeed, memcpy into
temporaries, then free the old
metric->exp_hist_positive_buckets/exp_hist_negative_buckets and assign
metric->exp_hist_positive_buckets = temp_pos and
metric->exp_hist_negative_buckets = temp_neg and update
metric->exp_hist_positive_count and metric->exp_hist_negative_count (and any
related scalar fields) only after the swap; ensure you free any allocated
temporaries on error before returning -1 so the metric remains consistent.

In `@lib/cmetrics/src/cmt_metric.c`:
- Around line 60-106: The value_type field is currently stored before the typed
value fields which can make readers (via cmt_metric_get_value_snapshot) see the
new type with stale values; for each setter (cmt_metric_set_double,
cmt_metric_set_int64, cmt_metric_set_uint64 and any add() implementation),
reorder the atomic stores so you store val/val_int64/val_uint64/timestamp first
and perform cmt_atomic_store(&metric->value_type, CMT_METRIC_VALUE_...) as the
last store to ensure snapshots are consistent and type acts as the stable
version marker.

In `@lib/cmetrics/tests/exp_histogram.c`:
- Around line 373-477: The variables encoded_cloudwatch_emf,
encoded_remote_write, and encoded_opentelemetry are not initialized and their
corresponding destroy functions (cmt_encode_cloudwatch_emf_destroy,
cmt_encode_prometheus_remote_write_destroy, cmt_encode_opentelemetry_destroy)
are called unconditionally; initialize these pointers to NULL before calling
cmt_encode_cloudwatch_emf_create, cmt_encode_prometheus_remote_write_create, and
cmt_encode_opentelemetry_create, and only call their destroy functions if the
pointer is non-NULL (i.e., guard each destroy with a NULL-check) to avoid
freeing uninitialized/invalid memory.

In `@plugins/in_opentelemetry/opentelemetry_utils.c`:
- Around line 157-228: The code reads content_type[expected_length] and
content_type[16] without ensuring the string is long enough; update
opentelemetry_content_type_matches to compute size_t content_len =
strlen(content_type) and return FLB_FALSE if content_len < expected_length
before accessing content_type[expected_length], and update
opentelemetry_is_grpc_content_type to check content_len = strlen(content_type)
and return FLB_FALSE if content_len < 16 before accessing content_type[16]; keep
existing strncasecmp checks but perform the length checks first to avoid
out-of-bounds reads.

In `@src/opentelemetry/flb_opentelemetry_utils.c`:
- Around line 624-647: The function parse_u64_from_msgpack_object currently
treats non-numeric strings and negative values as success (producing 0), so
update parse_u64_from_msgpack_object to explicitly reject negative integers and
invalid string conversions: for MSGPACK_OBJECT_NEGATIVE_INTEGER immediately
return error when object->via.i64 < 0; for MSGPACK_OBJECT_STR call
flb_otel_utils_convert_string_number_to_u64 but validate the result (ensure the
input string is numeric, not prefixed with '-', and the conversion indicates
success) and return error on failure instead of unconditionally writing 0; apply
the same defensive validation logic to the analogous conversion block in the
other function that uses flb_otel_utils_convert_string_number_to_u64.
🧹 Nitpick comments (7)
tests/internal/data/opentelemetry/metrics.json (1)

864-910: Misleading invalid_ prefix on test cases that expect successful decoding.

invalid_exponential_histogram_count_mismatch (line 864) and invalid_exponential_histogram_inconsistent_layout (line 911) are prefixed with invalid_ but both have "expected" blocks (i.e., they decode successfully). This could confuse future maintainers into thinking these should produce errors. Consider renaming to a prefix like valid_exponential_histogram_count_mismatch or quirky_ / lenient_ to indicate they are accepted despite being semantically odd.

lib/cmetrics/include/cmetrics/cmt_variant_utils.h (1)

513-532: Pre-existing asymmetry: unpack_cfl_variant_uint64 creates an INT64 variant from unsigned data.

Not introduced by this PR, but now more relevant since pack_cfl_variant correctly handles CFL_VARIANT_UINT → mpack_write_u64. The unpack path converts mpack_type_uint back to CFL_VARIANT_INT (via cfl_variant_create_from_int64 with a cast), so a round-trip of a CFL_VARIANT_UINT value > INT64_MAX will silently corrupt data. Worth addressing in a follow-up.

lib/cmetrics/include/cmetrics/cmt_metric.h (1)

34-37: val_int64 declared as uint64_t is intentional but warrants a brief comment.

All three fields (val, val_int64, val_uint64) use uint64_t for uniform atomic operations, but val_int64 storing an int64 bit pattern in a uint64 container is non-obvious. A one-line comment would help future readers.

📝 Suggested comment
     /* counters and gauges */
     uint64_t val;
     uint64_t value_type;
-    uint64_t val_int64;
+    uint64_t val_int64;   /* int64 value stored as uint64 for atomic ops */
     uint64_t val_uint64;
lib/cmetrics/src/cmt_encode_prometheus_remote_write.c (1)

892-909: Conversion failure mapped to ALLOCATION_ERROR regardless of actual cause.

cmt_exp_histogram_to_explicit can fail for non-allocation reasons (e.g., non-finite base value, base <= 1.0). Mapping all failures to CMT_ENCODE_PROMETHEUS_REMOTE_WRITE_ALLOCATION_ERROR obscures the real cause. Consider introducing a more general error code or at least adding a comment.

lib/cmetrics/src/cmt_encode_prometheus.c (1)

470-520: Consider extracting a helper for exp‑histogram → explicit conversion.

The conversion logic is duplicated for static and dynamic metric paths; factoring it into a small helper would reduce drift risk and simplify future changes.

Also applies to: 546-592

lib/cmetrics/src/cmt_decode_opentelemetry.c (1)

1314-1547: Consider extracting a helper for repeated metric‑metadata insertion.

The metric->n_metadata cloning block is repeated for each metric type; a small helper would reduce duplication and keep behavior consistent.

lib/cmetrics/src/cmt_decode_msgpack.c (1)

820-832: Missing bounds check in exp histogram per-bucket callbacks.

unpack_exp_histogram_positive_bucket and unpack_exp_histogram_negative_bucket index directly into the allocated bucket arrays without validating index < exp_hist_positive_count / exp_hist_negative_count. While the bounds are implicitly safe (both the allocator and the iterator use the same peeked array length), the regular histogram bucket callback at line 699–702 performs explicit bounds checking for defense-in-depth. Adding similar guards here would be consistent and protect against future refactoring.

Proposed fix
 static int unpack_exp_histogram_positive_bucket(mpack_reader_t *reader, size_t index, void *context)
 {
     struct cmt_msgpack_decode_context *decode_context;
     decode_context = (struct cmt_msgpack_decode_context *) context;
+
+    if (decode_context->metric == NULL ||
+        decode_context->metric->exp_hist_positive_buckets == NULL ||
+        index >= decode_context->metric->exp_hist_positive_count) {
+        return CMT_DECODE_MSGPACK_CORRUPT_INPUT_DATA_ERROR;
+    }
+
     return cmt_mpack_consume_uint_tag(reader, &decode_context->metric->exp_hist_positive_buckets[index]);
 }

 static int unpack_exp_histogram_negative_bucket(mpack_reader_t *reader, size_t index, void *context)
 {
     struct cmt_msgpack_decode_context *decode_context;
     decode_context = (struct cmt_msgpack_decode_context *) context;
+
+    if (decode_context->metric == NULL ||
+        decode_context->metric->exp_hist_negative_buckets == NULL ||
+        index >= decode_context->metric->exp_hist_negative_count) {
+        return CMT_DECODE_MSGPACK_CORRUPT_INPUT_DATA_ERROR;
+    }
+
     return cmt_mpack_consume_uint_tag(reader, &decode_context->metric->exp_hist_negative_buckets[index]);
 }


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/cmetrics/src/cmt_encode_prometheus_remote_write.c (1)

988-999: ⚠️ Potential issue | 🔴 Critical

Pre-existing null-pointer dereference if create_label returns NULL.

If create_label(NULL) fails, result is set to the error code but cfl_list_add(&dummy_label->_head, ...) on line 998 still executes with a NULL dummy_label. This is a pre-existing issue (same pattern exists for the HISTOGRAM and SUMMARY paths), not introduced by this PR, but the new EXP_HISTOGRAM code path inherits it.

🐛 Suggested fix
                 dummy_label = create_label(NULL);

                 if (dummy_label == NULL) {
                     result = CMT_ENCODE_PROMETHEUS_REMOTE_WRITE_ALLOCATION_ERROR;
+                    break;
                 }

                 cfl_list_add(&dummy_label->_head, &metric->labels);
🤖 Fix all issues with AI agents
In `@tests/internal/opentelemetry.c`:
- Around line 1359-1361: The variable ret is reused after calling
flb_opentelemetry_metrics_json_to_cmt, which causes
destroy_metrics_context_list(&context_list) to be skipped if later validation
calls (test_extract_label_values, cmt_gauge_get_val, cmt_counter_get_val, etc.)
modify ret; fix this by capturing the decode outcome immediately after
flb_opentelemetry_metrics_json_to_cmt (e.g., int decode_ret = ret or int
decode_status = flb_ret) and then continue using ret for subsequent validation
checks, but use decode_ret when deciding whether to call
destroy_metrics_context_list(&context_list) (i.e., call
destroy_metrics_context_list only if decode_ret ==
CMT_DECODE_OPENTELEMETRY_SUCCESS).
🧹 Nitpick comments (1)
tests/internal/opentelemetry.c (1)

775-825: run_metrics_case is a ~590-line monolithic function.

This function handles validation for all metric types (gauge, counter, histogram, summary, exp_histogram) in one flat block. While it works, the repetitive pattern for each metric type (find key → check name → check value → check labels) could be extracted into helpers, improving readability and making it easier to add new metric types.

Not blocking, just noting for future maintainability.

edsiper commented Feb 12, 2026

@codex review


@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/cmetrics/src/cmt_encode_opentelemetry.c (1)

38-57: ⚠️ Potential issue | 🔴 Critical

Missing variant type check — accessing as_kvlist on a non-kvlist variant is undefined behavior.

cfl_kvlist_fetch may return a variant of any type. This function accesses entry_variant->data.as_kvlist without verifying entry_variant->type == CFL_VARIANT_KVLIST. Compare with the equivalent helper in cmt_encode_text.c (lines 51–65 in the relevant snippets), which guards with entry_variant->type != CFL_VARIANT_KVLIST.

If the key maps to a non-kvlist variant (e.g., string, int), this returns a garbage pointer that callers will dereference.

Proposed fix — add the type guard
 struct cfl_kvlist *fetch_metadata_kvlist_key(struct cfl_kvlist *kvlist, char *key)
 {
     struct cfl_variant *entry_variant;
-    struct cfl_kvlist  *entry_kvlist;
 
     if (kvlist == NULL) {
         return NULL;
     }
 
     entry_variant = cfl_kvlist_fetch(kvlist, key);
-
-    if (entry_variant != NULL) {
-        entry_kvlist = entry_variant->data.as_kvlist;
-    }
-    else {
-        entry_kvlist = NULL;
+    if (entry_variant == NULL || entry_variant->type != CFL_VARIANT_KVLIST) {
+        return NULL;
     }
 
-    return entry_kvlist;
+    return entry_variant->data.as_kvlist;
 }
🧹 Nitpick comments (4)
lib/cmetrics/src/cmt_metric.c (1)

44-62: Trailing stores in add() race with concurrent callers.

After the CAS loop on metric->val succeeds (line 54), lines 58–61 execute without mutual exclusion. Two concurrent add() threads can interleave these stores, leaving val_int64/val_uint64 out of sync with the CAS'd val. Because add() always sets value_type = DOUBLE, cmt_metric_get_value_snapshot won't detect the inconsistency (type stays the same across the tear).

If val_int64/val_uint64 are only informational for the DOUBLE path this is tolerable, but worth a brief comment in the source so future maintainers don't rely on those fields for precise values after add().

lib/cmetrics/src/cmt_encode_opentelemetry.c (2)

445-471: compute_flat_scope_index returns 0 for both "valid first scope" and "error" — callers cannot distinguish.

When context is NULL, resource_index is out of bounds, or scope_index is out of bounds, this returns 0 — the same value as a legitimate first-scope index. resolve_target_scope_index (line 498) validates the result against scope_metrics_count, but compute_flat_scope_index itself masks errors from its callers. Consider returning a sentinel (e.g., SIZE_MAX) for errors, or an out-parameter pattern, so that callers can detect invalid input.


3019-3027: Redundant memset after calloc.

calloc already zero-initializes. The memset on line 3027 is a no-op.

Remove redundant memset
     context = calloc(1, sizeof(struct cmt_opentelemetry_context));
 
     if (context == NULL) {
         result = CMT_ENCODE_OPENTELEMETRY_ALLOCATION_ERROR;
 
         goto cleanup;
     }
 
-    memset(context, 0, sizeof(struct cmt_opentelemetry_context));
-
     context->cmt = cmt;
lib/cmetrics/tests/exp_histogram.c (1)

773-789: Unconditional cmt_encode_prometheus_remote_write_destroy when encoded_remote_write may be NULL.

If cmt_encode_prometheus_remote_write_create returns NULL (line 773), execution falls through to line 788 which calls destroy on NULL. This likely works because the underlying cfl_sds_destroy tolerates NULL, but for consistency with the guarded pattern used in test_exp_histogram_encoder_smoke (lines 498–500), consider guarding it.

Guard the destroy call
-    cmt_encode_prometheus_remote_write_destroy(encoded_remote_write);
+    if (encoded_remote_write != NULL) {
+        cmt_encode_prometheus_remote_write_destroy(encoded_remote_write);
+    }


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 80699d632a

  Fix two JSON metrics decode issues in flb_opentelemetry_metrics.c.

  - Parse `asInt` datapoints as exact int64 instead of double conversion.
  - Set gauge/counter samples with typed cmetrics setters to avoid >2^53
    rounding loss.
  - Keep counter reset behavior consistent by clamping negative int values
    when reset is not allowed.
  - Preserve resource/scope metadata from JSON payloads by cloning:
    - resource metadata/attributes
    - resource_metrics schema_url
    - scope metadata/attributes
    - scope_metrics schema_url
    into `context->external_metadata` for OTLP re-encoding

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
edsiper commented Feb 16, 2026

CI is stuck on Windows in an OAuth2 test (unrelated to this work).

@edsiper edsiper merged commit 46144a6 into master Feb 16, 2026
100 of 104 checks passed
@edsiper edsiper deleted the opentelemetry-metrics-json branch February 16, 2026 16:54