processor_sampling: perf improvement in tail sampling reconciliation by edsiper · Pull Request #11441 · fluent/fluent-bit

edsiper · 2026-02-09T22:38:36Z

This PR adds an in-tree benchmarking workflow for processor_sampling and introduces a runtime toggle to compare the legacy and optimized tail reconcile implementations without switching branches.

What’s included

Added new benchmark binary: flb-bench-processor_sampling
- New file: benchmarks/flb-bench-processor_sampling.c
- CMake wiring in benchmarks/CMakeLists.txt
Extended benchmark modes to support tail reconcile profiling:
- -m tail-reconcile
- -L 0|1 to select optimized vs legacy reconcile path
Updated tail sampling implementation in plugins/processor_sampling/sampling_tail.c:
- New config option: legacy_reconcile (default false)
- Explicit reconcile implementations:
  - reconcile_and_create_ctrace_legacy(...)
  - reconcile_and_create_ctrace_optimized(...)
- Runtime dispatch between both paths
- Benchmark-only helper: sampling_tail_bench_reconcile(...) (guarded by FLB_SAMPLING_BENCH)

Why

Enables deterministic A/B performance testing and easier perf analysis.
Improves developer workflow for validating sampling reconcile changes and catching regressions.

Observed benchmark result (same workload)

Params: -m tail-reconcile -i 200000 -w 20000 -s 20 -a 4 -c 4096 -t 50000 -S 12345

Metric	Legacy (-L 1)	Optimized (-L 0)	Delta
Total time	4.132s	3.765s	-8.9%
Per span	1033.02ns	941.14ns	-8.9%
Throughput	968,038/s	1,062,541/s	+9.8%
Perf samples	17,612	16,004	-1,608
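
As a sanity check on the table above, the deltas can be recomputed from the raw figures. The sketch below is written in Python purely for brevity (the benchmark itself is C). It assumes, which the PR does not state explicitly, that `-i` is the iteration count and `-s` the number of spans per trace, so that the run processes 200000 × 20 spans. The helper name `pct_delta` is hypothetical.

```python
# Recompute the deltas reported in the benchmark table.
# Assumption: -i 200000 iterations x -s 20 spans per trace = total spans.

def pct_delta(legacy: float, optimized: float) -> float:
    """Relative change of optimized vs. legacy, in percent."""
    return (optimized - legacy) / legacy * 100.0

legacy_s, optimized_s = 4.132, 3.765              # total time in seconds
legacy_tput, optimized_tput = 968_038, 1_062_541  # spans per second
total_spans = 200_000 * 20                        # assumed from -i and -s

time_delta = pct_delta(legacy_s, optimized_s)
tput_delta = pct_delta(legacy_tput, optimized_tput)
legacy_ns_per_span = legacy_s / total_spans * 1e9

print(f"time delta:       {time_delta:+.1f}%")
print(f"throughput delta: {tput_delta:+.1f}%")
print(f"legacy ns/span:   {legacy_ns_per_span:.0f}")
```

The time and throughput percentages differ because they are reciprocal quantities: cutting elapsed time by about 8.9% corresponds to a throughput ratio of roughly 1 / (1 - 0.089), which is about 1.098.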

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

New Features
- Added configurable legacy vs. optimized trace reconciliation for sampling.
Tests
- Added a benchmark tool to measure trace-sampling processor performance across modes and configurations.
Chores
- Enabled processor-sampling benchmarks in the build.
- Tightened commit subject prefix validation and added special mapping for benchmark-related paths.

coderabbitai · 2026-02-09T22:39:06Z

📝 Walkthrough

Adds a new processor-sampling benchmark executable and harness, implements an optimized cache-backed reconciliation path with a legacy toggle in processor_sampling tail code, and tightens commit-prefix validation with special handling for the benchmarks path.

Changes

Cohort / File(s)	Summary
Benchmark CMake `benchmarks/CMakeLists.txt`	Adds new executable target `flb-bench-processor_sampling`, compiles with `FLB_SAMPLING_BENCH=1`, and links to `fluent-bit-static` and thread libs.
Benchmark tool `benchmarks/flb-bench-processor_sampling.c`	New benchmark program: CLI parsing, synthetic trace/span generation, deterministic RNG, probabilistic and tail-reconcile modes, warmup/iteration loops, metrics output, and two weak hooks (`flb_input_trace_append_skip_processor_stages`, `sampling_conditions_check`).
Sampling tail reconciliation `plugins/processor_sampling/sampling_tail.c`	Adds `legacy_reconcile` setting, reconcile cache types (`reconcile_resource_entry`, `reconcile_scope_entry`), cache management helpers, optimized and legacy reconcile flows, cache cleanup, and a bench-facing reconcile entrypoint; reconciler selects path by setting.
Commit prefix check `.github/scripts/commit_prefix_check.py`	Tightens subject-prefix regex to require a token after `:`, maps `benchmarks/` paths to `benchmarks:` prefix inference, and narrows extracted prefix capture group.

Sequence Diagram

sequenceDiagram
    autonumber
    participant CLI as Benchmark CLI
    participant Synth as Trace Synthesizer
    participant Sampling as Sampling Plugin
    participant Cache as Reconcile Cache / Registry
    participant Reporter as Metrics Reporter

    CLI->>CLI: parse args, warmup, iterations
    loop per iteration
        CLI->>Synth: request synthetic trace
        Synth-->>CLI: generated ctrace
        CLI->>Sampling: submit trace for processing
        Sampling->>Cache: consult/update reconciliation cache (optimized) or follow legacy path
        Cache-->>Sampling: mapped/created spans/ctrace
        Sampling-->>CLI: processed trace (kept/dropped)
        CLI->>Reporter: record spans processed
    end
    Reporter-->>CLI: compute elapsed, throughput, per-span time
    CLI->>CLI: output summary

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

github: commit_linter: Make more flexibility for commit linter #11251 — Related edits to .github/scripts/commit_prefix_check.py affecting commit-prefix detection and path→prefix inference.

Suggested reviewers

cosmo0920
niedbalski
patrick-stephens
celalettin1286

Poem

🐰 I hopped through traces, spans in line,

Caches nested snug, legacy or fine.
Benchmarks warmed, iterations spun,
Counting hops until the work is done.
🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 13.04% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'processor_sampling: perf improvement in tail sampling reconciliation' accurately reflects the main change—a performance optimization in tail sampling reconciliation. It is specific, concise, and clearly summarizes the primary objective of the PR.

No actionable comments were generated in the recent review. 🎉

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>

…B benchmark

Introduce a reproducible in-tree benchmark workflow to compare tail sampling reconcile implementations and validate performance changes with perf.

changes:

- processor_sampling (tail):
  - add `legacy_reconcile` config option (default: `false`)
  - split reconcile implementation into explicit paths:
    - `reconcile_and_create_ctrace_legacy(...)`
    - `reconcile_and_create_ctrace_optimized(...)`
  - select reconcile path at runtime in dispatch logic
  - add benchmark-only helper:
    - `sampling_tail_bench_reconcile(...)`
    - compiled under `FLB_SAMPLING_BENCH`
- benchmarks:
  - extend `flb-bench-processor_sampling` with:
    - `-m tail-reconcile`
    - `-L 0|1` to toggle optimized vs legacy
  - keep probabilistic sampling benchmark mode
  - compile benchmark with required sampling sources and `FLB_SAMPLING_BENCH=1` for direct reconcile profiling

About the benchmark

- Enable fast A/B testing of reconcile code paths without branch switching.
- Provide perf-friendly, deterministic comparisons for regression tracking.

Measured result (same workload)

`-m tail-reconcile -i 200000 -w 20000 -s 20 -a 4 -c 4096 -t 50000 -S 12345`

+----------------------+----------------+----------------+---------------------+
| Metric               | Legacy (-L 1)  | Optimized (-L0)| Delta               |
+----------------------+----------------+----------------+---------------------+
| Total time           | 4.132 s        | 3.765 s        | -0.367 s (-8.9%)    |
| Per span             | 1033.02 ns     | 941.14 ns      | -91.88 ns (-8.9%)   |
| Throughput           | 968,038/s      | 1,062,541/s    | +94,503/s (+9.8%)   |
| perf samples         | 17,612         | 16,004         | -1,608              |
+----------------------+----------------+----------------+---------------------+

Perf hotspot note (legacy sample)

- `run_tail_reconcile`:
  - `create_trace` ~59.65%
  - `sampling_tail_bench_reconcile` ~31.33%
  - `reconcile_and_create_ctrace_legacy` ~8.86%
- destroy chain (`ctr_destroy` subtree) dominates inside reconcile path

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 26342e919f

chatgpt-codex-connector · 2026-02-09T22:40:26Z

src/flb_record_accessor.c

+                    keymap_cache[keymap_idx] = flb_ra_value_from_object(&val,
+                                                                        rp->key->subkeys,
+                                                                        FLB_TRUE);


Replace undefined flb_ra_value_from_object call

The new cache path in flb_ra_translate_check calls flb_ra_value_from_object, but there is no definition or declaration for this symbol anywhere in the repo (only this call site). That means the build will fail with an undefined symbol (or at minimum an implicit declaration error) whenever this file is compiled. You likely intended to reuse flb_ra_key_to_value_ext or a helper from flb_ra_key.c, but as written this breaks compilation for all targets that include src/flb_record_accessor.c.

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@benchmarks/flb-bench-processor_sampling.c`:
- Around line 467-468: Guard against division by zero before computing
ns_per_span and spans_per_sec: check if spans_processed is 0 and handle that
case (e.g., set ns_per_span = 0.0 and spans_per_sec = 0.0 or skip reporting)
instead of performing the division; update the code around the existing
expressions referencing ns_per_span, spans_per_sec and spans_processed in
benchmarks/flb-bench-processor_sampling.c so the calculation only occurs when
spans_processed > 0 and the zero-case produces a safe, deterministic value.
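
The guard requested above can be sketched as follows. This is an illustration in Python, not the benchmark's C code: `summarize` and its parameters are hypothetical names, and returning 0.0 for the zero case is only one of the two options the review mentions. The extra check on the elapsed time is an addition of this sketch, not something the review asks for.

```python
def summarize(elapsed_ns: int, spans_processed: int) -> tuple[float, float]:
    """Return (ns_per_span, spans_per_sec), guarding the zero cases."""
    if spans_processed <= 0 or elapsed_ns <= 0:
        # Nothing measurable: report deterministic zeros instead of dividing.
        return 0.0, 0.0
    ns_per_span = elapsed_ns / spans_processed
    spans_per_sec = spans_processed * 1e9 / elapsed_ns
    return ns_per_span, spans_per_sec


# Zero spans yields safe values; a normal run divides as usual.
print(summarize(1_000, 0))
print(summarize(4_000, 4))
```

Returning zeros keeps the summary output deterministic; skipping the report entirely would be the alternative when a zero-span run should not be mistaken for a measured result.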

🧹 Nitpick comments (3)

plugins/processor_sampling/sampling_tail.c (3)

523-540: Duplicate comment block (copy-paste leftover).

Lines 523–530 and 533–540 contain the same comment verbatim. Remove one.

Proposed fix

-        /*
-         * Detach the span from its previous context completely and
-         * re-attach it to the new one. If we only move the local list
-         * reference (span->_head) the span would still belong to the
-         * original ctrace context which later on might lead to use after
-         * free issues when the new context is destroyed. Make sure to
-         * update all references.
-         */
-
-        /* detach from the original scope span and global list */
         /*
          * Detach the span from its previous context completely and
          * re-attach it to the new one. If we only move the local list
          * reference (span->_head) the span would still belong to the
          * original ctrace context which later on might lead to use after
          * free issues when the new context is destroyed. Make sure to
          * update all references.
          */
+
+        /* detach from the original scope span and global list */
         cfl_list_del(&span->_head);

762-766: Stack settings is partially initialized — uninitialized list_ctraces is a latent risk.

Only span_reg, legacy_reconcile, and decision_wait are assigned. While the current reconcile functions don't access other fields, the uninitialized list_ctraces (a cfl_list with garbage pointers) could cause a crash if any future code path traverses it. A memset(&settings, 0, sizeof(settings)) would be safer.

Proposed fix

     struct sampling_settings settings;
 
+    memset(&settings, 0, sizeof(settings));
     settings.span_reg = span_reg;
     settings.legacy_reconcile = legacy_reconcile;
     settings.decision_wait = decision_wait;

552-554: Typo: "remote" → "remove".

-        /* remote t_span entry */
+        /* remove t_span entry */

benchmarks/flb-bench-processor_sampling.c

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In @.github/scripts/commit_prefix_check.py:
- Line 23: The regex PREFIX_RE currently captures the prefix plus trailing
whitespace/char which causes subject_prefix (from the match at line using
.group()) to be wrong; change PREFIX_RE to put a capture group around just the
prefix portion (e.g. capture up through the colon) and then update the code that
extracts the prefix (the place assigning subject_prefix) to use match.group(1)
instead of .group(); verify any other usages (like in detect_bad_squash where
PREFIX_RE.match(l) is only used as a boolean) still behave correctly and adjust
any other .group() calls to group(1) when the prefix string is needed.

.github/scripts/commit_prefix_check.py

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>

Update commit prefix regex handling in `.github/scripts/commit_prefix_check.py` so subject prefix extraction returns only the prefix token (e.g. `build:`) instead of including trailing content.

Changes:

- make `PREFIX_RE` capture just the prefix in group 1:
  - from: `^[a-z0-9_]+:\s+\S`
  - to: `^([a-z0-9_]+:)\s+\S`
- update subject prefix extraction to use `match.group(1)`

Why:

- previous extraction used `.group()` and could include extra matched text, causing incorrect prefix comparisons during validation.

Notes:

- `detect_bad_squash()` uses `PREFIX_RE.match(...)` only as a boolean; behavior remains unchanged there.

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
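
The effect of the regex change in this commit message can be reproduced with a short sketch. The two patterns are copied from the message; the names `OLD_PREFIX_RE` and `NEW_PREFIX_RE` and the sample subject line are hypothetical, and the real script may compile `PREFIX_RE` with different flags.

```python
import re

OLD_PREFIX_RE = re.compile(r"^[a-z0-9_]+:\s+\S")    # no capture group
NEW_PREFIX_RE = re.compile(r"^([a-z0-9_]+:)\s+\S")  # group 1 = prefix only

subject = "build: add benchmark target"  # hypothetical commit subject

# .group() returns the whole match, including the whitespace and the
# first character of the description that the pattern requires.
old_prefix = OLD_PREFIX_RE.match(subject).group()

# .group(1) returns only the captured prefix token.
new_prefix = NEW_PREFIX_RE.match(subject).group(1)

print(repr(old_prefix))  # 'build: a'
print(repr(new_prefix))  # 'build:'

# Used purely as a boolean, both patterns accept and reject the same input.
assert NEW_PREFIX_RE.match("build:") is None
assert OLD_PREFIX_RE.match("build:") is None
```

Because both patterns match exactly the same set of strings, callers that only test for a match are unaffected; only code that reads the matched text needs to switch to `group(1)`.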

edsiper requested a review from cosmo0920 as a code owner February 9, 2026 22:38

github-actions bot added the docs-required label Feb 9, 2026

edsiper had a problem deploying to pr February 9, 2026 22:38 — with GitHub Actions Failure

edsiper added 2 commits February 9, 2026 16:39

benchmarks: add processor_sampling

a2e46c4

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>

edsiper force-pushed the processor_sampling-trace-perf branch from 26342e9 to c30d4f0 Compare February 9, 2026 22:39

edsiper added this to the Fluent Bit v5.0 milestone Feb 9, 2026

edsiper temporarily deployed to pr February 9, 2026 22:39 — with GitHub Actions Inactive

chatgpt-codex-connector bot reviewed Feb 9, 2026

github: scripts: adjust commit prefix checker for benchmarks

c9a3769

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>

edsiper requested review from celalettin1286, niedbalski and patrick-stephens as code owners February 9, 2026 22:47

edsiper temporarily deployed to pr February 9, 2026 22:48 — with GitHub Actions Inactive

edsiper had a problem deploying to pr February 9, 2026 22:48 — with GitHub Actions Failure

edsiper temporarily deployed to pr February 9, 2026 22:48 — with GitHub Actions Inactive

coderabbitai bot reviewed Feb 9, 2026

benchmarks/flb-bench-processor_sampling.c (outdated, resolved)

coderabbitai bot reviewed Feb 9, 2026

.github/scripts/commit_prefix_check.py (outdated, resolved)

benchmarks: processor_sampling: add guards for zero spans

016c4e9

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>

edsiper temporarily deployed to pr February 9, 2026 22:59 — with GitHub Actions Inactive

edsiper temporarily deployed to pr February 9, 2026 23:01 — with GitHub Actions Inactive

edsiper temporarily deployed to pr February 9, 2026 23:20 — with GitHub Actions Inactive

edsiper temporarily deployed to pr February 9, 2026 23:21 — with GitHub Actions Inactive

edsiper merged commit e253e3f into master Feb 10, 2026
61 checks passed

edsiper deleted the processor_sampling-trace-perf branch February 10, 2026 15:26

processor_sampling: perf improvement in tail sampling reconciliation #11441
edsiper merged 5 commits into master from processor_sampling-trace-perf
