New probabilistic sampler processor tests for legacy compatibility #32360

jmacd · 2024-04-12T21:46:26Z

Description: Adds new hard-coded test of the TraceID hashing function in the probabilisticsamplerprocessor. This will ensure that changes do not inadvertently modify the hashing function or associated logic for spans. Note that the Logs sampler logic includes a test with exact counts of sampled log records, which serves the same purpose.

Link to tracking Issue: #31918

Testing: This is a test added ahead of #31946, which refactors the hash-based decision but should not change its results.

Documentation: n/a

**Description:** This component has only one maintainer, and I have become very familiar with the component's code. #32360 #31946 #31894 **Documentation:** n/a

…ation, prepare for OTEP 235 support (#31946) **Description:** Refactors the probabilistic sampling processor to prepare it for more OTEP 235 support. This clarifies existing inconsistencies between tracing and logging samplers, see the updated README. The tracing priority mechanism applies a 0% or 100% sampling override (e.g., "1" implies 100% sampling), whereas the logging sampling priority mechanism supports variable-probability override (e.g., "1" implies 1% sampling). This pins down cases where no randomness is available, and organizes the code to improve readability. A new type called `randomnessNamer` carries the randomness information (from the sampling pacakge) and a name of the policy that derived it. When sampling priority causes the effective sampling probability to change, the value "sampling.priority" replaces the source of randomness, which is currently limited to "trace_id_hash" or the name of the randomess-source attribute, for logs. While working on #31894, I discovered that some inputs fall through to the hash function with zero bytes of input randomness. The hash function, computed on an empty input (for logs) or on 16 bytes of zeros (which OTel calls an invalid trace ID), would produce a fixed random value. So, for example, when logs are sampled and there is no TraceID and there is no randomness attribute value, the result will be sampled at approximately 82.9% and above. In the refactored code, an error is returned when there is no input randomness. A new boolean configuration field determines the outcome when there is an error extracting randomness from an item of telemetry. By default, items of telemetry with errors will not pass through the sampler. When `FailClosed` is set to false, items of telemetry with errors will pass through the sampler. The original hash function, which uses 14 bits of information, is structured as an "acceptance threshold", ultimately the test for sampling translated into a positive decision when `Randomness < AcceptThreshold`. In the OTEP 235 scheme, thresholds are rejection thresholds--this PR modifies the original 14-bit accept threshold into a 56-bit reject threshold, using Threshold and Randomness types from the sampling package. Reframed in this way, in the subsequent PR (i.e., #31894) the effective sampling probability will be seamlessly conveyed using OTEP 235 semantic conventions. Note, both traces and logs processors are now reduced to a function like this: ``` return commonSamplingLogic( ctx, l, lsp.sampler, lsp.failClosed, lsp.sampler.randomnessFromLogRecord, lsp.priorityFunc, "logs sampler", lsp.logger, ) ``` which is a generic function that handles the common logic on a per-item basis and ends in a single metric event. This structure makes it clear how traces and logs are processed differently and have different prioritization schemes, currently. This structure also makes it easy to introduce new sampler modes, as shown in #31894. After this and #31940 merge, the changes in #31894 will be relatively simple to review as the third part in a series. **Link to tracking Issue:** Depends on #31940. Part of #31918. **Testing:** Added. Existing tests already cover the exact random behavior of the current hashing mechanism. Even more testing will be introduced with the last step of this series. Note that #32360 is added ahead of this test to ensure refactoring does not change results. **Documentation:** Added. --------- Co-authored-by: Kent Quirk <kentquirk@gmail.com>

jmacd added 2 commits April 12, 2024 14:42

Add a new test

159b90c

more

10b0fa6

jmacd requested a review from jpkrohling as a code owner April 12, 2024 21:46

jmacd requested a review from a team April 12, 2024 21:46

jmacd added the Skip Changelog PRs that do not require a CHANGELOG.md entry label Apr 12, 2024

github-actions bot assigned bryan-aguilar Apr 12, 2024

github-actions bot added the processor/probabilisticsampler Probabilistic Sampler processor label Apr 12, 2024

jmacd changed the title ~~Jmacd/newtests~~ New probabilistic sampler processor tests for legacy compatibility Apr 12, 2024

comment

18faa1e

This was referenced Apr 12, 2024

Refactor the probabilistic sampler processor; add FailClosed configuration, prepare for OTEP 235 support #31946

Merged

Add jmacd@ as owner of probabilisticsamplerprocessor #32461

Merged

jpkrohling approved these changes Apr 19, 2024

View reviewed changes

jpkrohling merged commit 38c77fb into open-telemetry:main Apr 19, 2024
170 checks passed

github-actions bot added this to the next release milestone Apr 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

New probabilistic sampler processor tests for legacy compatibility #32360

New probabilistic sampler processor tests for legacy compatibility #32360

jmacd commented Apr 12, 2024 •

edited

Loading

New probabilistic sampler processor tests for legacy compatibility #32360

New probabilistic sampler processor tests for legacy compatibility #32360

Conversation

jmacd commented Apr 12, 2024 • edited Loading

jmacd commented Apr 12, 2024 •

edited

Loading