Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API usability improvements in sampling package #31940

Merged
merged 84 commits into from
Apr 9, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
84 commits
Select commit Hold shift + click to select a range
e822a9b
Add t-value sampler draft
jmacd May 12, 2023
1bc6017
copy/import tracestate parser package
jmacd May 15, 2023
d1fd891
test ot tracestate
jmacd May 16, 2023
85e4472
tidy
jmacd May 16, 2023
bb75f8a
renames
jmacd May 16, 2023
6a57b77
testing two parsers w/ generic code
jmacd May 17, 2023
7fa8130
integrated
jmacd May 17, 2023
36230e7
Comments
jmacd May 17, 2023
7bae35c
revert two files
jmacd May 17, 2023
9010a67
Update with r, s, and t-value. Now using regexps and strings.IndexBy…
jmacd Jun 1, 2023
0e27e40
fix sampler build
jmacd Jun 1, 2023
efcdc3d
add support for s-value for non-consistent mode
jmacd Jun 1, 2023
939c758
WIP
jmacd Jul 10, 2023
b9a1e56
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Aug 2, 2023
a31266c
use new proposed syntax see https://github.com/open-telemetry/opentel…
jmacd Aug 2, 2023
690cd64
update tracestate libs for new encoding
jmacd Aug 2, 2023
c8baf29
wip working on probabilistic sampler with two new modes: downsampler …
jmacd Aug 2, 2023
7f47e4a
unsigned implement split
jmacd Aug 3, 2023
422e0b2
two implementations
jmacd Aug 3, 2023
787b9fd
wip
jmacd Sep 5, 2023
ed36f03
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Sep 6, 2023
d795210
Updates for OTEP 235
jmacd Sep 6, 2023
09000f7
wip TODO
jmacd Sep 6, 2023
a4d467b
versions.yaml
jmacd Sep 6, 2023
e373b9b
Add proportional sampler mode; comment on TODOs; create SamplerMode t…
jmacd Sep 7, 2023
fe6a085
back from internal
jmacd Oct 4, 2023
396efb1
wip
jmacd Oct 4, 2023
36de5dd
fix existing tests
jmacd Oct 6, 2023
f1aa0ad
:wip:
jmacd Oct 12, 2023
700734e
Update for rejection threshold
jmacd Nov 15, 2023
ae50bdd
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Nov 15, 2023
a94b8e7
fix preexisting tests
jmacd Nov 16, 2023
4edcbcb
basic yes/no t-value sampling test
jmacd Nov 16, 2023
53bf119
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Nov 29, 2023
3cdb957
add version for sampling pkg
jmacd Nov 29, 2023
e506847
more testing
jmacd Dec 7, 2023
2cddfeb
add probability to threshold with precision option
jmacd Dec 8, 2023
f69d6ee
ProbabilityToThresholdWithPrecision
jmacd Dec 8, 2023
cc02934
test coverage for equalizing and proportional
jmacd Dec 8, 2023
1eecc4a
config test
jmacd Dec 8, 2023
2159107
comments and notes
jmacd Dec 8, 2023
e0898a6
update README
jmacd Dec 8, 2023
d0991ed
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Jan 10, 2024
a002774
Remove sampling pkg, it is now upstream
jmacd Feb 14, 2024
3a49922
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Feb 28, 2024
fca0184
build w/ new sampling pkg
jmacd Feb 28, 2024
f11e0a4
more test coverage
jmacd Feb 29, 2024
3f495a6
more config tests
jmacd Feb 29, 2024
581095c
test precision underflow
jmacd Mar 1, 2024
7b8fd31
test warning logs
jmacd Mar 1, 2024
1a6be4f
hash_seed fixes
jmacd Mar 1, 2024
712cf17
wip
jmacd Mar 4, 2024
34c0d3b
aip
jmacd Mar 5, 2024
7742668
wip-refactoring
jmacd Mar 13, 2024
8d60168
refactor wip
jmacd Mar 14, 2024
3779caa
cleanup refactor
jmacd Mar 14, 2024
c261ac1
wip
jmacd Mar 14, 2024
34469e4
moving code
jmacd Mar 15, 2024
8dabf47
fix tests; round up small probs to avoid errors
jmacd Mar 15, 2024
d44afb5
preserve legacy behavior
jmacd Mar 15, 2024
1cf9991
logs handled sampling priority differently
jmacd Mar 15, 2024
365d35d
still two errors
jmacd Mar 18, 2024
12a3047
builds
jmacd Mar 19, 2024
8655f42
needs testing
jmacd Mar 19, 2024
468e6c6
fixing tests
jmacd Mar 21, 2024
23b4423
cleanup
jmacd Mar 21, 2024
07841e5
remove strict feature
jmacd Mar 21, 2024
6936bc4
tests fixed
jmacd Mar 21, 2024
c132f4c
wip
jmacd Mar 22, 2024
bd13ac9
typo
jmacd Mar 22, 2024
aa33b1c
more logs tests
jmacd Mar 22, 2024
06556dc
Add more comments
jmacd Mar 22, 2024
a4940e6
update README
jmacd Mar 22, 2024
4f616e9
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Mar 22, 2024
b4ca3aa
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Mar 25, 2024
fdd26ac
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Mar 25, 2024
5097f43
revert these
jmacd Mar 25, 2024
aaff323
Remove new files too
jmacd Mar 25, 2024
dd979b9
comments
jmacd Mar 25, 2024
34004a2
chlog
jmacd Mar 25, 2024
f202a57
add tests
jmacd Mar 25, 2024
d401076
comment
jmacd Mar 25, 2024
2e41e69
fix that
jmacd Mar 25, 2024
eddffa8
Merge branch 'main' into jmacd/pkgsamplingup
MovieStoreGuy Apr 9, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 27 additions & 0 deletions .chloggen/sampling-improvements.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: pkg/sampling

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Usability improvements in the sampling API.

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [31918]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [api]
93 changes: 49 additions & 44 deletions pkg/sampling/doc.go
Original file line number Diff line number Diff line change
Expand Up @@ -37,53 +37,58 @@
// conditioned on how much sampling has already taken place, use the
// following pseudocode.
//
// func Setup() {
// // Get a fixed probability value from the configuration, in
// // the range (0, 1].
// probability := *FLAG_probability
// func Setup() {
// // Get a fixed probability value from the configuration, in
// // the range (0, 1].
// probability := *FLAG_probability
//
// // Calculate the sampling threshold from probability using 3
// // hex digits of precision.
// fixedThreshold, err = ProbabilityToThresholdWithPrecision(probability, 3)
// if err != nil {
// // error case: Probability is not valid.
// }
// }
// // Calculate the sampling threshold from probability using 3
// // hex digits of precision.
// fixedThreshold, err = ProbabilityToThresholdWithPrecision(probability, 3)
// if err != nil {
// // error case: Probability is not valid.
// }
// }
//
// func MakeDecision(tracestate string, tid TraceID) bool {
// // Parse the incoming tracestate
// ts, err := NewW3CTraceState(tracestate)
// if err != nil {
// // error case: Tracestate is ill-formed.
// }
// // For an absolute probability sample, we check the incoming
// // tracestate to see whether it was already sampled enough.
// if len(ts.OTelValue().TValue()) != 0 {
// // If the incoming tracestate was already sampled at
// // least as much as our threshold implies, then its
// // (rejection) threshold is higher. If so, then no
// // further sampling is called for.
// if ThresholdGreater(ts.OTelValue().TValueThreshold(), fixedThreshold) {
// return true
// }
// }
// var rnd Randomness
// // If the R-value is present, use it. If not, rely on TraceID
// // randomness. Note that OTLP v1.1.0 introduces a new Span flag
// // to convey trace randomness correctly, and if the context has
// // neither the randomness bit set or the R-value set, we need a
// // fallback, which can be to synthesize an R-value or to assume
// // the TraceID has sufficient randomness. This detail is left
// // out of scope.
// if rval, hasRval := ts.OTelValue().RValueRandomness(); hasRv {
// rnd = rval
// } else {
// rnd = TraceIDToRandomness(tid)
// }
//
// return fixedThreshold.ShouldSample(rnd)
// }
// func MakeDecision(tracestate string, tid TraceID) bool {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: the Go toolchain is reformatting this comment block for me. 🤷

// // Parse the incoming tracestate
// ts, err := NewW3CTraceState(tracestate)
// if err != nil {
// // error case: Tracestate is ill-formed.
// }
// // For an absolute probability sample, we check the incoming
// // tracestate to see whether it was already sampled enough.
// if threshold, hasThreshold := ts.OTelValue().TValueThreshold(); hasThreshold {
// // If the incoming tracestate was already sampled at
// // least as much as our threshold implies, then its
// // (rejection) threshold is higher. If so, then no
// // further sampling is called for.
// if ThresholdGreater(threshold, fixedThreshold) {
// // Do not update.
// return true
// }
// // The error below is ignored because we've tested
// // the equivalent condition above. This lowers the sampling
// // probability expressed in the tracestate T-value.
// _ = ts.OTelValue().UpdateThresholdWithSampling(fixedThreshold)
// }
// var rnd Randomness
// // If the R-value is present, use it. If not, rely on TraceID
// // randomness. Note that OTLP v1.1.0 introduces a new Span flag
// // to convey trace randomness correctly, and if the context has
// // neither the randomness bit set or the R-value set, we need a
// // fallback, which can be to synthesize an R-value or to assume
// // the TraceID has sufficient randomness. This detail is left
// // out of scope.
// if rv, hasRand := ts.OTelValue().RValueRandomness(); hasRand {
// rnd = rv
// } else {
// rnd = TraceIDToRandomness(tid)
// }
// return fixedThreshold.ShouldSample(rnd)
// }
//
// [W3C tracecontext specification]: https://www.w3.org/TR/trace-context/#tracestate-header
// [Tracestate handling]: https://opentelemetry.io/docs/specs/otel/trace/tracestate-handling/

package sampling // import "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/sampling"
47 changes: 33 additions & 14 deletions pkg/sampling/oteltracestate.go
Original file line number Diff line number Diff line change
Expand Up @@ -97,16 +97,17 @@ func NewOpenTelemetryTraceState(input string) (OpenTelemetryTraceState, error) {
if otts.rnd, err = RValueToRandomness(value); err == nil {
otts.rvalue = value
} else {
// The zero-value for randomness implies always-sample;
// the threshold test is R < T, but T is not meaningful
// at zero, and this value implies zero adjusted count.
// RValueRandomness() will return false, the error
// accumulates and is returned below.
otts.rvalue = ""
otts.rnd = Randomness{}
}
case tValueFieldName:
if otts.threshold, err = TValueToThreshold(value); err == nil {
otts.tvalue = value
} else {
// TValueThreshold() will return false, the error
// accumulates and is returned below.
otts.tvalue = ""
otts.threshold = AlwaysSampleThreshold
}
Expand Down Expand Up @@ -148,29 +149,47 @@ func (otts *OpenTelemetryTraceState) TValueThreshold() (Threshold, bool) {
}

// UpdateTValueWithSampling modifies the TValue of this object, which
// changes its adjusted count. If the change of TValue leads to
// inconsistency (i.e., raising sampling probability), an error is
// returned.
func (otts *OpenTelemetryTraceState) UpdateTValueWithSampling(sampledThreshold Threshold, encodedTValue string) error {
// changes its adjusted count. It is not logical to modify a sampling
// probability in the direction of larger probability. This prevents
// accidental loss of adjusted count.
//
// If the change of TValue leads to inconsistency, an error is returned.
func (otts *OpenTelemetryTraceState) UpdateTValueWithSampling(sampledThreshold Threshold) error {
// Note: there was once a code path here that optimized for
// cases where a static threshold is used, in which case the
// call to TValue() causes an unnecessary allocation per data
// item (w/ a constant result). We have eliminated that
// parameter, due to the significant potential for mis-use.
// Therefore, this method always recomputes TValue() of the
// sampledThreshold (on success). A future method such as
// UpdateTValueWithSamplingFixedTValue() could extend this
// API to address this allocation, although it is probably
// not significant.
if len(otts.TValue()) != 0 && ThresholdGreater(otts.threshold, sampledThreshold) {
return ErrInconsistentSampling
}
// Note NeverSampleThreshold is the (exclusive) upper boundary
// of valid thresholds, so the test above permits never-
// sampled updates, in which case the TValue() here is empty.
otts.threshold = sampledThreshold
otts.tvalue = encodedTValue
otts.tvalue = sampledThreshold.TValue()
return nil
}

// AdjustedCount returns the adjusted count implied by this TValue.
// This term is defined here:
// https://opentelemetry.io/docs/specs/otel/trace/tracestate-probability-sampling/
// AdjustedCount returns the adjusted count for this item. If the
// TValue string is empty, this returns 0, otherwise returns
// Threshold.AdjustedCount().
func (otts *OpenTelemetryTraceState) AdjustedCount() float64 {
if len(otts.TValue()) == 0 {
if len(otts.tvalue) == 0 {
// Note: this case covers the zero state, where
// len(tvalue) == 0 and threshold == AlwaysSampleThreshold.
// We return 0 to indicate that no information is available.
return 0
}
return 1.0 / otts.threshold.Probability()
return otts.threshold.AdjustedCount()
}

// ClearTValue is used to unset TValue, in cases where it is
// ClearTValue is used to unset TValue, for use in cases where it is
// inconsistent on arrival.
func (otts *OpenTelemetryTraceState) ClearTValue() {
otts.tvalue = ""
Expand Down
6 changes: 3 additions & 3 deletions pkg/sampling/oteltracestate_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -105,7 +105,7 @@ func TestOpenTelemetryTraceStateTValueUpdate(t *testing.T) {
require.NotEqual(t, "", otts.RValue())

th, _ := TValueToThreshold("3")
require.NoError(t, otts.UpdateTValueWithSampling(th, "3"))
require.NoError(t, otts.UpdateTValueWithSampling(th))

require.Equal(t, "3", otts.TValue())
tv, hasTv := otts.TValueThreshold()
Expand All @@ -126,7 +126,7 @@ func TestOpenTelemetryTraceStateRTUpdate(t *testing.T) {
require.True(t, otts.HasAnyValue())

th, _ := TValueToThreshold("3")
require.NoError(t, otts.UpdateTValueWithSampling(th, "3"))
require.NoError(t, otts.UpdateTValueWithSampling(th))
otts.SetRValue(must(RValueToRandomness("00000000000003")))

const updated = "rv:00000000000003;th:3;a:b"
Expand Down Expand Up @@ -329,7 +329,7 @@ func TestUpdateTValueWithSampling(t *testing.T) {
newTh, err := ProbabilityToThreshold(test.prob)
require.NoError(t, err)

upErr := otts.UpdateTValueWithSampling(newTh, newTh.TValue())
upErr := otts.UpdateTValueWithSampling(newTh)

require.Equal(t, test.updateErr, upErr)

Expand Down
84 changes: 37 additions & 47 deletions pkg/sampling/probability.go
Original file line number Diff line number Diff line change
Expand Up @@ -11,12 +11,9 @@ import (
// ErrProbabilityRange is returned when a value should be in the range [1/MaxAdjustedCount, 1].
var ErrProbabilityRange = errors.New("sampling probability out of the range [1/MaxAdjustedCount, 1]")

// ErrPrecisionUnderflow is returned when a precision is too great for the range.
var ErrPrecisionUnderflow = errors.New("sampling precision is too great for the range")

// MinSamplingProbability is the smallest representable probability
// and is the inverse of MaxAdjustedCount.
const MinSamplingProbability = 1.0 / MaxAdjustedCount
const MinSamplingProbability = 1.0 / float64(MaxAdjustedCount)

// probabilityInRange tests MinSamplingProb <= prob <= 1.
func probabilityInRange(prob float64) bool {
Expand All @@ -26,67 +23,60 @@ func probabilityInRange(prob float64) bool {
// ProbabilityToThreshold converts a probability to a Threshold. It
// returns an error when the probability is out-of-range.
func ProbabilityToThreshold(prob float64) (Threshold, error) {
// Probability cases
if !probabilityInRange(prob) {
return AlwaysSampleThreshold, ErrProbabilityRange
}

scaled := uint64(math.Round(prob * MaxAdjustedCount))

return Threshold{
unsigned: MaxAdjustedCount - scaled,
}, nil
return ProbabilityToThresholdWithPrecision(prob, NumHexDigits)
}

// ProbabilityToThresholdWithPrecision is like ProbabilityToThreshold
// with support for reduced precision. The `prec` argument determines
// with support for reduced precision. The `precision` argument determines
// how many significant hex digits will be used to encode the exact
// probability.
func ProbabilityToThresholdWithPrecision(prob float64, prec uint8) (Threshold, error) {
func ProbabilityToThresholdWithPrecision(fraction float64, precision int) (Threshold, error) {
// Assume full precision at 0.
if prec == 0 {
return ProbabilityToThreshold(prob)
if precision == 0 {
precision = NumHexDigits
}
if !probabilityInRange(prob) {
if !probabilityInRange(fraction) {
return AlwaysSampleThreshold, ErrProbabilityRange
}
// Special case for prob == 1. The logic for revising precision
// that follows requires 0 < 1 - prob < 1.
if prob == 1 {
// Special case for prob == 1.
if fraction == 1 {
return AlwaysSampleThreshold, nil
}

// Adjust precision considering the significance of leading
// zeros. If we can multiply the rejection probability by 16
// and still be less than 1, then there is a leading zero of
// obligatory precision.
for reject := 1 - prob; reject*16 < 1; {
reject *= 16
prec++
}

// Check if leading zeros plus precision is above the maximum.
// This is called underflow because the requested precision
// leads to complete no significant figures.
if prec > NumHexDigits {
return AlwaysSampleThreshold, ErrPrecisionUnderflow
// Calculate the amount of precision needed to encode the
// threshold with reasonable precision. Here, we count the
// number of leading `0` or `f` characters and automatically
// add precision to preserve relative error near the extremes.
//
// Frexp() normalizes both the fraction and one-minus the
// fraction, because more digits of precision are needed if
// either value is near zero. Frexp returns an exponent <= 0.
//
// If `exp <= -4`, there will be a leading hex `0` or `f`.
// For every multiple of -4, another leading `0` or `f`
// appears, so this raises precision accordingly.
_, expF := math.Frexp(fraction)
_, expR := math.Frexp(1 - fraction)
precision = min(NumHexDigits, max(precision+expF/-hexBits, precision+expR/-hexBits))

// Compute the threshold
scaled := uint64(math.Round(fraction * float64(MaxAdjustedCount)))
threshold := MaxAdjustedCount - scaled

// Round to the specified precision, if less than the maximum.
if shift := hexBits * (NumHexDigits - precision); shift != 0 {
half := uint64(1) << (shift - 1)
threshold += half
threshold >>= shift
threshold <<= shift
Comment on lines +46 to +71
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note I removed ErrPrecisionUnderflow because it was not a meaningful error condition -- I would otherwise work around the problem. While investigating the head-sampler changes for OTel-Go I found a much nicer solution, copied here (with new testing).

}

scaled := uint64(math.Round(prob * MaxAdjustedCount))
rscaled := MaxAdjustedCount - scaled
shift := 4 * (14 - prec)
half := uint64(1) << (shift - 1)

rscaled += half
rscaled >>= shift
rscaled <<= shift

return Threshold{
unsigned: rscaled,
unsigned: threshold,
}, nil
}

// Probability is the sampling ratio in the range [MinSamplingProb, 1].
func (t Threshold) Probability() float64 {
return float64(MaxAdjustedCount-t.unsigned) / MaxAdjustedCount
return float64(MaxAdjustedCount-t.unsigned) / float64(MaxAdjustedCount)
}
7 changes: 5 additions & 2 deletions pkg/sampling/probability_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -171,7 +171,10 @@ func ExampleProbabilityToThreshold_verysmall() {
0x8p-56, // Skip 8 out of 2**56
0x10p-56, // Skip 0x10 out of 2**56
} {
tval, _ := ProbabilityToThreshold(prob)
// Note that precision is automatically raised for
// such small probabilities, because leading 'f' and
// '0' digits are discounted.
tval, _ := ProbabilityToThresholdWithPrecision(prob, 3)
fmt.Println(tval.TValue())
}

Expand Down Expand Up @@ -279,7 +282,7 @@ func TestProbabilityToThresholdWithPrecision(t *testing.T) {
for len(strip) > 0 && strip[0] == '0' {
strip = strip[1:]
}
rth, err := ProbabilityToThresholdWithPrecision(test.prob, uint8(len(strip)))
rth, err := ProbabilityToThresholdWithPrecision(test.prob, len(strip))
require.NoError(t, err)
require.Equal(t, round, rth.TValue())
})
Expand Down
Loading
Loading