Skip to content

Commit

Permalink
fix(w3c): fix traceparent validation (#4791)
Browse files Browse the repository at this point in the history
## Description

Running the W3C trace context parametric tests (introduced
[here](DataDog/system-tests#691)) surfaced four
issues with how the python tracer propagates traceparent and tracestate
headers. These issues are described below and are fixed in this PR.

### Edge Case 1 (most important)
Currently `trace_parent.islower()` is used to check if a traceparent
contains lower case letters and digits (uppercase letters are invalid).
Unfortunately `trace_parent.islower()` also evaluates to False if there
are no lowercase letters are present in the traceparent (ex:
traceparent=00-12312312003-1232131321 would be marked as invalid and
dropped). This is incorrect. Fix:
ddb9dc5.

Since the traceparent should contain 48 randomly generated hex values it
is unlikely to only contain digits. This edge case is very unlikely to
occur.

Current behavior:
- `00-123123123-23123213AAA` -> uppercase letter detected traceparent is
ignored
- `00-123123123-23123213132` -> no lowercase letter detected traceparent
is ignored [WRONG]
- `00-123123123-2312322aaaa` -> lowercase letter detected traceparent is
propagated


Expected behavior:
- `00-123123123-23123213132AAAA` -> uppercase detected traceparent is
ignored
- `00-123123123-23123213132` -> no uppercase detected traceparent is
propagated
- `00-123123123-23123213132aaaa` -> no uppercase detected traceparent is
propagated


### Edge Case 2

If traceparent contains an invalid characters (ex: period) do not
propagate the traceparent. Start a new trace. Currently we can propagate
invalid trace_ids (this is because we truncate the trace_id and only
convert the last 16 digits to hex, we ignore all the characters in the
first 16 digits). Fix: ae1e73e

### Edge Case 3

Ensure that traceflags contain ONLY two digits (`00` or `01`). `000` and
`001` are examples of invalid values that should not be propagated. Fix:
adc4f80


### Edge Case 4

Ensure traceparent version contain ONLY two digits (`00` or `01`). `000`
and `001` are examples of invalid values that should not be propagated.
Fix: 2e416c5


### Edge Case 5

The W3C specification is additive. Although only traceparents with the
version `00` are supported we should attempt to parse traceparent with
different formats.

Example:
"01-12345678901234567890123456789012-1234567890123456-01-what-the-future-will-be-like"

In the traceparent example above we should parse the version, trace id,
span id, and sample flag and ignore the trailing values. Fix:
4cebe7514682787f6068754037fba569f4af3d60

### Edge Case 6

This edge case is not addressed in this PR. I am including it here for
completeness. In the W3C tracecontext specification a tracer SHOULD set
two http headers, one header should set the tracestate and the other
should set the traceparent. However, if duplicate traceparent and
tracestate headers are received, the tracer must processes and reconcile
these headers (logic: 1) receive duplicate traceparent headers with
different values -> drop these values and start a new trace. 2) receive
duplicate tracestates with different tags -> combine the tracestates and
propagate it).

In the ddtrace library http headers are added to a dictionary, so
duplicate http header values are overwritten. To address this edge case
the tracer must be able to store detect and store all key value pairs in
http headers. Since this edge case requires significant changes to
distributed tracing and does not resolve a critical issue this work can
be deferred.

## Testing 

Testing will covered by system tests.

## Reviewer Checklist
- [ ] Title is accurate.
- [ ] Description motivates each change.
- [ ] No unnecessary changes were introduced in this PR.
- [ ] Avoid breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [ ] Tests provided or description of manual testing performed is
included in the code or PR.
- [ ] Release note has been added and follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines),
or else `changelog/no-changelog` label added.
- [ ] All relevant GitHub issues are correctly linked.
- [ ] Backports are identified and tagged with Mergifyio.

Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com>
Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com>
  • Loading branch information
3 people authored Dec 19, 2022
1 parent 2690b2e commit a9ee02f
Show file tree
Hide file tree
Showing 2 changed files with 80 additions and 40 deletions.
73 changes: 46 additions & 27 deletions ddtrace/propagation/http.py
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,23 @@ def _possible_header(header):
_POSSIBLE_HTTP_HEADER_TRACESTATE = _possible_header(_HTTP_HEADER_TRACESTATE)


# https://www.w3.org/TR/trace-context/#traceparent-header-field-values
# Future proofing: The traceparent spec is additive, future traceparent versions may contain more than 4 values
# The regex below matches the version, trace id, span id, sample flag, and end-string/future values (if version>00)
_TRACEPARENT_HEX_REGEX = re.compile(
r"""
^ # Start of string
([a-f0-9]{2})- # 2 character hex version
([a-f0-9]{32})- # 32 character hex trace id
([a-f0-9]{16})- # 16 character hex span id
([a-f0-9]{2}) # 2 character hex sample flag
(-.+)? # optional, start of any additional values
$ # end of string
""",
re.VERBOSE,
)


def _extract_header_value(possible_header_names, headers, default=None):
# type: (FrozenSet[str], Dict[str, str], Optional[str]) -> Optional[str]
for header in possible_header_names:
Expand Down Expand Up @@ -550,35 +567,41 @@ def _get_traceparent_values(tp):
Otherwise we extract the trace-id, span-id, and sampling priority from the
traceparent header.
"""
valid_tp_values = _TRACEPARENT_HEX_REGEX.match(tp.strip())
if valid_tp_values is None:
raise ValueError("Invalid traceparent version: %s" % tp)

(
version,
trace_id_hex,
span_id_hex,
trace_flags_hex,
future_vals,
) = valid_tp_values.groups() # type: Tuple[str, str, str, str, Optional[str]]

version, trace_id_hex, span_id_hex, trace_flags_hex = tp.strip().split("-")
# check version is a valid hexadecimal, if not it's invalid we will move on to the next prop method
int(version, 16)
# https://www.w3.org/TR/trace-context/#version
if version == "ff":
raise ValueError("'ff' is an invalid traceparent version")
# currently 00 is the only version format, but if future versions come up we may need to add changes
if version != "00":
# https://www.w3.org/TR/trace-context/#version
raise ValueError("ff is an invalid traceparent version: %s" % tp)
elif version != "00":
# currently 00 is the only version format, but if future versions come up we may need to add changes
log.warning("unsupported traceparent version:%r, still attempting to parse", version)
elif version == "00" and future_vals is not None:
raise ValueError("Traceparents with the version `00` should contain 4 values delimited by a dash: %s" % tp)

if len(trace_id_hex) == 32 and len(span_id_hex) == 16 and len(trace_flags_hex) >= 2:
trace_id = _hex_id_to_dd_id(trace_id_hex)
span_id = _hex_id_to_dd_id(span_id_hex)
trace_id = _hex_id_to_dd_id(trace_id_hex)
span_id = _hex_id_to_dd_id(span_id_hex)

# All 0s are invalid values
if trace_id == 0:
raise ValueError("0 value for trace_id is invalid")
if span_id == 0:
raise ValueError("0 value for span_id is invalid")
# All 0s are invalid values
if trace_id == 0:
raise ValueError("0 value for trace_id is invalid")
if span_id == 0:
raise ValueError("0 value for span_id is invalid")

trace_flags = _hex_id_to_dd_id(trace_flags_hex)
# there's currently only one trace flag, which denotes sampling priority
# was set to keep "01" or drop "00"
# trace flags is a bit field: https://www.w3.org/TR/trace-context/#trace-flags
sampling_priority = trace_flags & 0x1

else:
raise ValueError("W3C traceparent hex length incorrect: %s" % tp)
trace_flags = _hex_id_to_dd_id(trace_flags_hex)
# there's currently only one trace flag, which denotes sampling priority
# was set to keep "01" or drop "00"
# trace flags is a bit field: https://www.w3.org/TR/trace-context/#trace-flags
sampling_priority = trace_flags & 0x1

return trace_id, span_id, sampling_priority

Expand Down Expand Up @@ -644,10 +667,6 @@ def _extract(headers):
if tp is None:
log.debug("no traceparent header")
return None
# uppercase char in tp makes it invalid:
# https://www.w3.org/TR/trace-context/#traceparent-header-field-values
if not tp.islower():
raise ValueError("uppercase characters are not allowed in traceparent")
trace_id, span_id, sampling_priority = _TraceContext._get_traceparent_values(tp)
except (ValueError, AssertionError):
log.exception("received invalid w3c traceparent: %s ", tp)
Expand Down
47 changes: 34 additions & 13 deletions tests/tracer/test_propagation.py
Original file line number Diff line number Diff line change
Expand Up @@ -444,34 +444,52 @@ def test_tracecontext_get_sampling_priority(sampling_priority_tp, sampling_prior
None,
),
(
"01-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
"01-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01-what-the-future-looks-like",
# tp, trace_id, span_id, sampling_priority
(11803532876627986230, 67667974448284343, 1),
["unsupported traceparent version:'01', still attempting to parse"],
None,
),
(
"0-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
"00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01-v00-can-not-have-future-values",
# tp, trace_id, span_id, sampling_priority
(11803532876627986230, 67667974448284343, 1),
["unsupported traceparent version:'0', still attempting to parse"],
[],
ValueError,
),
(
"0-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
# tp, trace_id, span_id, sampling_priority
None,
[],
ValueError,
),
(
"ff-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
# tp, trace_id, span_id, sampling_priority
None,
[],
ValueError,
),
(
"00-4BF92K3577B34dA6C3ce929d0e0e4736-00f067aa0ba902b7-01",
# tp, trace_id, span_id, sampling_priority
None,
[],
ValueError,
),
(
"00-f92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
# tp, trace_id, span_id, sampling_priority
None,
[
"received invalid w3c traceparent: 00-f92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01.",
"W3C traceparent hex length incorrect: 00-f92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
],
[],
ValueError,
),
( # we still parse the trace flag and analyze the it as a bit field
"00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-02",
# tp, trace_id, span_id, sampling_priority
(11803532876627986230, 67667974448284343, 0),
None,
[],
None,
),
],
Expand All @@ -481,7 +499,10 @@ def test_tracecontext_get_sampling_priority(sampling_priority_tp, sampling_prior
"invalid_0_value_for_span_id",
"traceflag_00",
"unsupported_version",
"version_00_with_unsupported_trailing_values",
"short_version",
"invalid_version",
"traceparent_contains_uppercase_chars",
"short_trace_id",
"unknown_trace_flag",
],
Expand All @@ -490,14 +511,14 @@ def test_extract_traceparent(caplog, headers, expected_tuple, expected_logging,
with caplog.at_level(logging.DEBUG):
if expected_exception:
with pytest.raises(expected_exception):
traceparent_values = _TraceContext._get_traceparent_values(headers)
assert traceparent_values == expected_tuple
_TraceContext._get_traceparent_values(headers)
else:
traceparent_values = _TraceContext._get_traceparent_values(headers)
assert traceparent_values == expected_tuple
if caplog.text or expected_logging:
for expected_log in expected_logging:
assert expected_log in caplog.text

if caplog.text or expected_logging:
for expected_log in expected_logging:
assert expected_log in caplog.text


@pytest.mark.parametrize(
Expand Down

0 comments on commit a9ee02f

Please sign in to comment.