Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(w3c): fix traceparent validation #4791

Merged
merged 15 commits into from
Dec 19, 2022

Conversation

mabdinur
Copy link
Contributor

@mabdinur mabdinur commented Dec 14, 2022

Description

Running the W3C trace context parametric tests (introduced here) surfaced four issues with how the python tracer propagates traceparent and tracestate headers. These issues are described below and are fixed in this PR.

Edge Case 1 (most important)

Currently trace_parent.islower() is used to check if a traceparent contains lower case letters and digits (uppercase letters are invalid). Unfortunately trace_parent.islower() also evaluates to False if there are no lowercase letters are present in the traceparent (ex: traceparent=00-12312312003-1232131321 would be marked as invalid and dropped). This is incorrect. Fix: ddb9dc5.

Since the traceparent should contain 48 randomly generated hex values it is unlikely to only contain digits. This edge case is very unlikely to occur.

Current behavior:

  • 00-123123123-23123213AAA -> uppercase letter detected traceparent is ignored
  • 00-123123123-23123213132 -> no lowercase letter detected traceparent is ignored [WRONG]
  • 00-123123123-2312322aaaa -> lowercase letter detected traceparent is propagated

Expected behavior:

  • 00-123123123-23123213132AAAA -> uppercase detected traceparent is ignored
  • 00-123123123-23123213132 -> no uppercase detected traceparent is propagated
  • 00-123123123-23123213132aaaa -> no uppercase detected traceparent is propagated

Edge Case 2

If traceparent contains an invalid characters (ex: period) do not propagate the traceparent. Start a new trace. Currently we can propagate invalid trace_ids (this is because we truncate the trace_id and only convert the last 16 digits to hex, we ignore all the characters in the first 16 digits). Fix: ae1e73e

Edge Case 3

Ensure that traceflags contain ONLY two digits (00 or 01). 000 and 001 are examples of invalid values that should not be propagated. Fix: adc4f80

Edge Case 4

Ensure traceparent version contain ONLY two digits (00 or 01). 000 and 001 are examples of invalid values that should not be propagated. Fix: 2e416c5

Edge Case 5

The W3C specification is additive. Although only traceparents with the version 00 are supported we should attempt to parse traceparent with different formats.

Example: "01-12345678901234567890123456789012-1234567890123456-01-what-the-future-will-be-like"

In the traceparent example above we should parse the version, trace id, span id, and sample flag and ignore the trailing values. Fix: 4cebe7514682787f6068754037fba569f4af3d60

Edge Case 6

This edge case is not addressed in this PR. I am including it here for completeness. In the W3C tracecontext specification a tracer SHOULD set two http headers, one header should set the tracestate and the other should set the traceparent. However, if duplicate traceparent and tracestate headers are received, the tracer must processes and reconcile these headers (logic: 1) receive duplicate traceparent headers with different values -> drop these values and start a new trace. 2) receive duplicate tracestates with different tags -> combine the tracestates and propagate it).

In the ddtrace library http headers are added to a dictionary, so duplicate http header values are overwritten. To address this edge case the tracer must be able to store detect and store all key value pairs in http headers. Since this edge case requires significant changes to distributed tracing and does not resolve a critical issue this work can be deferred.

Testing

Testing will covered by system tests.

Reviewer Checklist

  • Title is accurate.
  • Description motivates each change.
  • No unnecessary changes were introduced in this PR.
  • Avoid breaking API changes unless absolutely necessary.
  • Tests provided or description of manual testing performed is included in the code or PR.
  • Release note has been added and follows the library release note guidelines, or else changelog/no-changelog label added.
  • All relevant GitHub issues are correctly linked.
  • Backports are identified and tagged with Mergifyio.

@mabdinur mabdinur added the changelog/no-changelog A changelog entry is not required for this PR. label Dec 14, 2022
@mabdinur mabdinur force-pushed the w3c/munir/fix-system-test-edgecases branch 2 times, most recently from 8b54d09 to 13a9c19 Compare December 14, 2022 23:37
edge case 2 future proofing: traceparent may contain more than 4 values
@mabdinur mabdinur force-pushed the w3c/munir/fix-system-test-edgecases branch from 13a9c19 to ae1e73e Compare December 14, 2022 23:58
@mabdinur mabdinur marked this pull request as ready for review December 15, 2022 14:34
@mabdinur mabdinur requested a review from a team as a code owner December 15, 2022 14:34
ddtrace/propagation/http.py Outdated Show resolved Hide resolved
ddtrace/propagation/http.py Outdated Show resolved Hide resolved
ddtrace/propagation/http.py Outdated Show resolved Hide resolved
Yun-Kim
Yun-Kim previously approved these changes Dec 15, 2022
Copy link
Member

@brettlangdon brettlangdon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some minor nits and suggestions for additional tests, but otherwise lgtm

tests/tracer/test_propagation.py Show resolved Hide resolved
tests/tracer/test_propagation.py Outdated Show resolved Hide resolved
ddtrace/propagation/http.py Outdated Show resolved Hide resolved
ddtrace/propagation/http.py Outdated Show resolved Hide resolved
ddtrace/propagation/http.py Outdated Show resolved Hide resolved
ddtrace/propagation/http.py Outdated Show resolved Hide resolved
ddtrace/propagation/http.py Outdated Show resolved Hide resolved
mabdinur and others added 3 commits December 15, 2022 13:31
Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com>
Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com>
@mabdinur mabdinur enabled auto-merge (squash) December 15, 2022 18:59
@mabdinur mabdinur force-pushed the w3c/munir/fix-system-test-edgecases branch from 48b5712 to 4cebe75 Compare December 15, 2022 19:12
@mabdinur mabdinur force-pushed the w3c/munir/fix-system-test-edgecases branch 3 times, most recently from 10cfb2c to 2fa3a0a Compare December 15, 2022 20:49
@codecov-commenter
Copy link

codecov-commenter commented Dec 16, 2022

Codecov Report

Merging #4791 (1a227ab) into 1.x (95d071e) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##              1.x    #4791   +/-   ##
=======================================
  Coverage   78.03%   78.03%           
=======================================
  Files         808      808           
  Lines       62062    62062           
=======================================
  Hits        48431    48431           
  Misses      13631    13631           
Impacted Files Coverage Δ
ddtrace/propagation/http.py 95.62% <100.00%> (+0.30%) ⬆️
tests/tracer/test_propagation.py 99.39% <100.00%> (-0.01%) ⬇️
ddtrace/internal/remoteconfig/worker.py 95.65% <0.00%> (-4.35%) ⬇️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

mabdinur and others added 2 commits December 19, 2022 12:12
clean up regex comments

Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com>
@mabdinur mabdinur force-pushed the w3c/munir/fix-system-test-edgecases branch from bc7087b to 23c3517 Compare December 19, 2022 17:14
Yun-Kim
Yun-Kim previously approved these changes Dec 19, 2022
ddtrace/propagation/http.py Outdated Show resolved Hide resolved
Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com>
@mabdinur mabdinur force-pushed the w3c/munir/fix-system-test-edgecases branch from 0dd1f3f to 38d02db Compare December 19, 2022 19:06
@brettlangdon brettlangdon merged commit a9ee02f into 1.x Dec 19, 2022
@brettlangdon brettlangdon deleted the w3c/munir/fix-system-test-edgecases branch December 19, 2022 20:30
@github-actions github-actions bot added this to the v1.8.0 milestone Dec 19, 2022
mabdinur added a commit that referenced this pull request Dec 19, 2022
## Description

Running the W3C trace context parametric tests (introduced
[here](DataDog/system-tests#691)) surfaced four
issues with how the python tracer propagates traceparent and tracestate
headers. These issues are described below and are fixed in this PR.

### Edge Case 1 (most important)
Currently `trace_parent.islower()` is used to check if a traceparent
contains lower case letters and digits (uppercase letters are invalid).
Unfortunately `trace_parent.islower()` also evaluates to False if there
are no lowercase letters are present in the traceparent (ex:
traceparent=00-12312312003-1232131321 would be marked as invalid and
dropped). This is incorrect. Fix:
ddb9dc5.

Since the traceparent should contain 48 randomly generated hex values it
is unlikely to only contain digits. This edge case is very unlikely to
occur.

Current behavior:
- `00-123123123-23123213AAA` -> uppercase letter detected traceparent is
ignored
- `00-123123123-23123213132` -> no lowercase letter detected traceparent
is ignored [WRONG]
- `00-123123123-2312322aaaa` -> lowercase letter detected traceparent is
propagated


Expected behavior:
- `00-123123123-23123213132AAAA` -> uppercase detected traceparent is
ignored
- `00-123123123-23123213132` -> no uppercase detected traceparent is
propagated
- `00-123123123-23123213132aaaa` -> no uppercase detected traceparent is
propagated


### Edge Case 2

If traceparent contains an invalid characters (ex: period) do not
propagate the traceparent. Start a new trace. Currently we can propagate
invalid trace_ids (this is because we truncate the trace_id and only
convert the last 16 digits to hex, we ignore all the characters in the
first 16 digits). Fix: ae1e73e

### Edge Case 3

Ensure that traceflags contain ONLY two digits (`00` or `01`). `000` and
`001` are examples of invalid values that should not be propagated. Fix:
adc4f80


### Edge Case 4

Ensure traceparent version contain ONLY two digits (`00` or `01`). `000`
and `001` are examples of invalid values that should not be propagated.
Fix: 2e416c5


### Edge Case 5

The W3C specification is additive. Although only traceparents with the
version `00` are supported we should attempt to parse traceparent with
different formats.

Example:
"01-12345678901234567890123456789012-1234567890123456-01-what-the-future-will-be-like"

In the traceparent example above we should parse the version, trace id,
span id, and sample flag and ignore the trailing values. Fix:
4cebe7514682787f6068754037fba569f4af3d60

### Edge Case 6

This edge case is not addressed in this PR. I am including it here for
completeness. In the W3C tracecontext specification a tracer SHOULD set
two http headers, one header should set the tracestate and the other
should set the traceparent. However, if duplicate traceparent and
tracestate headers are received, the tracer must processes and reconcile
these headers (logic: 1) receive duplicate traceparent headers with
different values -> drop these values and start a new trace. 2) receive
duplicate tracestates with different tags -> combine the tracestates and
propagate it).

In the ddtrace library http headers are added to a dictionary, so
duplicate http header values are overwritten. To address this edge case
the tracer must be able to store detect and store all key value pairs in
http headers. Since this edge case requires significant changes to
distributed tracing and does not resolve a critical issue this work can
be deferred.

## Testing 

Testing will covered by system tests.

## Reviewer Checklist
- [ ] Title is accurate.
- [ ] Description motivates each change.
- [ ] No unnecessary changes were introduced in this PR.
- [ ] Avoid breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [ ] Tests provided or description of manual testing performed is
included in the code or PR.
- [ ] Release note has been added and follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines),
or else `changelog/no-changelog` label added.
- [ ] All relevant GitHub issues are correctly linked.
- [ ] Backports are identified and tagged with Mergifyio.

Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com>
Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com>
brettlangdon added a commit that referenced this pull request Dec 19, 2022
## Description

Running the W3C trace context parametric tests (introduced
[here](DataDog/system-tests#691)) surfaced four
issues with how the python tracer propagates traceparent and tracestate
headers. These issues are described below and are fixed in this PR.

### Edge Case 1 (most important)
Currently `trace_parent.islower()` is used to check if a traceparent
contains lower case letters and digits (uppercase letters are invalid).
Unfortunately `trace_parent.islower()` also evaluates to False if there
are no lowercase letters are present in the traceparent (ex:
traceparent=00-12312312003-1232131321 would be marked as invalid and
dropped). This is incorrect. Fix:
ddb9dc5.

Since the traceparent should contain 48 randomly generated hex values it
is unlikely to only contain digits. This edge case is very unlikely to
occur.

Current behavior:
- `00-123123123-23123213AAA` -> uppercase letter detected traceparent is
ignored
- `00-123123123-23123213132` -> no lowercase letter detected traceparent
is ignored [WRONG]
- `00-123123123-2312322aaaa` -> lowercase letter detected traceparent is
propagated


Expected behavior:
- `00-123123123-23123213132AAAA` -> uppercase detected traceparent is
ignored
- `00-123123123-23123213132` -> no uppercase detected traceparent is
propagated
- `00-123123123-23123213132aaaa` -> no uppercase detected traceparent is
propagated


### Edge Case 2

If traceparent contains an invalid characters (ex: period) do not
propagate the traceparent. Start a new trace. Currently we can propagate
invalid trace_ids (this is because we truncate the trace_id and only
convert the last 16 digits to hex, we ignore all the characters in the
first 16 digits). Fix: ae1e73e

### Edge Case 3

Ensure that traceflags contain ONLY two digits (`00` or `01`). `000` and
`001` are examples of invalid values that should not be propagated. Fix:
adc4f80


### Edge Case 4

Ensure traceparent version contain ONLY two digits (`00` or `01`). `000`
and `001` are examples of invalid values that should not be propagated.
Fix: 2e416c5


### Edge Case 5

The W3C specification is additive. Although only traceparents with the
version `00` are supported we should attempt to parse traceparent with
different formats.

Example:

"01-12345678901234567890123456789012-1234567890123456-01-what-the-future-will-be-like"

In the traceparent example above we should parse the version, trace id,
span id, and sample flag and ignore the trailing values. Fix:
4cebe7514682787f6068754037fba569f4af3d60

### Edge Case 6

This edge case is not addressed in this PR. I am including it here for
completeness. In the W3C tracecontext specification a tracer SHOULD set
two http headers, one header should set the tracestate and the other
should set the traceparent. However, if duplicate traceparent and
tracestate headers are received, the tracer must processes and reconcile
these headers (logic: 1) receive duplicate traceparent headers with
different values -> drop these values and start a new trace. 2) receive
duplicate tracestates with different tags -> combine the tracestates and
propagate it).

In the ddtrace library http headers are added to a dictionary, so
duplicate http header values are overwritten. To address this edge case
the tracer must be able to store detect and store all key value pairs in
http headers. Since this edge case requires significant changes to
distributed tracing and does not resolve a critical issue this work can
be deferred.

## Testing 

Testing will covered by system tests.

## Reviewer Checklist
- [ ] Title is accurate.
- [ ] Description motivates each change.
- [ ] No unnecessary changes were introduced in this PR.
- [ ] Avoid breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [ ] Tests provided or description of manual testing performed is
included in the code or PR.
- [ ] Release note has been added and follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines),
or else `changelog/no-changelog` label added.
- [ ] All relevant GitHub issues are correctly linked.
- [ ] Backports are identified and tagged with Mergifyio.

Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com>
Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com>

Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com>
Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog/no-changelog A changelog entry is not required for this PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants