Add test for child span sampling issue #264

Merged (3 commits) on Aug 11, 2021

Contributor

@dvic dvic commented Aug 6, 2021

No description provided.

@dvic dvic requested a review from a team August 6, 2021 13:40

codecov bot commented Aug 6, 2021

Codecov Report

Merging #264 (bfdae7f) into main (c24b67c) will increase coverage by 0.03%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main     #264      +/-   ##
==========================================
+ Coverage   36.39%   36.42%   +0.03%     
==========================================
  Files          37       37              
  Lines        3168     3168              
==========================================
+ Hits         1153     1154       +1     
+ Misses       2015     2014       -1     
Flag       Coverage Δ
api        62.90% <ø> (ø)
elixir     16.05% <ø> (ø)
erlang     36.42% <ø> (+0.03%) ⬆️
exporter   19.60% <ø> (ø)
sdk        79.10% <ø> (+0.17%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files                                       Coverage Δ
apps/opentelemetry/src/otel_resource_detector.erl    92.85% <0.00%> (+1.42%) ⬆️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c24b67c...bfdae7f.

Contributor Author

dvic commented Aug 6, 2021

The reason it returns SampledFlag == true and IsRecording == false is that is_recording is set to false when end_span is called, which the spec also mentions. So this rule only applies to the result of the sampling procedure?

Contributor Author

dvic commented Aug 6, 2021

Anyway, I found the bug: the parent-based sampler pattern matched on is_remote=false, but the field was undefined.
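
For readers following along, the mismatch can be reproduced in isolation. This is only a minimal sketch: the module name, the trimmed-down #span_ctx{} record, the literal trace_flags = 1 check, and the catch-all clause are all assumptions made for illustration, not the actual sampler code.

```erlang
-module(is_remote_sketch).
-export([pick/1, demo/0]).

%% Hypothetical, simplified stand-in for the real #span_ctx{} record.
%% is_remote has no default, so it is the atom 'undefined' unless set.
-record(span_ctx, {trace_flags = 0 :: integer(),
                   is_remote :: boolean() | undefined}).

%% Clauses that match explicitly on is_remote, followed by an assumed
%% catch-all clause for everything else.
pick(#span_ctx{trace_flags = 1, is_remote = true}) -> remote_parent_sampled;
pick(#span_ctx{is_remote = true}) -> remote_parent_not_sampled;
pick(#span_ctx{trace_flags = 1, is_remote = false}) -> local_parent_sampled;
pick(#span_ctx{}) -> local_parent_not_sampled.

demo() ->
    %% is_remote left unset: 'undefined' does not match 'false', so a
    %% sampled local parent falls through to the catch-all clause.
    local_parent_not_sampled = pick(#span_ctx{trace_flags = 1}),
    %% Once is_remote is explicitly false, the intended clause matches.
    local_parent_sampled = pick(#span_ctx{trace_flags = 1, is_remote = false}),
    ok.
```

Calling is_remote_sketch:demo() returns ok only if both matches hold, which shows how an unset field can silently select a different clause.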

@dvic dvic changed the title Add failing test for child span sampling issue Fix parent-based sampler picking the wrong sampler Aug 6, 2021
dvic added a commit to qdentity/opentelemetry-erlang that referenced this pull request Aug 6, 2021
@dvic dvic mentioned this pull request Aug 6, 2021
@@ -135,12 +135,11 @@ parent_based_sampler(#span_ctx{trace_flags=TraceFlags,
 parent_based_sampler(#span_ctx{is_remote=true}, #{remote_parent_not_sampled := SamplerAndOpts}) ->
     SamplerAndOpts;
 %% local parent sampled
-parent_based_sampler(#span_ctx{trace_flags=TraceFlags,
-                               is_remote=false}, #{local_parent_sampled := SamplerAndOpts})
+parent_based_sampler(#span_ctx{trace_flags=TraceFlags}, #{local_parent_sampled := SamplerAndOpts})
Member

But now this returns the local_parent_sampled sampler without verifying that the span is not remote?

Is the issue instead that is_remote should be set to false rather than left undefined?

Contributor Author

That check was not needed, because is_remote=true is already pattern matched in the clauses above. I have now changed the fix to add is_remote=false in root_span_ctx() instead; I'm not sure whether this is the only place that needs it. By the way, the typespec is still boolean() | undefined. Not sure if you want to leave it like that, or also search for other places and make it boolean().

Contributor Author

There is get_ctx in otel_span_ets, which is another potential candidate for setting is_remote=false.

Member

I've got to check the spec this morning. This is likely a result of the multiple iterations done in the spec on how remote spans were created and stored. At one point they were stored in a separate context key from the current span context key and then I think that was switched to a remote field, but it has been so long I need to refamiliarize myself.

Member

Oh wait, now it's coming back to me. The is_remote field was also not needed when using what are called "nonrecording spans" for the extracted remote spans. But the spec does still include is_remote. I'm not 100% sure whether is_remote=true can simply be set in non_recording_span/3 or whether it needs to be done in the propagator itself... But either way, one of those two is the fix.

Contributor Author

Ah, in that case the initial fix (not setting is_remote=true but implicitly checking for false or undefined) makes more sense to me. We could also change it to explicitly check for false and undefined. Let me know what you prefer.
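
As a rough illustration of the "explicitly check for false and undefined" option, a guard can accept both values. This is a sketch under assumptions: the module name, the is_local/1 helper, and the reduced record are hypothetical and do not come from this PR.

```erlang
-module(explicit_local_check).
-export([is_local/1, demo/0]).

%% Hypothetical, simplified record; is_remote defaults to 'undefined'.
-record(span_ctx, {trace_flags = 0, is_remote}).

%% Treat a span context as local when is_remote is false or was never set.
is_local(#span_ctx{is_remote = IsRemote})
  when IsRemote =:= false; IsRemote =:= undefined ->
    true;
is_local(#span_ctx{}) ->
    false.

demo() ->
    true = is_local(#span_ctx{}),
    true = is_local(#span_ctx{is_remote = false}),
    false = is_local(#span_ctx{is_remote = true}),
    ok.
```

The trade-off is that every clause caring about locality has to repeat the guard, whereas a default value on the field keeps the pattern matches unchanged.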

Member

Here is my PR that I think will fix it: #265. I went with defaulting is_remote to false.
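
A minimal sketch of what defaulting the field could look like, assuming the default lives in the record definition. I have not reproduced the code from #265 here; the module name, the reduced record, and the two-clause pick/1 function are hypothetical.

```erlang
-module(default_is_remote).
-export([pick/1, demo/0]).

%% Hypothetical simplified record; the only point is the is_remote default.
-record(span_ctx, {trace_flags = 0 :: integer(),
                   is_remote = false :: boolean()}).

%% A clause that matches on is_remote = false, as in the original sampler
%% clause for a sampled local parent.
pick(#span_ctx{trace_flags = 1, is_remote = false}) -> local_parent_sampled;
pick(#span_ctx{}) -> other.

demo() ->
    %% is_remote is not set by the caller, yet the clause matches because
    %% the record default fills in 'false'.
    local_parent_sampled = pick(#span_ctx{trace_flags = 1}),
    ok.
```

With a default in place the typespec can be narrowed to boolean(), and callers that build a span context without mentioning is_remote get a local context.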

Contributor Author

LGTM! Do you want to keep the Elixir test in this PR, or add an Erlang test in #265 so that this PR can be closed?

@tsloughter
Member

The test is good to keep, just need the is_remote part removed from span utils in this PR.

Can you also rebase this down to 1 commit instead of 4?

Contributor Author

dvic commented Aug 8, 2021

> The test is good to keep, just need the is_remote part removed from span utils in this PR.
>
> Can you also rebase this down to 1 commit instead of 4?

Done :)

@dvic dvic changed the title Fix parent-based sampler picking the wrong sampler Add test for child span sampling issue Aug 9, 2021
@tsloughter tsloughter merged commit 24804a7 into open-telemetry:main Aug 11, 2021