
Create shared context for updating span pipeline from TracerProvider and affecting Tracer. #650

Merged
merged 21 commits into open-telemetry:main from wip-trace-context on Apr 16, 2021

Conversation

@jsuereth (Contributor) commented Apr 5, 2021

  • Create a new TracerContext class that provides thread-safe access to the Sampler + SpanProcessor.
  • Route all calls to the SpanProcessor through TracerContext, ensuring we abide by the specification, where TracerProvider calls MUST impact previously created Tracers.
  • Update all examples for the new API.
  • Add a "stub" API for appending new SpanProcessors to a given TracerContext.

Replaces #590 with only the shared context portion.
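
For orientation, a minimal sketch of what such a shared context could look like. This is illustrative only: the bare-bones SpanProcessor and Sampler stand-ins and the exact signatures are assumptions, not this PR's actual header.

#include <memory>
#include <utility>

// Hypothetical stand-ins for the SDK's SpanProcessor and Sampler interfaces.
class SpanProcessor
{
public:
  virtual ~SpanProcessor() = default;
};

class Sampler
{
public:
  virtual ~Sampler() = default;
};

// Each Tracer holds a shared_ptr to this context, so configuration changes
// made through the TracerProvider are visible to Tracers created earlier.
class TracerContext
{
public:
  TracerContext(std::unique_ptr<SpanProcessor> processor,
                std::shared_ptr<Sampler> sampler) noexcept
      : processor_(std::move(processor)), sampler_(std::move(sampler))
  {}

  SpanProcessor &GetActiveProcessor() const noexcept { return *processor_; }
  std::shared_ptr<Sampler> GetSampler() const noexcept { return sampler_; }

  // Stub: a full implementation would fold the new processor into an
  // aggregate of all registered processors.
  void RegisterPipeline(std::unique_ptr<SpanProcessor> processor) noexcept;

private:
  std::unique_ptr<SpanProcessor> processor_;
  std::shared_ptr<Sampler> sampler_;
};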

@jsuereth jsuereth marked this pull request as ready for review April 5, 2021 20:27
@jsuereth jsuereth requested a review from a team April 5, 2021 20:27
@codecov (bot) commented Apr 5, 2021

Codecov Report

Merging #650 (a0999fc) into main (b0f27a4) will decrease coverage by 0.01%.
The diff coverage is 90.41%.


@@            Coverage Diff             @@
##             main     #650      +/-   ##
==========================================
- Coverage   94.66%   94.65%   -0.02%     
==========================================
  Files         204      206       +2     
  Lines        9418     9381      -37     
==========================================
- Hits         8916     8880      -36     
+ Misses        502      501       -1     
Impacted Files Coverage Δ
sdk/src/trace/span.h 100.00% <ø> (ø)
sdk/src/trace/tracer_provider.cc 76.47% <72.72%> (+8.28%) ⬆️
sdk/src/trace/tracer_context.cc 75.00% <75.00%> (ø)
ext/test/zpages/tracez_data_aggregator_test.cc 96.35% <100.00%> (ø)
ext/test/zpages/tracez_processor_test.cc 98.71% <100.00%> (+<0.01%) ⬆️
...clude/opentelemetry/sdk/common/atomic_unique_ptr.h 100.00% <100.00%> (ø)
sdk/include/opentelemetry/sdk/trace/tracer.h 100.00% <100.00%> (ø)
sdk/src/trace/span.cc 89.58% <100.00%> (ø)
sdk/src/trace/tracer.cc 76.92% <100.00%> (+1.92%) ⬆️
sdk/test/trace/tracer_provider_test.cc 100.00% <100.00%> (ø)
... and 5 more

@lalitb (Member) left a comment

LGTM. Thanks for the changes.

"//exporters/ostream:ostream_span_exporter",
"//sdk/src/trace",
],
)
Member

Thanks for adding the bazel build for this.

// 1. If the existing processor is an "AggregateProcessor", append the new processor to it.
// 2. If the existing processor is NOT an "AggregateProcessor", create a new aggregate of this
//    and the other, then replace our atomic ptr with the new aggregate.
Member

Though it would be a separate PR, just curious how we'd figure out whether the existing processor is an AggregateProcessor :)

Contributor Author

Oh, wow, yeah, I just copy-pasted the documentation. I'll rephrase this for now until the next PR lands.
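
As a sketch of how the aggregate check could work (the AggregateProcessor here is hypothetical; it lands in a later PR), RTTI via dynamic_cast is one way to distinguish the two cases from the comment above:

#include <memory>
#include <utility>
#include <vector>

class SpanProcessor
{
public:
  virtual ~SpanProcessor() = default;
};

// Hypothetical aggregate that fans calls out to several child processors.
class AggregateProcessor : public SpanProcessor
{
public:
  void Append(std::unique_ptr<SpanProcessor> p) { children_.push_back(std::move(p)); }

private:
  std::vector<std::unique_ptr<SpanProcessor>> children_;
};

void RegisterPipeline(std::unique_ptr<SpanProcessor> &current,
                      std::unique_ptr<SpanProcessor> incoming)
{
  if (auto *aggregate = dynamic_cast<AggregateProcessor *>(current.get()))
  {
    // Case 1: already an aggregate, just append the new processor.
    aggregate->Append(std::move(incoming));
  }
  else
  {
    // Case 2: wrap the existing and the new processor in a fresh aggregate,
    // then swap it in (the real SDK would do this via its atomic pointer).
    auto wrapper = std::make_unique<AggregateProcessor>();
    wrapper->Append(std::move(current));
    wrapper->Append(std::move(incoming));
    current = std::move(wrapper);
  }
}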

@seemk (Contributor) left a comment

I apologize for being out of the loop a bit, but what was the problem this (and #590) solved in the context of Resource SDK? Just to get up to speed, thanks!

@lalitb (Member) commented Apr 7, 2021

I apologize for being out of the loop a bit, but what was the problem this (and #590) solved in the context of Resource SDK? Just to get up to speed, thanks!

@seemk - This PR is to ensure that the tracer-provider has sole ownership of all the configuration (sampler, processor, resources). You can follow the discussions in #580, specifically these comments:

#580 (comment)
#580 (comment)
#580 (comment) ( the final proposal which is implemented in this PR).

@pyohannes (Contributor) left a comment

Thanks for doing that work and putting this chunk into its own PR. This makes it much easier to reason about and to review.

Comment on lines +51 to +56
* Obtain the (conceptual) active processor.
*
* Note: When more than one processor is active, this will
* return an "aggregate" processor
*/
SpanProcessor &GetActiveProcessor() const noexcept;
Contributor

What would be an inactive processor?

Member

I don't think there is any inactive processor. @jsuereth can confirm though :)

Contributor Author

'inactive' would be any constructed but not registered processor.

I'm not tied to this name, just took it from Java ;)

Contributor

Perhaps just name it GetProcessor to avoid confusion.

* Note: When more than one processor is active, this will
* return an "aggregate" processor
*/
SpanProcessor &GetActiveProcessor() const noexcept;
Contributor

We might theoretically run into an issue here when some tracer does work with a processor obtained via GetActiveProcessor while another thread calls RegisterPipeline, causing the memory pointed to by the obtained processor to be freed.

Contributor Author

That shouldn't be the case. The interface is RegisterPipeline, and it's a shared pointer. The TracerContext can guarantee that ALL registered processors remain alive for its own lifecycle. I.e., we're not going to delete a pipeline; we're only going to allow registration of new ones, specifically to prevent this issue.

We have the inverse problem around Recordable, though: if we have an "active" span that did NOT get a recordable registered when a new pipeline was registered, what happens when that span ends?

If you look at my design around "ExportableSpan", where a pipeline can ask for its instance of Recordable (and handle a missing recordable), that's how I planned to solve that issue.
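
A small sketch of the lifetime argument being made here, under the assumption that registration is append-only (names are illustrative, not from the PR):

#include <memory>
#include <mutex>
#include <vector>

class SpanProcessor
{
public:
  virtual ~SpanProcessor() = default;
};

// Registration only ever appends, so a SpanProcessor reference handed out
// earlier can never dangle while the context itself is alive.
class TracerContextSketch
{
public:
  void RegisterPipeline(std::shared_ptr<SpanProcessor> processor)
  {
    std::lock_guard<std::mutex> guard(mu_);
    processors_.push_back(std::move(processor));  // append-only, never freed
  }

private:
  std::mutex mu_;
  std::vector<std::shared_ptr<SpanProcessor>> processors_;
};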

Contributor

I think I understand. So if a new pipeline is registered, the old one isn't just overwritten, but kept in store and just made inactive?

Member

My understanding was that all the pipelines (i.e., processors) remain active, and we can only add a new pipeline but can't remove an existing one. And once we add multi-processor support, TracerContext would actually maintain a unique_ptr to (say) a CompositeSpanProcessor, which further manages all the active pipelines/processors?

Contributor Author

Yeah, I'm adopting the design I've seen elsewhere, where you can "append a pipeline" but can't remove an existing pipeline.

I.e. once a pipeline is active, it's forever active. We don't have a register/unregister mechanism.
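
A sketch of the append-only composite being discussed (the CompositeSpanProcessor name and interface are assumptions based on this thread, not code from the PR):

#include <memory>
#include <utility>
#include <vector>

// Minimal stand-in for the SDK's SpanProcessor interface.
class SpanProcessor
{
public:
  virtual ~SpanProcessor() = default;
  virtual bool ForceFlush() noexcept { return true; }
  virtual bool Shutdown() noexcept { return true; }
};

// Forwards every call to each child, so TracerContext can treat "all active
// pipelines" as one processor. There is an Add path but no removal path.
class CompositeSpanProcessor : public SpanProcessor
{
public:
  void Add(std::unique_ptr<SpanProcessor> processor)
  {
    children_.push_back(std::move(processor));
  }

  bool ForceFlush() noexcept override
  {
    bool ok = true;
    for (auto &child : children_)
      ok = child->ForceFlush() && ok;  // flush all, report combined result
    return ok;
  }

  bool Shutdown() noexcept override
  {
    bool ok = true;
    for (auto &child : children_)
      ok = child->Shutdown() && ok;
    return ok;
  }

private:
  std::vector<std::unique_ptr<SpanProcessor>> children_;
};

Since children are only ever added, a reference returned by something like GetActiveProcessor() would stay valid for the lifetime of the context, which is the guarantee discussed above.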

const opentelemetry::sdk::resource::Resource &GetResource() const noexcept;

/**
* Force all active SpanProcessors to flush any buffered spans
Contributor

I'm a bit confused by "all active SpanProcessors". Wouldn't we just have a CompositeSpanProcessor, which would encapsulate flushing and shutdown? So that this is something the CompositeSpanProcessor has to worry about, and not the TracerContext?

Contributor

Personally, I wouldn't mind leaving ForceFlush and Shutdown out. GetActiveProcessor().ForceFlush() isn't that bad.

Contributor Author

This is for users of TracerContext to understand what it does. Yes, the implementation of this behavior may live in a "CompositeSpanProcessor", but that's not something that would be directly exposed; users just see "RegisterPipeline", "Shutdown", and "ActiveProcessor". I.e., we still need to document the behavior of this class and how to use its methods.

Note: I'm not tied to "active" processor, I borrowed the name from the Java SDK.

* @return The sampler for this tracer provider.
*/
std::shared_ptr<Sampler> GetSampler() const noexcept;
void RegisterProcessor(std::unique_ptr<SpanProcessor> processor) noexcept;
Contributor

We have RegisterProcessor here and RegisterPipeline on the TracerContext. Could we unify the terminology? I don't have a strong preference as to which term to use.

@@ -58,18 +58,15 @@ trace_api::SpanId GenerateRandomSpanId()
}

Span::Span(std::shared_ptr<Tracer> &&tracer,
std::shared_ptr<SpanProcessor> processor,
Contributor

There is a reason why the processor was put onto the span: this is to guarantee that the span/recordable gets ended on the same processor it was created with.

Otherwise, this could happen:

  1. A span/recordable a is created from processor/exporter A.
  2. The processor/exporter is changed, maybe replaced by a composite processor/exporter B.
  3. When the span/recordable a is ended it is passed to the processor/exporter B and cannot be handled in this context (as it was created from processor/exporter A).

We should decide if that's still a use case we want to support. Just putting it out there that this was considered in the initial design.
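
A sketch of the original design described above, with stand-in interfaces (the real SDK types differ): the span captures the processor that created its recordable, so ending the span always goes back to that same processor.

#include <memory>
#include <utility>

// Minimal stand-ins; the real interfaces take a Recordable by unique_ptr.
class Recordable
{
public:
  virtual ~Recordable() = default;
};

class SpanProcessor
{
public:
  virtual ~SpanProcessor() = default;
  virtual void OnEnd(std::unique_ptr<Recordable> recordable) noexcept = 0;
};

// The span keeps a shared_ptr to the processor that created its recordable,
// so End() hands the recordable back to a processor that knows its concrete
// type, even if the provider has since registered a different pipeline.
class Span
{
public:
  Span(std::shared_ptr<SpanProcessor> processor,
       std::unique_ptr<Recordable> recordable) noexcept
      : processor_(std::move(processor)), recordable_(std::move(recordable))
  {}

  void End() noexcept
  {
    if (recordable_ != nullptr)
    {
      processor_->OnEnd(std::move(recordable_));  // same processor as creation
    }
  }

private:
  std::shared_ptr<SpanProcessor> processor_;
  std::unique_ptr<Recordable> recordable_;
};

Under this scheme, the Recordable passed to an exporter is always of the type that exporter's processor created, which is the guarantee being weighed in this thread.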

Contributor Author

Ah, I had a different proposal where a processor asks for its recordable back later and receives a nullptr (or other "false" flag).

I can revert this for the other design. I think this is a use case that needs support; there's just more than one option for how to do it.

Contributor

I don't mind which option we take. I'm in favor though of guaranteeing that the Recordable passed to the exporter is of the correct type.

@lalitb (Member) commented Apr 14, 2021

@jsuereth - I'd like to merge this PR once you and @pyohannes think it's good to go; that will allow me to raise a PR for #664 on top of these changes :)

@jsuereth (Contributor Author) commented

@lalitb Understood. So, to make progress, it'd be nice to know which of @pyohannes's comments are blocking and which to address now.

  • Active Processor terminology. I don't see a counter-proposal, and I haven't thought of another term. I'd vote to leave it similar to Java unless there's an alternative.
  • Keeping the SpanProcessor on the Span vs. relying on the "active span processor" in the tracer. I'm OK going either way here. My preference is what I've done in this PR, but I'm happy to change it.
  • I can update to consistently use RegisterPipeline everywhere, as the spec uses the term 'pipeline' to denote a processor+exporter pairing.

WDYT, @pyohannes?

@pyohannes (Contributor)

Please let's use RegisterPipeline consistently wherever appropriate.

Otherwise I'm fine with the current approach. I was a bit confused by the missing RegisterPipeline implementation, but I think my main concerns will be handled by a proper implementation of this method and aggregate processors.

@pyohannes (Contributor) left a comment

Thanks for getting this on the way.

@ThomsonTan (Contributor) left a comment

Update change log?

@@ -178,8 +178,12 @@ class TracezProcessor : public ::testing::Test
shared_data = std::shared_ptr<TracezSharedData>(new TracezSharedData());
processor = std::shared_ptr<TracezSpanProcessor>(new TracezSpanProcessor(shared_data));
auto resource = opentelemetry::sdk::resource::Resource::Create({});
// Note: we make a *different* processor for the tracercontext. THis is because
Contributor

Suggested change
// Note: we make a *different* processor for the tracercontext. THis is because
// Note: we make a *different* processor for the tracercontext. This is because

* Attaches a span processor to this tracer context.
*
* @param processor The new span processor for this tracer. This must not be
* a nullptr. Ownership is given to the `TracerContext`.
Contributor

Suggested change
* a nullptr. Ownership is given to the `TracerContext`.
* a nullptr. Ownership is given to the `TracerContext`.


@lalitb (Member) commented Apr 15, 2021

@lalitb Understood. So, to make progress, it'd be nice to know which of @pyohannes's comments are blocking and which to address now.

  • Active Processor terminology. I don't see a counter-proposal, and I haven't thought of another term. I'd vote to leave it similar to Java unless there's an alternative.
  • Keeping the SpanProcessor on the Span vs. relying on the "active span processor" in the tracer. I'm OK going either way here. My preference is what I've done in this PR, but I'm happy to change it.
  • I can update to consistently use RegisterPipeline everywhere, as the spec uses the term 'pipeline' to denote a processor+exporter pairing.

WDYT, @pyohannes?

Thanks for summarising, @jsuereth. I don't see these as blockers; we can merge the PR once the formatting issue is fixed, as that is blocking the merge :)

@lalitb lalitb merged commit 7b07013 into open-telemetry:main Apr 16, 2021
@jsuereth jsuereth deleted the wip-trace-context branch April 19, 2021 19:16