Trace context propagation using non-gRPC headers #6144
Comments
FYI, in Brave we "check twice" inbound, preferring grpc-trace-bin: https://github.com/openzipkin/brave/blob/master/instrumentation/grpc/src/main/java/brave/grpc/GrpcPropagation.java#L125 Outbound, we dual-propagate by default, since it isn't clear what the other side will use, for the reasons you mention. In the future someone could gain special knowledge via backtrace or such, but for now we dual-propagate.
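A rough sketch of the "check twice" inbound and dual-propagate outbound behaviour described above. This is not Brave's code: the `DualPropagation` class is hypothetical, headers are modelled as a plain string map, and the binary grpc-trace-bin value is treated as an opaque string for simplicity.

```java
import java.util.AbstractMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class DualPropagation {
    static final String GRPC_KEY = "grpc-trace-bin"; // header named in this thread
    static final String B3_KEY = "b3";               // B3 single-header name

    /** Outbound: write both formats, since the receiver's format is unknown. */
    static Map<String, String> inject(String grpcValue, String b3Value) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put(GRPC_KEY, grpcValue);
        headers.put(B3_KEY, b3Value);
        return headers;
    }

    /** Inbound: check grpc-trace-bin first, then fall back to b3; null if neither is set. */
    static Map.Entry<String, String> extract(Map<String, String> headers) {
        for (String key : new String[] {GRPC_KEY, B3_KEY}) {
            String value = headers.get(key);
            if (value != null) {
                return new AbstractMap.SimpleImmutableEntry<>(key, value);
            }
        }
        return null;
    }
}
```

The cost of this approach is the extra header on every outbound call, which is the trade-off discussed in the rest of the thread.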
cc: @zhangkun83 @ejona86 for thoughts / help triaging
As you already pointed out, it is possible to do what you need with our existing API. But I guess what you're saying here is that you want to be able to use a pluggable wire transport with the Census integration.
Dual propagation seems quite reasonable given the current state of things. Ideally there would just be one format, but b3 doesn't seem like a great one. It is simple, but I'm confident using 5 headers would bother some performance people and be noticeable in benchmarks.
Forking CensusTracingModule doesn't bother me too much. Short term, we could make some changes to make it easier to fork, maybe even including just two hard-coded booleans for whether to use the text and/or binary format.
We've been looking at removing the Census dependency from core for quite a while (#5076 is tracking that), but I'm not sure that helps. In this particular case, I wouldn't be wild about making the wire transport pluggable. Instead I'd prefer to just support both formats directly. That's less API, and both formats are supported by opencensus directly. I'm not sure exactly how we would want to expose this API though...
Thanks for the reply, @ejona86. Some thoughts:
Yeah, this is accurate. More specifically, Open Census - but we're splitting hairs on the naming here (my understanding is that Open Census was born out of gRPC).
I'm not sure I agree with you on having dual propagation. Here's where it's breaking down in our deployment currently: We have intermediary proxies (specifically Envoy, configured via Istio) that only know how to handle specific propagation formats (currently, this is limited to B3). In the case where a gRPC client sets both A preferable solution in our case would just be to allow operators to pick what their trace propagation format looks like.
To this extent, I think that supporting more than one format would be great. Maybe the API for this is even such that if you want to propagate both, you can do that via your own implementation of some abstract class, interface, etc. Therefore, to firm up my proposal some more: allow an operator to configure their gRPC clients and servers with whatever propagation format they'd like. By default, Open to thoughts on what the API should look like, especially if there's work being done to re-work the existing tracing infrastructure. As a footnote:
I don't think it necessarily has to use 5 headers, based on what I've read here. This could probably appease the performance nuts to some extent (though it's still a text encoding). Reading this led me to discover that there's a W3C candidate for "Trace Context" that specifies a common format to use. The API above should be flexible enough to allow for a "propagator" to be defined that could adhere to this spec too, if one so wished.
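For reference, the W3C Trace Context `traceparent` header packs everything into a single value of the form `version-traceid-parentid-flags`. Below is a minimal parsing sketch, assuming only version `00` is accepted; the `TraceParent` class is a hypothetical name and does not come from gRPC or OpenCensus.

```java
// Hypothetical parser for a version-00 traceparent value, for illustration only.
final class TraceParent {
    final String traceId;  // 32 lower-hex characters
    final String parentId; // 16 lower-hex characters
    final int flags;       // 8-bit flags field

    private TraceParent(String traceId, String parentId, int flags) {
        this.traceId = traceId;
        this.parentId = parentId;
        this.flags = flags;
    }

    /** Returns null for anything that is not a well-formed version-00 header. */
    static TraceParent parse(String header) {
        String[] parts = header.split("-");
        if (parts.length != 4 || !parts[0].equals("00")) return null;
        String traceId = parts[1];
        String parentId = parts[2];
        String flags = parts[3];
        // All-zero ids are invalid per the spec.
        if (!traceId.matches("[0-9a-f]{32}") || traceId.matches("0+")) return null;
        if (!parentId.matches("[0-9a-f]{16}") || parentId.matches("0+")) return null;
        if (!flags.matches("[0-9a-f]{2}")) return null;
        return new TraceParent(traceId, parentId, Integer.parseInt(flags, 16));
    }

    /** The least significant flag bit marks the trace as sampled. */
    boolean sampled() {
        return (flags & 0x01) != 0;
    }
}
```

A production parser would also need to handle future versions more leniently, which this sketch deliberately skips.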
> Dual propagation seems quite reasonable given the *current* state of things. Ideally there would just be one format, but b3 doesn't seem like a great one. It is simple, but I'm confident using 5 headers would bother some performance people and be noticeable in benchmarks.
There's been a single-header variant available for quite a while now: https://github.com/openzipkin/b3-propagation#single-header
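To make the single-header variant concrete: the `b3` value has the form `{TraceId}-{SpanId}-{SamplingState}-{ParentSpanId}`, where the last two fields are optional. The sketch below parses and re-formats that form; `B3Single` is a hypothetical class, and the sampling-state-only form (for example a bare `0`) is intentionally not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the B3 single-header value, for illustration only.
final class B3Single {
    // traceId (32 or 16 lower-hex), spanId (16 lower-hex),
    // optional sampling state (1, 0, or d), optional parentSpanId (16 lower-hex).
    private static final Pattern FORMAT = Pattern.compile(
        "([0-9a-f]{32}|[0-9a-f]{16})-([0-9a-f]{16})(?:-([01d])(?:-([0-9a-f]{16}))?)?");

    final String traceId;
    final String spanId;
    final String sampling;     // null when absent
    final String parentSpanId; // null when absent

    private B3Single(String traceId, String spanId, String sampling, String parentSpanId) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.sampling = sampling;
        this.parentSpanId = parentSpanId;
    }

    /** Returns null when the value does not match the trace-and-span form. */
    static B3Single parse(String value) {
        Matcher m = FORMAT.matcher(value);
        if (!m.matches()) return null;
        return new B3Single(m.group(1), m.group(2), m.group(3), m.group(4));
    }

    /** Re-encodes the fields into a single header value. */
    String format() {
        StringBuilder sb = new StringBuilder(traceId).append('-').append(spanId);
        if (sampling != null) sb.append('-').append(sampling);
        if (parentSpanId != null) sb.append('-').append(parentSpanId);
        return sb.toString();
    }
}
```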
The single-header version looks fair. I don't expect binary vs text would be a problem. Having three formats is worrying, but they are still all explicitly supported in the PropagationComponent API. I really want census to define as much of this as possible. Maybe we let someone pass a BinaryFormat or (at their choice) a TextFormat. Such an API would definitely need to be in its own artifact (not in grpc-api/grpc-core). (Looking at BinaryFormat more, it would also require someone to specify the Key; I'm not sure if that is too useful.)
For the record, we can't depend on io.opencensus.trace.propagation.TextFormat at the moment because it is unstable API. We'd need to work together to get that stabilized.
@ejona86 in terms of "standard" headers, I think you should look into using the W3C standard headers that are defined for this: https://www.w3.org/TR/trace-context
@nicktrav I suggest you use Envoy with the opencensus integration. If you look at the config for opencensus, we do have something that will help you: we allow setting the incoming and outgoing formats, so you can set b3 as incoming and grpc-bin as outgoing, see:
Thanks @bogdandrutu - this would indeed be useful. However, we're using Istio for the configuration of these Envoys, so that would need to be exposed via Istio. Furthermore, for it to be useful for us, we'd need some way of being able to granularly define the mappings per service based on what each service speaks. It's not clear if that's possible. It would be simpler for us to just adopt a single, standard tracing format. If we adopted the
@nicktrav in Brave, we've run across the need to handle things more granularly. For example, instead of working on the "Metadata" object, you can work on a "Request" object. This is tricky in Google's gRPC library because there is no effective request type: the data you need to decide what to propagate is spread out over many types in different scopes. We have a related issue to try to cobble them together so that you can, for example, change the propagation policy on a per-request basis without affecting user code: openzipkin/brave#993
Currently, gRPC uses the grpc-trace-bin header for context propagation across process boundaries. This works perfectly for an environment in which all services are using gRPC for service-to-service communication. In the case of a more "polyglot transport protocol" environment, preserving the trace context becomes a little more involved.
To give some more specific context / motivation for the problem at hand, we have inbound HTTP requests that are subsequently proxied by an "API gateway"-like service over gRPC to respective backends. All other downstream traffic is gRPC. We're running this in a "mesh" setup, with Envoy running as a proxy alongside each container.
Envoy is configured (via Istio) to look for B3 headers, and will correctly do the context propagation for HTTP. It doesn't know about grpc-trace-bin and thus can't participate in these traces. Instead it will emit its own B3-flavored trace for the particular hop, and we end up with incomplete traces: one set for the client / server gRPC spans, and another for just the proxy hops.
I'd like to propose / seek feedback on the idea of making the wire transport for tracing configurable / pluggable. I'm coming at this from the perspective that gRPC's trace propagation is tightly coupled to a gRPC-specific format and it doesn't really make sense (at least to me) to teach an HTTP server / proxy how to handle gRPC's internal format.
Maybe I'm missing some background. I did a quick look in the issues for this repo but didn't see anything obvious. Looks like something similar has been brought up in the Go community, via census-instrumentation/opencensus-go#666, and census-instrumentation/opencensus-specs#136 (closed out).
If this is better for a list, happy to post there too, jlmk where to ask.
cc: @adriancole
More detail / proof of concept for a Java service:
Basic idea of the change - Open Census has the concept of Binary and TextFormat "setters" and "getters" to do the propagation. We replaced the grpc-trace-bin propagation header with the full set of B3 headers (no reason this couldn't be the single B3 header).
I've proven this out internally with some light forking of the code to make a new tracing module that does the B3 propagation. The following code is an example, isn't production ready, etc. etc.
We're not actually forking gRPC for this, rather we're making a new tracing module that has the above changes, which we then install into client / server pipelines:
We then have to disable the out of the box-gRPC-tracing so that we're only using B3 for propagation:
With these code changes, we get full end-to-end tracing with all clients, servers and sidecar Envoys adding their spans.
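As a rough illustration of what a pluggable propagation format could look like, here is a sketch written against plain maps rather than gRPC or OpenCensus types. `TracePropagator`, `SpanIds`, and `B3MultiPropagator` are hypothetical names invented for this example and are not the tracing module described above; only the B3 header names (lower-cased here, as gRPC metadata keys are lower-case) come from the B3 spec.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical value type carrying the ids to propagate.
final class SpanIds {
    final String traceId;
    final String spanId;
    final boolean sampled;

    SpanIds(String traceId, String spanId, boolean sampled) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.sampled = sampled;
    }
}

// Hypothetical extension point: an operator would pick one implementation.
interface TracePropagator {
    void inject(SpanIds ids, Map<String, String> carrier);

    /** Returns null when the carrier holds no usable trace context. */
    SpanIds extract(Map<String, String> carrier);
}

// B3 multi-header flavour; parent span id and debug flag are omitted for brevity.
final class B3MultiPropagator implements TracePropagator {
    static final String TRACE_ID = "x-b3-traceid";
    static final String SPAN_ID = "x-b3-spanid";
    static final String SAMPLED = "x-b3-sampled";

    @Override
    public void inject(SpanIds ids, Map<String, String> carrier) {
        carrier.put(TRACE_ID, ids.traceId);
        carrier.put(SPAN_ID, ids.spanId);
        carrier.put(SAMPLED, ids.sampled ? "1" : "0");
    }

    @Override
    public SpanIds extract(Map<String, String> carrier) {
        String traceId = carrier.get(TRACE_ID);
        String spanId = carrier.get(SPAN_ID);
        if (traceId == null || spanId == null) return null;
        return new SpanIds(traceId, spanId, "1".equals(carrier.get(SAMPLED)));
    }
}
```

With an interface along these lines, dual propagation becomes just another implementation that delegates to two propagators, so the choice stays with the operator.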