Rationale for logs/annotations instead of spans? #8126

ecourreges-orange · 2020-06-23T14:12:56Z

Can you explain the rationale for using logs/annotations (krs... 10 total) instead of 5 additionnal spans?
The performance cannot be considered top priority, as you just need to lower your sampling rate.
Is it visualized similarly in zipkin, i.e. different spans vs multiple annotations?
What about searching for annotations?
We are using Jaeger as backend, and feel it will be hard to investigate any timing issues in kong plugins, I don't think I can search for the "time difference between annotations >= XX milliseconds" anywhere even directly in ES storage of Jaeger.
Maybe you will answer that this is more of a metric type of search.

This is what a kong-zipkin-plugin trace looks like in Jaeger-ui, you have to expand to see them, they do not appear in the timeline, so we were a bit underwhelmed I must say:

Is it worth a PR with a configuration choice between multiple spans vs annotations?

Thanks for your answers.
Regards.

hbagdi · 2020-06-23T19:41:56Z

ping @kikito @seh

seh · 2020-06-23T19:57:00Z

I've only used the Zipkin output through the OpenTelemetry Collector, sending it to Honeycomb, we get several spans representing Kong's activities, united under a single trace. I'm not sure if those log entries fall out due to not having any representation in OTel.

@ecourreges-orange, can you share another image with those spans collapsed, so that we can see the overall shape of the trace?

vito-laurenza-zocdoc · 2020-06-23T22:19:34Z

I'm interested in understanding this rationale as well.

ecourreges-orange · 2020-06-25T06:52:12Z

@ecourreges-orange, can you share another image with those spans collapsed, so that we can see the overall shape of the trace?

kikito · 2020-06-25T16:29:47Z

Spans used to be the way the different phases were stored, but they were changed to annotations as part of a wider sweep of changes in the plugin, requested with the Zipkin devs themselves in Kong/kong-plugin-zipkin#49, with more details in openzipkin/zipkin#2834

The exact place where they recommend the change is:

openzipkin/zipkin#2834 (comment)

I assume this was a performance concern, but for me it didn't make any difference so I just followed their advice.

The performance cannot be considered top priority, as you just need to lower your sampling rate.

I must point out that that solution might be appropriate for you but not for others.

Is it visualized similarly in zipkin, i.e. different spans vs multiple annotations?

This is how it looks in Zipkin:

vito-laurenza-zocdoc · 2020-06-25T16:36:18Z

This is what they used to look like in Honeycomb via the OpenTelemetry collector:

This is what they look now:

seh · 2020-06-25T16:46:17Z

I guess it's been a while since I tried a more recent version of the plugin, because my view in Honeycomb looks like your first example, @vito-laurenza-zocdoc.

That the span events in the second example don't fall within the parent span's duration bound is surprising, suggesting that these events don't make sense as children.

seh · 2020-06-25T17:01:58Z

Reading openzipkin/zipkin/pull/2834 today, I think we got this one wrong. They were saying that since of the span durations were so short, we should change them to annotations, because there's a minimum span duration. That's a rather dogmatic prescription, misinterpreting the data that showed these activities completing (in that case) too quickly to measure accurately.

If that requirement for the minimum duration is true, I'd prefer to clamp the minimum durations (report anything shorter than the minimum as the minimum, perhaps with an annotation or something indicating that the duration was clamped) and retain these as spans. They make more sense that way, being activities with a necessary sequence and duration. They are not mere events that occur instantaneously.

vito-laurenza-zocdoc · 2020-06-25T17:52:05Z

I guess it's been a while since I tried a more recent version of the plugin, because my view in Honeycomb looks like your first example, @vito-laurenza-zocdoc.

That the span events in the second example don't fall within the parent span's duration bound is surprising, suggesting that these events don't make sense as children.

Even in the first screenshot, you'll see that kong is reporting an access phase latency of ~1ms, but the span for availability-service doesn't start until ~25m relative to the start of the trace. This is very puzzling to me and I'm still not sure if we're actually adding 20-something ms of latency in the access phase, or if there's something else going on.

vito-laurenza-zocdoc · 2020-06-25T17:53:34Z

Reading openzipkin/zipkin/pull/2834 today, I think we got this one wrong. They were saying that since of the span durations were so short, we should change them to annotations, because there's a minimum span duration. That's a rather dogmatic prescription, misinterpreting the data that showed these activities completing (in that case) too quickly to measure accurately.

If that requirement for the minimum duration is true, I'd prefer to clamp the minimum durations (report anything shorter than the minimum as the minimum, perhaps with an annotation or something indicating that the duration was clamped) and retain these as spans. They make more sense that way, being activities with a necessary sequence and duration. They are not mere events that occur instantaneously.

For what it's worth, I agree that sending the phase timings as spans makes more sense.

seh · 2020-07-07T16:41:28Z

@kikito, are there any plans underway to fix this, returning us to the multiple spans that we had before?

vito-laurenza-zocdoc · 2020-07-28T18:09:38Z

@seh @kikito I re-read openzipkin/zipkin/pull/2834 several times and it seems to me that the advice around moving these spans with missing durations to annotations was misguided. The missing durations seem likely to have been caused by the prior plugin versions getting the timestamp precision wrong (milliseconds vs microseconds). With microsecond precision, I doubt we'd even see a span with a duration less than 1 microsecond.

kikito · 2020-08-05T10:59:24Z

The suggestions made here make sense. This change would make the plugin non-backwards compatible (nothing would "break", but the stats provided would be very different). So I would need to release a major version. I would like to get feedback from @adriancole @jcchavezs before acting on this.

codefromthecrypt · 2020-08-05T11:50:39Z

has anyone asked jaeger to do relative timings to the span? I would be surprised if they would think it is better to display as it is, though possibly there's a perspective I'm missing. cc'ing someone familiar w/jaeger @pavolloffay

codefromthecrypt · 2020-08-05T11:51:26Z

one thing I've noticed in tech is that people will move mountains to avoid fixing javascript in a UI :P

codefromthecrypt · 2020-08-05T12:02:03Z

on the specific parts here, there seem to have been changes after this one which makes the data more esoteric Kong/kong-plugin-zipkin#52

ex. the three character annotations I'm not sure anyone would understand what they are, and it would be unexpected for a UI to have a schema to know what they are. I would suggest to sortof split up issues.

The first requests to change the data were around semantics and attempts to correct those primarily. There are different ways to model but I would agree that regardless of that UIs probably do need to fix how they present "relative time", 3 character annotations are unusable as would be spans with 3 character names.

First step might be seeing about undoing wherever that 3 char part happened. Regardless of whether 3-10 spans are re-added per request, I wouldn't roll back completely the other parts as there were definitely semantic problems, ex endpoints being incorrect and such.

Focusing on the model we want is probably better than reverting as hours were spent on problems that I'm not sure anyone would want to spend the time to do-over again.

codefromthecrypt · 2020-08-05T12:06:37Z

one other thing I would suggest is sometimes translators are used.. I think this problem will be easiest to discuss with actual zipkin json apples/apples as maybe applied to the other 2 UIs. Ex if some proxy is used, the parts besides the 3 letter thing might become obvious. It is weird that some annotations are predating the corresponding spans. JSON is how we usually get to bottom of things very quickly.

jcchavezs · 2020-08-05T12:40:30Z

I am curious how other UIs impact a zipkin plugin. I do agree the three letters annotation is very hard to digest. The change was introduced in https://github.com/Kong/kong-plugin-zipkin/pull/60/files, not sure what was the motivation (probably following zipkin convention) so @kikito can explain more. I would bring back the previous names as they are more easy to understand,

Regarding using different spans or annotations, the operations performed by kong are sequencial and one depend on the other. Are spans a better modeling for it? I don't think so, IMHO it adds more noise and honestly it is not too interesting see them independently but more like a collections of milestones (but that is personal opinion). Do we need to change the model because rendering in vendor XYZ is bad? I don't think so

seh · 2020-08-05T12:43:05Z

Are spans a better modeling for it? I don't think so, IMHO it adds more noise and honestly it is not too interesting see them independently but more like a collections of milestones (but that is personal opinion).

I think spans are a better representation for these efforts. Thinking of how these wind up in Honeycomb after passing through the OpenTelemetry Collector, they are more easily queried, visualized, and understood when modeled as spans.

jcchavezs · 2020-08-05T12:48:18Z

As I pointed out above, do we really need to change this plugin because of how honeycomb renders it? ons. 5. aug. 2020, 14:43 skrev Steven E. Harris <notifications@github.com>:

…

Are spans a better modeling for it? I don't think so, IMHO it adds more noise and honestly it is not too interesting see them independently but more like a collections of milestones (but that is personal opinion). I think spans are a better representation for these efforts. Thinking of how these wind up in Honeycomb after passing through the OpenTelemetry Collector, they are more easily queried, visualized, and understood when modeled as spans. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <https://github.com/Kong/kong-plugin-zipkin/issues/89#issuecomment-669170859>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAXOYARQZL27AA6BMLMR3MLR7FHWPANCNFSM4OFWWBLA> .

seh · 2020-08-05T12:56:26Z

As I pointed out above, do we really need to change this plugin because of how honeycomb renders it?

We don't need to do anything, but I think the plugin worked better before, so why not recover that better behavior that users had been enjoying?

I confess that I don't understand the position that represents these Kong efforts as annotations. These efforts are steps that have durations, and we care how long each step takes, and how many of the steps we attempt and accomplish. If we claim that such efforts should be annotations, then what would be left deserving a span?

ecourreges-orange · 2020-08-05T13:26:24Z

Apart from the "We don't need to care about other GUIs", explain us how you can investigate/search timing issues in Kong plugins with this annotation model?
I understand this was started as a Zipkin plugin for Kong, but you all know that OpenTelemetry is coming, and it would be sad to have to write a new kong-plugin-opentelemetry where just a few tweaks to this plugin are enough to reach a consensus for zipkin AND non zipkin users.

kikito · 2020-08-07T16:38:57Z

I don't want to derail this thread too much so let me address the "three-letter-label" question first.

I am curious how other UIs impact a zipkin plugin. I do agree the three letters annotation is very hard to digest. The change was introduced in Kong/kong-plugin-zipkin#60 (files), not sure what was the motivation (probably following zipkin convention) so @kikito can explain more. I would bring back the previous names as they are more easy to understand,

I confirm that the intention renaming annotations to kas etc. from kong.access.start was aligning the plugin with Zipkin's seemingly preferred style (where cs is used instead of something longer like client.send). In addition to this, the zipkin-api docs, which state:

Usually a short tag indicating an event, like “error”

The impression I got from all that was that the annotation bytesize should be kept constrained. Since I was going to need several annotations per trace, I went with 3-letters.

I see now that this was a mistake. So in a future version of the plugin we will have longer names. Thanks for clarifying this.

kikito · 2020-08-07T16:55:04Z

Going back to the main theme of this issue, i.e. "using spans vs using annotations for kong's individual phases", I am getting the impression that using spans would have several advantages:

Other tracing software handles spans better than annotations.

they are more easily queried, visualized, and understood when modeled as spans.

While supporting other tracing software is not the highest priority for this plugin, if we can, I think we should (so long as it doesn't impede our capacity for properly supporting Zipkin).

It seems to me that even Zipkin itself can handle them better (it can use spanName as a filtering criteria, but not annotationName).

So I am inclined to say that yes, we should go back to using spans for kong phases.

kikito · 2020-08-07T16:55:10Z

@seh:

That the span events in the second example don't fall within the parent span's duration bound is surprising, suggesting that these events don't make sense as children.

In older versions of the plugin, timestamps were not correctly transformed into microseconds (#71) so that might explain some discordances. Kong 2.1 (recently released) should not show those.

kikito · 2020-08-07T19:39:46Z

Just another heads up: My holidays start now. I told my team to keep an eye on this particular issue, feel free to keep the discussion going.

seh · 2020-10-01T17:49:20Z

I read the comments accumulated here so far as consensus among the interested parties that we should be using spans to represent Kong's efforts. If that's so, what do we need to do to restore that previous behavior?

kikito · 2020-10-21T17:42:10Z

@seh Well if someone feels adventurous enough to try their hand at this, PRs are welcome. Otherwise this will happen when it gets prioritized. We all have been working on the 2.2.0 features recently and plugin features took a backward seat for the moment unfortunately.

evandam · 2021-04-23T17:34:20Z

Hey folks, I'm a little late to the party but was wondering if this has been looked at recently. Like others have already mentioned, it's surprising to me how difficult it is to do things like query by the downstream services since everything just rolls up to the service being hard-coded to "Kong." I'm also hoping to send these traces to a separate tool (Instana, but also testing with Jaeger), but in their current format it doesn't seem feasible.

cc @seh @kikito - appreciate the work on this!

kikito · 2021-05-19T17:33:21Z

@evandam I think what you are asking is related but is not exactly this issue, but #8125.

gszr transferred this issue from Kong/kong-plugin-zipkin Nov 30, 2021

gszr added the plugins/zipkin label Nov 30, 2021

dcarley mentioned this issue Jan 6, 2022

RFC: feat(zipkin) Use tags instead of phases for annotations #8264

Closed

mayocream mentioned this issue Dec 6, 2022

feat(zipkin): add support for labeling phase duration as tags #9891

Merged

3 tasks

flrgh closed this as completed in #9891 Dec 9, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rationale for logs/annotations instead of spans? #8126

Rationale for logs/annotations instead of spans? #8126

ecourreges-orange commented Jun 23, 2020

hbagdi commented Jun 23, 2020

seh commented Jun 23, 2020

vito-laurenza-zocdoc commented Jun 23, 2020

ecourreges-orange commented Jun 25, 2020

kikito commented Jun 25, 2020

vito-laurenza-zocdoc commented Jun 25, 2020

seh commented Jun 25, 2020

seh commented Jun 25, 2020 •

edited

Loading

vito-laurenza-zocdoc commented Jun 25, 2020

vito-laurenza-zocdoc commented Jun 25, 2020

seh commented Jul 7, 2020

vito-laurenza-zocdoc commented Jul 28, 2020 •

edited

Loading

kikito commented Aug 5, 2020 •

edited

Loading

codefromthecrypt commented Aug 5, 2020

codefromthecrypt commented Aug 5, 2020

codefromthecrypt commented Aug 5, 2020

codefromthecrypt commented Aug 5, 2020

jcchavezs commented Aug 5, 2020

seh commented Aug 5, 2020

jcchavezs commented Aug 5, 2020 via email

seh commented Aug 5, 2020

ecourreges-orange commented Aug 5, 2020

kikito commented Aug 7, 2020

kikito commented Aug 7, 2020 •

edited

Loading

kikito commented Aug 7, 2020

kikito commented Aug 7, 2020

seh commented Oct 1, 2020

kikito commented Oct 21, 2020

evandam commented Apr 23, 2021

kikito commented May 19, 2021

Rationale for logs/annotations instead of spans? #8126

Rationale for logs/annotations instead of spans? #8126

Comments

ecourreges-orange commented Jun 23, 2020

hbagdi commented Jun 23, 2020

seh commented Jun 23, 2020

vito-laurenza-zocdoc commented Jun 23, 2020

ecourreges-orange commented Jun 25, 2020

kikito commented Jun 25, 2020

vito-laurenza-zocdoc commented Jun 25, 2020

seh commented Jun 25, 2020

seh commented Jun 25, 2020 • edited Loading

vito-laurenza-zocdoc commented Jun 25, 2020

vito-laurenza-zocdoc commented Jun 25, 2020

seh commented Jul 7, 2020

vito-laurenza-zocdoc commented Jul 28, 2020 • edited Loading

kikito commented Aug 5, 2020 • edited Loading

codefromthecrypt commented Aug 5, 2020

codefromthecrypt commented Aug 5, 2020

codefromthecrypt commented Aug 5, 2020

codefromthecrypt commented Aug 5, 2020

jcchavezs commented Aug 5, 2020

seh commented Aug 5, 2020

jcchavezs commented Aug 5, 2020 via email

seh commented Aug 5, 2020

ecourreges-orange commented Aug 5, 2020

kikito commented Aug 7, 2020

kikito commented Aug 7, 2020 • edited Loading

kikito commented Aug 7, 2020

kikito commented Aug 7, 2020

seh commented Oct 1, 2020

kikito commented Oct 21, 2020

evandam commented Apr 23, 2021

kikito commented May 19, 2021

seh commented Jun 25, 2020 •

edited

Loading

vito-laurenza-zocdoc commented Jul 28, 2020 •

edited

Loading

kikito commented Aug 5, 2020 •

edited

Loading

kikito commented Aug 7, 2020 •

edited

Loading