-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(plugin): OTEL exporting with translate #10332
Conversation
b2ac59b
to
2e85bd2
Compare
I've updated the PR with a couple fixes. I think it meets the correct requirements now, but there are some warts:
|
Answering question 2: the copied span won't survive the request so it will be GC-ed, and its reference to the members will also be gone. So the only reference to those members should be from the original span, which will be returned to the pool. If the PDK decides to also reuse the members, it will still be fine as the table returned to the pool has reference to the members. PS: table pool is just referencing returned tables to prevent them from GC-ed and has no special mechanism, thus we cannot "GC a table that is supposed to be returned to the pool". The worst situation is that we returned tables that are used somewhere else unaware of them being recycled which cannot happen here. |
To answer question 1: I agree that we need a test for the truncating case. But the code you're referring to (handling to OT header) does not mean truncating doesn't work. We're translating traces, and a hypothetical trace format may have a valid trace ID larger than that. |
However there's not a type of ID that is long enough to go through the truncating path, so we skip it for now. |
We represent trace spans with a universal format for all kinds of mainstream tracers and translate them to different formats when propagating headers and exporting traces. However, OTEL misses the handling of trace ID length. Fix KAG-652
93d3168
to
4aba447
Compare
Co-authored-by: Chrono <chrono_cpp@me.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do more code clean for it, but I think it is OK now.
kong/plugins/opentelemetry/otlp.lua
Outdated
@@ -72,9 +77,39 @@ local function transform_events(events) | |||
return pb_events | |||
end | |||
|
|||
-- this function is to translate the span to otlp span, but of universal span format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what "but of universal span format" means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original design to support arbitrary types of tracing plugins has not made it clear:
- We have an internal representation for span, and it should preserve what it received from downstream requests at best effort;
- when exporting the data to collectors, it needs to transform the span to meet the requirement with its field (DD, OTEL, etc), and then translate it into a format used for the reporting API (gRPC, JSONRPC, etc). This function does the first step but not the second one.
span = translated | ||
|
||
return span |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick, not needed if we don't do a shallow copy.
span = translated | |
return span | |
return translated |
new_id = NULL:rep(TRACE_ID_LEN - len) .. trace_id | ||
end | ||
|
||
local translated = shallow_copy(span) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is a shallow_copy
needed?
I would have expected this function to do something like:
local function adjust_trace_id_for_oltp(trace_id)
And the usage would be, when creating the root span:
root_span.trace_id = adjust_trace_id_for_oltp(trace_id)
And inside transform_span
:
local pb_span = {
trace_id = adjust_trace_id_for_oltp(span.trace_id),
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I don't recommend using abstract names (like transform_span
or translate_span
) when the function is only doing one modification on one field. The tradeoff ("making the code easier to modify in the future if we find that we need to do more modifications in the future") is not worth the upfront cost ("making the code more difficult to read now"). Future us can always group several modifications into one if the need arises.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is a
shallow_copy
needed?
The span passed in may also be used by other plugins. We need to preserve as much original information as possible, and only do the transformation for exporting/propagating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree naming is a problem, but transform_span
is used as some kind of interface for tracing plugins (we can see a function with the same name from datadog-tracing).
I'm adding comments and renaming the internal function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And the function is in the file "otlp.lua" so I think we do not need to suffix it with "_otlp".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And the usage would be, when creating the root span:
We cannot just modify the root_span
given the reason: https://github.com/Kong/kong/pull/10332/files#r1115411177
Co-authored-by: Chrono <chrono_cpp@me.com> Co-authored-by: Michael Martin <3277009+flrgh@users.noreply.github.com> (cherry picked from commit 6fa55b3)
Co-authored-by: Chrono <chrono_cpp@me.com> Co-authored-by: Michael Martin <3277009+flrgh@users.noreply.github.com> (cherry picked from commit 6fa55b3) Co-authored-by: Xumin <100666470+StarlightIbuki@users.noreply.github.com> Co-authored-by: Enrique García Cota <kikito@gmail.com>
Summary
We represent trace spans with a universal format for all kinds of mainstream tracers and translate them to different formats when propagating headers and exporting traces. However, OTEL misses the handling of trace ID length.
Checklist
N/A There is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - PUT DOCS PR HERE
Issue reference
Fix KAG-652