-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Trace Payload Collection #234
base: main
Are you sure you want to change the base?
Conversation
(for example, `http.request.body`): | ||
|
||
- **Data attribute**: `<attribute>.content`. Holds the decoded content of the payload data. Alternatives: `.data`, `.payload`. | ||
- **Size attribute**: `<attribute>.size`. Holds the number of bytes of the encoded payload data. Alternatives: `.length`, `.bytes`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we use protocol buffers, the size/length of content is implicitly recorded.
I believe even for JSON this is effectively true (you'll have a length/size available at runtime) so no need to encode it separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's important to keep the size as a separate attribute, since we may have an actual different content size (for example if we truncated part of it).
- **Size attribute**: `<attribute>.size`. Holds the number of bytes of the encoded payload data. Alternatives: `.length`, `.bytes`. | ||
- **Encoding attribute**: `<attribute>.encoding`. Holds the original attribute encoding type. | ||
Predefined values should be declared (though users may decide using custom values as well). | ||
For example - `json`, `protobuf`, `avro`, `utf-8`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you going to allow "versioning" of these protocols?
What's the path for adding new (custom) encodings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this is only the encoding originally initially when sending the payload. The content structure inside the span should always be the same, regardless of the protocol. This is actually another reason why we should translate the payload into some common structure as proposed here.
#### `Payload` class | ||
|
||
We propose adding a class that will provide the related functionality for handling payload data. | ||
Both traces and logs could use instances of this class, which will assist sharing related functionality. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this was added in real tracing products, can you link to the APIs/SDKs they offered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit different, since they just offered this instrumentation OOTB, so the API for the payload was hidden from the developer. See for example Epsagon's instrumentation for FastAPI (The exact structure of the data is seen here)
) | ||
``` | ||
|
||
### Payload attributes in Logs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to resolving this discussion as a priority.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But should it affect the attribute names for spans as proposed here?
|
||
Representing Null values in payload contents is required, as it part of JSON and many other formats | ||
like Protobuf and Avro. Therefore we propose supporting it by spans and logs attributes APIs (while | ||
in OTel proto schema it is already supported). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this supported in OTel proto schema? Do you mean the "AnyValue" being allowed to be "emtpy"?
I'd like more details about where/how you need Null
support here. Are you trying to record payload attributes even if the payload is null? Could you just write an empty byte array for the Proto/Avro case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Hmm, this can work. Updated this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point here is that we need to support JSON payloads like '{"x": null}'
and also null
, and similar examples in other formats.
If the payload is encoded with the "AnyValue" type, it can be achieved by using empty objects. Though it's still needed to be supported by the APIs.
In case we add a dedicated proto object, like the Value
suggested in the alternative, we should also make sure it supports Null values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd need a lot more justification on a new AnyValue
type in OTLP before I'd be convinced. I think we already support the JSON payloads you mention.
Capturing payloads using appropriate APIs could assist in specifying different limits in the future (either | ||
more are less strict than of 'regular' attributes), and mechanisms for shortening large payloads as well. | ||
|
||
### Handling sensitive data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's worth calling out that payload collection likely requires some kind of flag to signal that sensitive data collection is ok.
Specifically - I think this proposal needs some kind of notion on a per-trace basis that denotes whether sensitive collection is "allowed" for that trace.
E.g. you can imagine a system where I issue an RPC with a special flag denoting that collection of payloads (and other sensitive data) is ok, where this would be turned on.
You can also imagine production systems that require user-consent before this data could be collected in a o11y system. We should absolutely have a mechanism to check (dynamically, per-request / per-trace) whether this data can be collected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this might be out of scope for this as it should be resolved on the API level of the different SDKs. Here I suggest something much simpler - if we ever to auto-instrument collection of payloads, the default should always be off
where users who decide to turn it on should know it will be turned on for everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also in favor of having this kind of flag regarding sensitive data collection. Though I don't think it should be a requirement for the changes proposed in this OTEP.
At this point, users already collect sensitive data using the existing APIs and attributes.
Would love to see that happen at Step CI 😉 |
Should the payload be a span event instead of a span attribute? Most backends have higher size limits for span events. |
Lumigo does this with span attributes, and the experience was that a truncation to 1024, 2048 or 4096 or power thereof was often good enough. (The complex bit was performing truncation that would keep validity of the data format, e.g., JSON.) Users have means via SDK configuration to change the attribute truncation threshold. I think setting this data as span events would be very unintuitive in terms of data model. I don't think we should use an unintuitive data model to work around limitations of backends. Moreover, for an important use-case like this, I can imagine backends makings exceptions for the specific keys we'd use. |
might be worthwhile to ping folks who did some things in this space, even if moved on since.
|
Continuing the work started by @ronyis at #219.
We're proposing semantic attributes for reporting payloads (of HTTP requests, DB queries, etc.) as part of OpenTelemetry. This should be optional for users who wish to use it and can provide consumers of OpenTelemetry with a standard option for getting payloads out of trace instrumentation.
Previous instrumentation libraries (pre-otel) have successfully implemented it so I think this is something that will be welcomed by the community.
Would love to get your feedback and work on getting this accepted.