Propose using a different schema to represent Events in a span #37028

Open
awangc opened this issue Jan 6, 2025 · 5 comments
Labels
enhancement New feature or request exporter/elasticsearch needs triage New item requiring triage

Comments


awangc commented Jan 6, 2025

Component(s)

exporter/elasticsearch

Is your feature request related to a problem? Please describe.

When storing span events in Elasticsearch with the default mapping mode, the event name becomes a key under which the event's fields are stored. For example, if we have events named "my-event-1" and "my-event-2", then in Elasticsearch we'll have Events.my-event-1.time, Events.my-event-2.time, etc. This does not seem to follow the OpenTelemetry Collector data format for a span's events, which are modeled as an array of Span_Event, where each Span_Event contains fields like time, name and an array of attributes.

The issue I see with this approach is that if name is given arbitrary values (e.g. random UUIDs), then we could see an arbitrary increase in the number of keys, leading to mapping field explosion in Elasticsearch.
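To make the concern concrete, here is a minimal sketch of how the number of distinct field paths grows with the number of distinct event names when events are keyed by name. The helper names `flatten` and `keyed_by_name` are made up for illustration, and the document shape is a simplification of the Events.<name>.time layout described above, not the exporter's actual encoder.

```python
def flatten(doc, prefix=""):
    """Return the set of dotted leaf field paths in a nested dict."""
    paths = set()
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            paths |= flatten(value, path)
        else:
            paths.add(path)
    return paths


def keyed_by_name(events):
    """Hypothetical simplification: each event name becomes an object key."""
    return {"Events": {e["name"]: {"time": e["time"]} for e in events}}


# 100 events with distinct names, standing in for UUID-like event names.
events = [{"name": f"event-{i}", "time": "2025-01-06T00:00:00Z"} for i in range(100)]
field_paths = flatten(keyed_by_name(events))

# One mapped field path per distinct event name.
print(len(field_paths))
```

Every new event name adds at least one new field path, so with unbounded names the mapping keeps growing until it runs into the index's field limit (`index.mapping.total_fields.limit`).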

Describe the solution you'd like

Store span events as an array in Elasticsearch, in which each element is an object with the fields time, name and an array of attributes (with the dropped attributes count as another possible field, like in the Span_Event class).

Admittedly, this format may require nested objects, which may bring their own performance issues, but it more closely resembles the data layout of OpenTelemetry pdata.
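A rough sketch of what the proposed layout could look like, together with a matching Elasticsearch mapping that uses the `nested` field type. The field names (`Name`, `Time`, `Attributes`, `DroppedAttributesCount`) and the helper `events_as_array` are assumptions for illustration. Attributes are represented here as a key/value object mapped as `flattened`, which is one possible choice and not part of the proposal itself.

```python
import json


def events_as_array(events):
    """Hypothetical proposed shape: one array element per span event."""
    return {
        "Events": [
            {
                "Name": e["name"],
                "Time": e["time"],
                "Attributes": e.get("attributes", {}),
                "DroppedAttributesCount": e.get("dropped_attributes_count", 0),
            }
            for e in events
        ]
    }


# Assumed mapping: `nested` keeps each event's fields associated with each
# other; `flattened` keeps arbitrary attribute keys out of the mapping.
NESTED_MAPPING = {
    "properties": {
        "Events": {
            "type": "nested",
            "properties": {
                "Name": {"type": "keyword"},
                "Time": {"type": "date"},
                "DroppedAttributesCount": {"type": "long"},
                "Attributes": {"type": "flattened"},
            },
        }
    }
}

doc = events_as_array(
    [
        {"name": "my-event-1", "time": "2025-01-06T00:00:00Z"},
        {"name": "my-event-2", "time": "2025-01-06T00:00:01Z", "attributes": {"k": "v"}},
    ]
)
print(json.dumps(doc, indent=2))
```

With this shape the event name is a value rather than a key, so the set of mapped fields stays the same no matter how many distinct names appear.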

Describe alternatives you've considered

No response

Additional context

The schema proposed above would follow the same format as spans: we have Span.Name and Span.Attributes, so we'd have Event.Name and Event.Attributes. It also more closely represents the Event as defined in OpenTelemetry.

@awangc awangc added enhancement New feature or request needs triage New item requiring triage labels Jan 6, 2025
Contributor

github-actions bot commented Jan 6, 2025

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

Member

gregkalapos commented Jan 7, 2025

Hey 👏

With the default mapping mode (which is the mode none), this is indeed the case, but for OTel traces, I'm not sure how useful that mode is at all. If you look at how data is stored, most things won't follow the shape of the data suggested by OTel - e.g. resource attributes are also stored under Resource.* directly. Kibana is not showing those traces in the APM UI either, and the service isn't visible anywhere under Observability - and that's because of the none mapping mode.

So for traces, the otel mapping mode would be the way to go.

For that mapping mode there was already a lot of thinking on how data is stored.

When it comes to how OTel data is stored in Elasticsearch, with the otel mapping mode, we have 3 main data streams where data ends up - 1) traces 2) logs, and 3) metrics.

All I'm saying here is meant for the otel mapping mode.

The idea is that events are modelled as log records, therefore they'll end up in the logs data stream - that's very natural for log events, may not be for span events, but I think so far, the idea is that all events will end up in the same data stream. You can of course connect back those span events to the original span via the span id.

This is aligned with what's stated here in the docs.

It's interesting to see that in OTLP, there is a specific type for span events which is totally different from log events.

In any case - with the otel mapping mode, there should not be a cardinality problem, because the event name is a top level field and other fields - e.g. attributes are not embedded under the event name in that case.
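As a sketch of the idea described above, the snippet below turns a span's events into separate log-style documents that carry the event name as a top-level value and keep the span id so they can be joined back to the span. The function `span_events_to_log_docs` and the field names (`event_name`, `span_id`, `trace_id`, `attributes`) are illustrative assumptions and are not taken from the exporter's otel mapping mode implementation.

```python
def span_events_to_log_docs(span):
    """Hypothetical conversion: one log-style document per span event."""
    docs = []
    for event in span["events"]:
        docs.append(
            {
                "@timestamp": event["time"],
                "trace_id": span["trace_id"],
                "span_id": span["span_id"],  # link back to the originating span
                "event_name": event["name"],  # a value, not a key
                "attributes": event.get("attributes", {}),
            }
        )
    return docs


span = {
    "trace_id": "t-1",
    "span_id": "s-1",
    "events": [
        {"name": "my-event-1", "time": "2025-01-07T00:00:00Z"},
        {"name": "3f2b7c1e-random-uuid", "time": "2025-01-07T00:00:01Z"},
    ],
}
log_docs = span_events_to_log_docs(span)

# Top-level field names are identical for every document, whatever the name.
top_level_fields = {frozenset(d) for d in log_docs}
print(len(top_level_fields))
```

Because the name only ever appears as a field value, arbitrary event names raise the cardinality of that one field but do not add new fields to the mapping.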

So, I'd say the otel mapping mode is the way to go.

Having said that, a few thoughts on the default mapping mode (none):

The issue I see with this approach is that if name is given arbitrary values (e.g. random UUIDs), then we could see an arbitrary increase in the number of keys, leading to mapping field explosion in Elasticsearch.

That's technically clearly possible, and would indeed be an issue. Honestly, I'd have expected the spec to say something about event name cardinality; all I can find is this part here:

It is also recommended that the event names have low-cardinality, so care must be taken to use fields that identify the class of Events but not the instance of the Event.

With that, I'd say that if the user follows the spec, there should not be a cardinality explosion. On the other hand, the events API easily allows this, so I think you raise a good point here.

Author

awangc commented Jan 13, 2025

@gregkalapos it seems the otel mapping mode constructs indices based on different attributes, and span events are sent to a logs- index. Is there a way to change this? For example, could an attribute like elasticsearch.index.prefix change the prefix for the trace index? This is so that we can group all related data under the same prefix.

EDIT: I removed a section of this comment that said I could not get this mapping mode to work. That was because I was on an old version of Elasticsearch; I'm currently on 8.15.5 and it is working.

Contributor

carsonip commented Jan 15, 2025

Looking at the code, elasticsearch.index.prefix should work, but documents will then be sent to a fallback index name which is a concatenation of prefix, configured index name, and suffix.
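The fallback naming described above can be sketched as follows. This is not the exporter's code: the function `fallback_index_name` is hypothetical, plain string concatenation without a separator is an assumption, and the `elasticsearch.index.suffix` attribute name is assumed by symmetry with `elasticsearch.index.prefix`.

```python
def fallback_index_name(configured_index, attributes):
    """Hypothetical sketch: prefix + configured index name + suffix."""
    prefix = attributes.get("elasticsearch.index.prefix", "")
    # Attribute name assumed by symmetry with the prefix attribute.
    suffix = attributes.get("elasticsearch.index.suffix", "")
    return f"{prefix}{configured_index}{suffix}"


index_name = fallback_index_name(
    "traces",
    {"elasticsearch.index.prefix": "myteam-", "elasticsearch.index.suffix": "-prod"},
)
print(index_name)
```

The point of the sketch is only that the result is a single fixed index name built from the configured index, not a data stream name derived from the signal type.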

OTel mode is designed to send to ES 8.16+, and with the default setup, a lot of the curated UI in Kibana e.g. APM UI will work out of the box. Changing the index name will certainly break that functionality.

Hence my question: what's the use case for sending span events to a different data stream type? Do you usually query the documents using an API, or in Discover? How does sending span events to a logs- data stream as opposed to something else affect your workflow?

Valid data stream types in ECS are logs, metrics and traces. Since span events are neither metrics nor traces, they can only be logs, so that they work well in the APM UI out of the box.
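For illustration, here is a small sketch of the type-dataset-namespace data stream naming scheme, restricted to the three types listed above. The function `data_stream_name` and the dataset and namespace values are made up for the example and are not the exporter's defaults.

```python
# Types as listed in the comment above.
VALID_TYPES = {"logs", "metrics", "traces"}


def data_stream_name(stream_type, dataset, namespace):
    """Build a <type>-<dataset>-<namespace> data stream name."""
    if stream_type not in VALID_TYPES:
        raise ValueError(f"unsupported data stream type: {stream_type!r}")
    return f"{stream_type}-{dataset}-{namespace}"


# Hypothetical dataset and namespace values.
span_event_stream = data_stream_name("logs", "myapp", "default")
print(span_event_stream)
```

Under this scheme a custom leading prefix such as "myteam-" takes the place of the type, which is why it no longer lines up with what the curated UI expects.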

Author

awangc commented Jan 21, 2025

@carsonip Actually, I'm trying a custom solution that is not based on Elastic APM 😄 , thus my desire to be able to configure the index prefix. In our logging solution we do not use logs- prefix, but have our custom prefix, partly so that we don't interfere with any solutions predefined by Elastic.

If otel mode is designed to support APM, how about changing the schema of the default mode for span events, since these are currently stored in the same index as the traces?


3 participants