Define low/high-cardinality #2996
Comments
Thanks for creating the issue, @lmolkova! Some context on why I commented that in the PR: I often speak with former colleagues from when I was a "full-time back-end developer". I ask them to try OTel and tell me their pain points and their general impression of the spec. One thing that always comes up is cardinality. None of them had much idea what it was or, even worse, how to tell whether the things they are instrumenting/recording suffer from high cardinality.

Plus, during the messaging SIG meetings, the topic of high cardinality has come up multiple times, for example when we discussed span names and what to use for them. I remember us going through the usual "can't use this because it's high-cardinality", and immediately afterwards people asking: but why not? Why/where is the problem with approach x?

I thought about it and have some ideas, so I will just "dump" them here. What I have in mind is either a completely new page or a section somewhere (e.g., the glossary) with a structure like this:

Cardinality
Goals: Explain "cardinality" in a general, easy-to-grasp way. For example, I found this one for SQL well structured. I would try to refrain from using complex mathematical definitions, as they don't help newcomers understand it.

Why is high cardinality a problem?
Goals: Explain what high cardinality ultimately causes for users, with clear and easy-to-understand examples.

High cardinality in traces
Goals: Explain with examples why it's a problem for traces.

High cardinality in metrics
Goals: Explain with examples why it's a problem for metrics.

How do I achieve low cardinality?
Goals: Here we can give best practices on how to achieve this. For example, mentioning one should consider

Curious to see what the community thinks about this. :)
Related: open-telemetry/semantic-conventions#205 (comment)

Low-cardinality requirements apply to collection and storage, but query-time cardinality could also be important for user experience (for example,
The TAG Observability white paper has definitions/explanations of metric cardinality: https://github.com/cncf/tag-observability/blob/whitepaper-v1.0.0/whitepaper.md#metric-cardinality. Maybe we could borrow things from there to finally fix this?
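To make the metric side of the discussion concrete, here is a minimal sketch of why one extra attribute can blow up metric cardinality. It assumes the worst case, where every combination of attribute values actually occurs, so the number of time series is bounded by the product of the distinct values per attribute. The attribute names and counts are hypothetical and are not taken from the white paper or the spec.

```python
from math import prod


def max_series(distinct_values_per_attribute: dict[str, int]) -> int:
    """Upper bound on time series for one metric: the product of the
    number of distinct values each attribute can take."""
    return prod(distinct_values_per_attribute.values())


# Hypothetical attributes with a small, bounded set of values.
low = {"http.method": 5, "http.status_code": 10}

# Same metric, plus one unbounded-looking attribute (hypothetical).
high = {**low, "user.id": 100_000}

print(max_series(low))   # 50
print(max_series(high))  # 5000000
```

The bound is only reached if all combinations show up in practice, but it illustrates why attributes such as user or request identifiers are usually the ones called "high-cardinality".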
What are you trying to achieve?
Currently, we recommend using low-cardinality span names in all trace conventions.
It would be great to have a definition of cardinality and the idea of what low and high mean so we can refer to it from different semantic conventions.
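As a rough illustration of what "low" and "high" could mean for span names, the sketch below counts the distinct names produced by two naming schemes for the same set of requests. The paths, the route template, and the helper name `cardinality` are made up for the example; this is not a definition from the spec.

```python
def cardinality(values) -> int:
    """Cardinality here simply means the number of distinct values."""
    return len(set(values))


# Hypothetical traffic: 1000 users hitting the same endpoint.
paths = [f"/users/{user_id}/orders" for user_id in range(1000)]

# Span name built from the raw path: one distinct name per user.
raw_names = [f"GET {path}" for path in paths]

# Span name built from the route template: one name for all users.
template_names = ["GET /users/{id}/orders" for _ in paths]

print(cardinality(raw_names))       # 1000
print(cardinality(template_names))  # 1
```

The first scheme grows with the number of users, while the second stays constant no matter how much traffic there is, which is the property the "low-cardinality span name" recommendation is after.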
Additional context.
It's partially explained today in the metrics supplemental guidelines and the trace API.
Originally posted by @joaopgrassi in #2957 (comment)