Add conventions for attribute names #807

tigrannajaryan · 2020-08-14T19:32:15Z

Resolves: #726

SergeyKanzhelev · 2020-08-14T19:36:51Z

specification/common/common.md

+
+### Attribute Naming
+
+Attribute name (also known as the "attribute key") MUST be a valid Unicode


let's limit to printable characters and forbid control ones?

We are opening a different can of worms here, which is #504

@SergeyKanzhelev I wanted to place the minimal restrictions here, then refine them slightly differently in the 2 sub-sections below (and for semantic conventions the limitations are precisely as you listed - only printable ASCII characters). I can remove this sentence since it is somewhat superfluous given additional restrictions listed later in the document.

@Oberon00 I'd like to avoid addressing #504 in this same PR. They are related but don't need to be solved at the same time.

I agree, @tigrannajaryan. I think the restriction of being valid Unicode is OK for a start although maybe even that should be dropped from the PR to not get lost in that discussion.

specification/common/common.md

SergeyKanzhelev · 2020-08-14T19:42:54Z

specification/common/common.md

+
+- Semantic conventions MUST limit attribute names to
+  [U+0021 .. U+007E](https://en.wikipedia.org/wiki/Basic_Latin_(Unicode_block)#Table_of_characters)
+  subset of Unicode code points only. It is recommended to further limit


what is unicode code points? Would be nice to have human readable text here

I think "the ASCII subset of Unicode" would be more clear. Not sure if it means exactly the same?

@SergeyKanzhelev Code point is an official term used by Unicode: https://www.unicode.org/versions/Unicode13.0.0/ch02.pdf#G25564
I could use "character" but "code point" is the correct term in this context when referring to Unicode.

@Oberon00 ASCII is a wider range: [U+0000 .. U+007F]. That is not my intent. I want to exclude space and control characters, that's why a reference to that particular range.

@SergeyKanzhelev I reworded this sentence slightly, please see if it is clearer now.

this is so much better! Thank you

I know from some history that limiting to Latin is not very well recieved from globalization standpoint. But I think we need to state this as a recommendation as it avoids a slew of problems

@SergeyKanzhelev I think it is still the right thing to limit attribute names introduced by OpenTelemetry itself to Latin as it seems to be the common denominator in the programming world. For custom attributes that developers may use in their applications we only recommend Latin, but do not require it so people are still able to use attributes in any language supported by Unicode.

Oberon00 · 2020-08-14T19:53:46Z

specification/common/common.md

+- All attributes names that are part of OpenTelemetry semantic conventions
+  SHOULD be part of a namespace. 
+
+- When coming up with a new convention make sure to check existing namespaces


Are keys supposed to be unique across all three of these? #739 (comment) seems to not adhere to this principle.

That's a good question. I am inclined to say "yes", simply to avoid confusion that will arise if there are attibutes with the same name for a Resource and a Span which mean different things.

Created an issue #815

specification/common/common.md

anuraaga

Thanks! These definitely make it easier to come up with attribute names.

anuraaga · 2020-08-15T03:04:04Z

specification/common/common.md

+
+- The attribute is specific to your application. It is recommended to prefix the
+  attribute name by your application name, provided that the application name is
+  resonably unique and is not a generic enough word (e.g.


Is it important for the app name to be generally unique? If a company has tracing infrastructure shared by many teams, presumably the name just has to not collide with other teams in the company.

It needs to be unique within the scope where it is intended to be used, unique within the set of applications that can be used together.

Hopefully "reasonably unique" reflects the intent, if not I can clarify.

Reasonably unique by itself I think is fine, the following "generic enough word" is where it seems to overspecify to me. If a generic word is unique in the context of that user, that's probably fine.

anuraaga · 2020-08-15T03:06:24Z

specification/common/common.md

+  resonably unique and is not a generic enough word (e.g.
+  `myuniquemapapp.longitude` is likely fine). Make sure the application name
+  does not clash with an existing semantic convention namespace. If in doubt
+  prefix it with your company's reverse domain name.


I think it's rare for traces to have conflicts with traces from other companies, but there may be multiple teams at a company using the same tracing system. Can a company reverse domain name ever help disambiguating attributes?

I was thinking about the scenario where a popular application is created by a company and that application may be used outside the company as well. You are right that for purely internal applications this is likely not necessary.

Note that you may also be applying this guidelines for libraries. Even if it's a public OSS library, it may have very specific attributes that do not warrant standardizing.

From what I understand, this section is about apps, the end users, not libraries which are closer to instrumentation. It is confusing, perhaps there needs to be a section about libraries specifically, perhaps "OpenTelemetry authors" can be made more general to cover general libraries / instrumentation. But I would encourage a general stance of giving end users, presumably the final app developers, more freedom since their UX is what matters in the end. For example, if most of their queries require latitude anyways, there doesn't seem to be a big reason to prevent them from using the latitude attribute by itself unprefixed, since it's what they need. "Do what makes your job easier" seems to be missing from this section right now IMO.

Reworded, PTAL.

Reworded one more time to account for @anuraaga 's comment on giving more freedom.

anuraaga · 2020-08-15T03:09:02Z

specification/common/common.md

+  words by underscores (i.e. use snake_case). For example `http.status_code`
+  denotes the status code in the http namespace.
+
+- Attribute names SHOULD NOT coincide with namespaces. For example if


An example just came up which may describe this even more clearly. You can't have attribute service.version and a clarifying attribute service.version.type it needs to be something like service.version.id.

anuraaga

Thanks!

Oberon00

LGTM but I think #807 (comment) could be controversial.

Oberon00 · 2020-08-17T15:16:36Z

specification/common/common.md

+  [Resources](https://github.com/open-telemetry/opentelemetry-specification/tree/master/specification/resource/semantic_conventions),
+  [Spans](https://github.com/open-telemetry/opentelemetry-specification/tree/master/specification/trace/semantic_conventions),
+  and
+  [Metrics](https://github.com/open-telemetry/opentelemetry-specification/tree/master/specification/metrics/semantic_conventions)


Maybe explicitly state somewhere that label keys are equivalent to attribute keys in the scope of this document.

Submitted #821

We earlier defined naming rules for attributes, however we do not have similar rules for metric labels. This commit extends the exact same set of rules to metric labels. This was brought up in this comment open-telemetry#807 (comment)

garretwilson · 2020-08-19T16:32:10Z

As the person who raised #726, I'm really happy that some work has been done to clarify this area a bit.

It's a little disappointing that it seems that you're trying to invent some new system on the fly however. Experts in the field have spent decades refining naming systems. For example, the ECMAScript (JavaScript) specification already goes into depth on identifier syntax; this has been developed for years and tested extensively in the field. Was this specification or any other specification consulted?

The discussion and resulting prose indicate some confusion on the fundamentals of Unicode and text representation. For example, limiting of attribute names indicates "code points" when it appears "Unicode Category" was intended. (If you want a crash course on Unicode you might be interested in my lesson on the subject.) And even here the definition is far from rigorous: can I start a name with a number? Can I start a name with an underscore? Can I start a name with a dot??!

That you are essentially limiting names to a subset of ASCII is concerning, placing your specification back around the mid 1990s in its level of syntax sophistication and international support. Again I think it would have been helpful to consult similar metadata and naming specifications, where experts have pored over such details for many years.

Lastly the use of a dot . as a namespace delimiter can cause problems. My guess is that you were following the lead of the Java package delimiter, but the dot is so overloaded that other naming systems use other delimiters. For example XML/HTML/RDF CURIE uses : as a namespace prefix delimiter, and HTML5 with its data- attributes seems to be leaning towards the dash - character as a type of namespace delimiter. Many systems do not allow dots in tokens (what if the tracing library logs to a monitoring system that queries for variable names?), and in some systems the dot is a special character indicating class-property relationship (such as Widget.id).

Let me give you an example using a query language. Configuration files often use the form foo.bar to look up resource keys in Java Properties files or YAML files for instance. The dot represents a level indicator between two tokens, such as in OGNL. (See a hierarchical configuration library as another example.) So for example if I were to have an outer span named outer with a child span named inner, I might want to query for that using outer.inner. But if each of those spans uses . for namespacing such as foo.outer and bar.inner, querying for inner using a dot with foo.outer.bar.inner would be ambiguous. I'm just saying that maybe a dot is not the best choice of a namespace delimiter within a token, based upon common usage of delimiters in modern naming and querying systems.

In any case this is certainly a great step forward and I hope you to keep improving it. I would encourage you look into existing work in the field and not just try to invent everything from scratch in your tiny area of technology, because distributed tracing actually has the potential to interact with many naming and metadata systems across applications.

tigrannajaryan · 2020-08-19T17:17:52Z

@garretwilson thank you for the comment. Further improvements, especially in the form of PRs are welcome. My PR primarily legalized implicit rules that semantic conventions were already following.

* Extend attribute naming rules to metric labels We earlier defined naming rules for attributes, however we do not have similar rules for metric labels. This commit extends the exact same set of rules to metric labels. This was brought up in this comment #807 (comment) * Address PR comments

tedsuo · 2020-08-19T18:12:17Z

Hi @garretwilson, thank you for the feedback!

The dot notation for key names has been a common convention in distributed tracing for years at this point, so unfortunately it is probably here to stay for the foreseeable future (we care a lot about backwards compatibility with existing instrumentation, so dots in the keys will be a gotcha for many years). If there is a better way to do this, we would want it done before GA . But please accept that we may just have to live with dots, albeit hopefully in a more structured and organized fashion. 🙏

Something we are actively interested is adopting/merging with is the Elastic Common Schema. If we could clean up our information architecture to match their conventions and existing definitions, that would be really helpful. (@tigrannajaryan please correct me if something has changed on this front in the Logging SIG). @garretwilson is that something you would be interested in taking on?

* Extend attribute naming rules to metric labels We earlier defined naming rules for attributes, however we do not have similar rules for metric labels. This commit extends the exact same set of rules to metric labels. This was brought up in this comment open-telemetry/opentelemetry-specification#807 (comment) * Address PR comments

* Add conventions for attribute names Resolves: open-telemetry#726 * Address PR comments * Re-word company/application specific attribute recommendations

Add conventions for attribute names

276a395

Resolves: open-telemetry#726

tigrannajaryan requested review from a team and SergeyKanzhelev August 14, 2020 19:32

tigrannajaryan added the spec:miscellaneous For issues that don't match any other spec label label Aug 14, 2020

SergeyKanzhelev reviewed Aug 14, 2020

View reviewed changes

specification/common/common.md Outdated Show resolved Hide resolved

SergeyKanzhelev reviewed Aug 14, 2020

View reviewed changes

specification/common/common.md Outdated Show resolved Hide resolved

SergeyKanzhelev reviewed Aug 14, 2020

View reviewed changes

SergeyKanzhelev approved these changes Aug 14, 2020

View reviewed changes

SergeyKanzhelev self-assigned this Aug 14, 2020

Oberon00 reviewed Aug 14, 2020

View reviewed changes

specification/common/common.md Outdated Show resolved Hide resolved

tigrannajaryan force-pushed the feature/tigran/attribute-names branch from e8e6066 to 1d9f7a4 Compare August 14, 2020 23:08

tigrannajaryan requested a review from a team August 14, 2020 23:08

tigrannajaryan force-pushed the feature/tigran/attribute-names branch from 1d9f7a4 to f014f61 Compare August 14, 2020 23:19

anuraaga approved these changes Aug 15, 2020

View reviewed changes

anuraaga mentioned this pull request Aug 16, 2020

conventions for names such as attribute names #726

Closed

Oberon00 added the area:semantic-conventions Related to semantic conventions label Aug 17, 2020

This was referenced Aug 17, 2020

Semantic conventions for Dependency Injection #813

Open

Add semantic conventions for HTTP metrics #739

Merged

Address PR comments

64d8449

tigrannajaryan force-pushed the feature/tigran/attribute-names branch from f014f61 to 64d8449 Compare August 17, 2020 14:42

anuraaga approved these changes Aug 17, 2020

View reviewed changes

Re-word company/application specific attribute recommendations

d26c821

tigrannajaryan force-pushed the feature/tigran/attribute-names branch from 6ccdd82 to d26c821 Compare August 17, 2020 15:09

Oberon00 approved these changes Aug 17, 2020

View reviewed changes

Oberon00 reviewed Aug 17, 2020

View reviewed changes

SergeyKanzhelev merged commit 671f5ab into open-telemetry:master Aug 17, 2020

tigrannajaryan deleted the feature/tigran/attribute-names branch August 17, 2020 17:27

This was referenced Aug 17, 2020

Add semantic conventions for thread.name and thread.id span attributes #789

Merged

Decide if semantic conventions require that attribute names across all areas should be unique #815

Closed

tigrannajaryan mentioned this pull request Aug 17, 2020

Extend attribute naming rules to metric labels #821

Merged

jmacd mentioned this pull request Aug 19, 2020

Metrics API: instrument name and label name guidelines and limitations #387

Closed

tigrannajaryan mentioned this pull request May 19, 2023

Clarification: metric namespaces are not allowed to be metric names #3476

Closed


		### Attribute Naming

		Attribute name (also known as the "attribute key") MUST be a valid Unicode

Add conventions for attribute names #807

Add conventions for attribute names #807

Conversation

tigrannajaryan commented Aug 14, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Oberon00 Aug 14, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

anuraaga left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tigrannajaryan Aug 17, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

anuraaga left a comment

Choose a reason for hiding this comment

Oberon00 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

garretwilson commented Aug 19, 2020 • edited Loading

tigrannajaryan commented Aug 19, 2020

tedsuo commented Aug 19, 2020 • edited Loading

Oberon00 Aug 14, 2020 •

edited

Loading

tigrannajaryan Aug 17, 2020 •

edited

Loading

garretwilson commented Aug 19, 2020 •

edited

Loading

tedsuo commented Aug 19, 2020 •

edited

Loading