Add more ECS url fields #181
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -30,6 +30,15 @@ This document defines semantic conventions that describe URL and its components. | |
| `url.path` | string | The [URI path](https://www.rfc-editor.org/rfc/rfc3986#section-3.3) component [2] | `/search` | Recommended | | ||
| `url.query` | string | The [URI query](https://www.rfc-editor.org/rfc/rfc3986#section-3.4) component [3] | `q=OpenTelemetry` | Recommended | | ||
| `url.fragment` | string | The [URI fragment](https://www.rfc-editor.org/rfc/rfc3986#section-3.5) component | `SemConv` | Recommended | | ||
| `url.registered_domain` | string | The highest registered url domain, stripped of the subdomain. [4] | `example.com` | Opt-In | | ||
| `url.subdomain` | string | The subdomain portion of a fully qualified domain name includes all of the names except the host name under the registered_domain. In a partially qualified domain, or if the qualification level of the full name cannot be determined, subdomain contains all of the names below the registered domain. [5] | `east` | Opt-In | | ||
| `url.top_level_domain` | string | The effective top level domain (eTLD), also known as the domain suffix, is the last part of the domain name. For example, the top level domain for example.com is "com". [6] | `co.uk` | Opt-In | | ||
| `url.username` | string | Username of the request. | `user42` | Opt-In | | ||
| `url.password` | string | Password of the request. | `changeme` | Opt-In | | ||
| `url.extension` | string | The field contains the file extension from the original request url, excluding the leading dot. [7] | `png` | Opt-In | | ||
| `url.domain` | string | Domain of the url, such as `www.opentelemetry.io`. [8] | `www.opentelemetry.io` | Opt-In | | ||
This is redundant with server.address (or server.domain as proposed in #290).
@Oberon00 Thanks for the review! I'd like to use your comment to address two aspects here: one that is directly related to this proposal here and one general aspect. This
|
||
| `url.port` | int | Port of the request | `9090` | Opt-In | | ||
This is redundant with server.port.
||
| `url.original` | string | Unmodified original URL as seen in the event source. [9] | `https://www.opentelemetry.io/search/?q=container` | Opt-In | | ||
It is unclear how url.full should be modified.
||
|
||
**[1]:** For network calls, URL usually has `scheme://host[:port][path][?query][#fragment]` format, where the fragment is not transmitted over HTTP, but if it is known, it should be included nevertheless. | ||
`url.full` MUST NOT contain credentials passed via URL in form of `https://username:password@www.example.com/`. In such case username and password should be redacted and attribute's value should be `https://REDACTED:REDACTED@www.example.com/`. | ||
|
@@ -38,6 +47,23 @@ This document defines semantic conventions that describe URL and its components. | |
**[2]:** When missing, the value is assumed to be `/` | ||
|
||
**[3]:** Sensitive content provided in query string SHOULD be scrubbed when instrumentations can identify it. | ||
|
||
**[4]:** For example, the registered domain for "foo.example.com" is "example.com". | ||
This value can be determined precisely with a list like the public suffix list (`http://publicsuffix.org`). Trying to approximate this by simply taking the last two labels will not work well for TLDs such as "co.uk". | ||
|
||
**[5]:** For example the subdomain portion of `www.east.mydomain.co.uk` is "east". If the domain has multiple levels of subdomain, such as `sub2.sub1.example.com`, the subdomain field should contain "sub2.sub1", with no trailing period. | ||
|
||
**[6]:** This value can be determined precisely with a list like the public suffix list (`http://publicsuffix.org`). Trying to approximate this by simply taking the last label will not work well for effective TLDs such as `co.uk`. | ||
|
||
**[7]:** The file extension is only set if it exists, as not every url has a file extension. | ||
The leading period must not be included. For example, the value must be "png", not ".png". | ||
Note that when the file name has multiple extensions (example.tar.gz), only the last one should be captured ("gz", not "tar.gz"). | ||
|
||
**[8]:** In some cases a URL may refer to an IP and/or port directly, without a domain name. In this case, the IP address would go to the domain field. | ||
If the URL contains a literal IPv6 address enclosed by [ and ] (IETF RFC 2732), the [ and ] characters should also be captured in the domain field. | ||
|
||
**[9]:** Note that in network monitoring, the observed URL may be a full URL, whereas in access logs, the URL is often just represented as a path. | ||
This field is meant to represent the URL as it was observed, complete or not. | ||
<!-- endsemconv --> | ||
|
||
## Sensitive information | ||
|
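To make the notes above more concrete, here is a minimal Python sketch of how an instrumentation might derive a few of the proposed attributes (`url.domain`, `url.port`, `url.username`, `url.password`, `url.extension`) from a URL string. The helper name `split_url_fields` is hypothetical and not part of this PR or of any OpenTelemetry SDK. It relies only on the standard library's `urllib.parse` and `posixpath`, and it assumes the input is an absolute URL with an authority component. It does not attempt `url.registered_domain`, `url.subdomain` or `url.top_level_domain`, which need a public suffix list.

```python
import posixpath
from urllib.parse import urlsplit


def split_url_fields(url: str) -> dict:
    """Hypothetical helper: derive some of the proposed url.* attributes."""
    parts = urlsplit(url)
    fields = {}

    # Drop any "user:password@" prefix, then separate the host from the port.
    hostport = parts.netloc.rpartition("@")[2]
    if hostport.startswith("["):
        # Literal IPv6 address: keep the enclosing [ and ] as note [8] asks.
        domain = hostport[: hostport.index("]") + 1]
    else:
        domain = hostport.partition(":")[0]
    if domain:
        fields["url.domain"] = domain

    if parts.port is not None:
        fields["url.port"] = parts.port
    if parts.username is not None:
        fields["url.username"] = parts.username
    if parts.password is not None:
        fields["url.password"] = parts.password

    # Only the last extension is kept, without the leading dot (note [7]).
    extension = posixpath.splitext(parts.path)[1].lstrip(".")
    if extension:
        fields["url.extension"] = extension

    return fields


example = split_url_fields(
    "https://user42:changeme@www.opentelemetry.io:9090/images/logo.png?q=container"
)
```

In this sketch, attributes that cannot be derived are simply omitted rather than set to an empty value, which matches the note that the extension "is only set if it exists". Applying the same rule to the other attributes is an assumption of the sketch, not something the PR states.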
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -37,3 +37,93 @@ groups: | |
type: string | ||
brief: 'The [URI fragment](https://www.rfc-editor.org/rfc/rfc3986#section-3.5) component' | ||
examples: ["SemConv"] | ||
- id: registered_domain | ||
requirement_level: opt_in | ||
type: string | ||
brief: > | ||
Even with that the table was kinda broken. I moved some content into the
||
The highest registered url domain, stripped of the subdomain. | ||
note: > | ||
For example, the registered domain for "foo.example.com" is "example.com". | ||
|
||
This value can be determined precisely with a list like the public suffix | ||
list (`http://publicsuffix.org`). Trying to approximate this by simply taking | ||
the last two labels will not work well for TLDs such as "co.uk". | ||
examples: [ "example.com" ] | ||
- id: subdomain | ||
requirement_level: opt_in | ||
type: string | ||
brief: > | ||
The subdomain portion of a fully qualified domain name includes all of | ||
the names except the host name under the registered_domain. In a partially | ||
qualified domain, or if the qualification level of the full name cannot | ||
be determined, subdomain contains all of the names below the registered domain. | ||
note: > | ||
For example the subdomain portion of `www.east.mydomain.co.uk` is "east". | ||
If the domain has multiple levels of subdomain, such as `sub2.sub1.example.com`, | ||
the subdomain field should contain "sub2.sub1", with no trailing period. | ||
examples: [ "east" ] | ||
- id: top_level_domain | ||
requirement_level: opt_in | ||
type: string | ||
brief: > | ||
The effective top level domain (eTLD), also known as the domain suffix, | ||
is the last part of the domain name. For example, the top level domain | ||
for example.com is "com". | ||
note: > | ||
This value can be determined precisely with a list like the public suffix list | ||
(`http://publicsuffix.org`). Trying to approximate this by simply taking the last | ||
label will not work well for effective TLDs such as `co.uk`. | ||
examples: [ "co.uk" ] | ||
- id: username | ||
Should we add
Hmm, I was not aware of those being deprecated.
Instrumentations also cover older http clients which would still contain
||
requirement_level: opt_in | ||
type: string | ||
brief: Username of the request. | ||
examples: [ "user42" ] | ||
- id: password | ||
requirement_level: opt_in | ||
type: string | ||
brief: Password of the request. | ||
examples: [ "changeme" ] | ||
- id: extension | ||
requirement_level: opt_in | ||
type: string | ||
brief: > | ||
The field contains the file extension from the original request url, | ||
excluding the leading dot. | ||
note: > | ||
The file extension is only set if it exists, as not every url has | ||
a file extension. | ||
|
||
The leading period must not be included. For example, the value must | ||
be "png", not ".png". | ||
|
||
Note that when the file name has multiple extensions (example.tar.gz), | ||
only the last one should be captured ("gz", not "tar.gz"). | ||
examples: [ "png" ] | ||
- id: domain | ||
requirement_level: opt_in | ||
type: string | ||
brief: > | ||
Domain of the url, such as `www.opentelemetry.io`. | ||
note: > | ||
In some cases a URL may refer to an IP and/or port directly, | ||
without a domain name. In this case, the IP address would go to the domain field. | ||
|
||
If the URL contains a literal IPv6 address enclosed by [ and ] (IETF RFC 2732), | ||
the [ and ] characters should also be captured in the domain field. | ||
examples: [ "www.opentelemetry.io" ] | ||
- id: port | ||
requirement_level: opt_in | ||
type: int | ||
brief: Port of the request | ||
examples: [ 9090 ] | ||
- id: original | ||
requirement_level: opt_in | ||
type: string | ||
brief: Unmodified original URL as seen in the event source. | ||
note: > | ||
Note that in network monitoring, the observed URL may be | ||
a full URL, whereas in access logs, the URL is often just represented as a path. | ||
|
||
This field is meant to represent the URL as it was observed, complete or not. | ||
examples: [ "https://www.opentelemetry.io/search/?q=container" ] |
We should definitely add some caveat here about sensitive information :)
I forget how we decided to call this out or if that's still TBD, but I'd love a note on these fields about sensitivity.
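As one possible illustration of the sensitivity concern, the following Python sketch shows how credentials embedded in a URL could be redacted before the value is recorded, following the `https://REDACTED:REDACTED@www.example.com/` form that note [1] prescribes for `url.full`. The function name `redact_credentials` is hypothetical. Replacing a username-only userinfo with a single `REDACTED` is an assumption of this sketch, since the spec text only covers the username-and-password case.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_credentials(url: str) -> str:
    """Hypothetical helper: mask userinfo embedded in a URL."""
    parts = urlsplit(url)
    userinfo, at, hostport = parts.netloc.rpartition("@")
    if not at:
        # No userinfo component, nothing to redact.
        return url
    # Assumption: a username without a password becomes a single REDACTED.
    redacted = "REDACTED:REDACTED" if ":" in userinfo else "REDACTED"
    return urlunsplit(parts._replace(netloc=f"{redacted}@{hostport}"))


safe_url = redact_credentials("https://username:password@www.example.com/")
```

A similar step could be considered for `url.original`, although that would conflict with its description as the unmodified URL. The sketch does not try to resolve that question.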