Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BREAKING: Fix server.address vs all other address attributes inconsistency #290

Conversation

Oberon00
Copy link
Member

@Oberon00 Oberon00 commented Aug 29, 2023

⚠️ (Likely superseded by #302)


Fixes #251

Many conventions reference the very recently introduced server.address to store the server domain name, or maybe actually the server IP? Who knows... Now you do again (well at least you could know, actually I always left the alternative open and did not check if it previously was peer/host.name/ip, i.e. tried to keep the relaxation done through the intermediate attribute renaming).

We have many address attributes, all of which contain the address (server.socket.address, client.address, client.socket.address, source.address, destination.address), only server.address stood out as also allowing a host name (aka domain). This PR fixes that inconsistency.

Changes

  • BREAKING: Split server.domain from server.address, no longer allow domain name
    in address.

Merge requirement checklist

@Oberon00 Oberon00 requested review from a team August 29, 2023 14:11
- ref: server.port
requirement_level:
conditionally_required: See below
constraints:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this, since this was already meaningless since one of the any_of attributes was already "required".

@jsuereth
Copy link
Contributor

Is there anything this breaks that's not already behind the HTTP semconv migration flag?

@trask
Copy link
Member

trask commented Aug 31, 2023

In the realm of http semconv, the main downside as I see it to this change is that http client spans (and http client metrics) will no longer have a single attribute which represents the "best known server address", and now you always have to check two different attributes (server.address and server.domain`).

Even previously, net.peer.name was the "best known server address" attribute, and it could be either an ip address or domain name.

This change also differs from the ECS definition of server.address which can be either an ip address or domain name.

Could your underlying concern be addressed instead by pulling in server.domain and server.ip from ECS, and making those opt-in from HTTP semconv perspective? (e.g. opt-in to copying the value over from server.address to server.domain or server.ip as appropriate)

@Oberon00
Copy link
Member Author

Oberon00 commented Aug 31, 2023

In the realm of http semconv, the main downside as I see it to this change is that http client spans (and http client metrics) will no longer have a single attribute which represents the "best known server address", and now you always have to check two different attributes (server.address and server.domain`).

server.domain may contain an address, and I did not change any requirement levels. So server.domain will contain the best known server domain or address as fallback. The other way round (preferring the domain in address and allowing an address in domain at the same time even in combination) does not make sense to me naming-wise but technically would work the same.

Regarding server.ip: We have seen that using Unix domain sockets, especially in Node.js environments is not too uncommon. So when you have a Node.js client that talks to a local HTTP server via a unix domain socket as transport layer, you would have to store the logical hostname in the address and then the path to the socket file has no place anymore, because it neither fits into ip nor domain. Or it would effectively muddle the distinction so that address = preferred name or address, domain = alternative name or address whatever that is in each context.

What I propose:

Name's level ➡️ Logical Resolved Best
Connection's level in protocol stack ⬇️
High-level domain address¹ domain (logical always preferred)
Low-level socket.domain socket.address -

What we have now:

Name's level ➡️ Logical Resolved Best
Connection's level in protocol stack ⬇️
High-level mixed mixed, sometimes missing address (sometimes logical name is forbidden)
Low-level socket.domain socket.address -

Proposal with ip:

Name's level ➡️ Logical Resolved Best
Connection's level in protocol stack ⬇️
High-level domain¹ mixed, sometimes missing for non-IP address (sometimes logical name is forbidden)
Low-level socket.domain socket.address -

(¹: Attribute may be missing but the information contained in the other of resolved/logical)

@pyohannes
Copy link
Contributor

pyohannes commented Aug 31, 2023

server.domain may contain an address, and I did not change any requirement levels. So server.domain will contain the best known server domain or address as fallback. The other way round (preferring the domain in address and allowing an address in domain at the same time even in combination) does not make sense to me naming-wise but technically would work the same.

In my opinion, server.domain containing an IP address doesn't make sense either. An IP address is not a domain. Also, it seems a bit counter-intuitive that the "Unix domain socket name" can be put into server.address,

What @Oberon00 proposes:

server.address IP address, or unix socket
server.domain domain name, or IP address

The ECS approach, as mentioned by @trask:

server.address domain name, IP address, or unix socket
server.ip IP address
server.domain domain name

The latter one seems more consistent to me, and could be applied to server, client, destination, and source as a non-breaking change. For server.socket and client.socket, likely only address and ip make sense.

@Oberon00
Copy link
Member Author

For server.socket and client.socket, likely only address and ip make sense.

We already have domain there and it makes sense for situations where logical server/client is distinct from lower-level one (e.g. socket.domain would be set to the domain name of the configured HTTP proxy, and domain the logical HTTP server domain name)

@Oberon00
Copy link
Member Author

For server.socket and client.socket, likely only address and ip make sense.

I think it makes a little bit more sense than the other way round, as you can pass an IP to name resolution APIs and get it back as-is usually.

Also, it seems a bit counter-intuitive that the "Unix domain socket name" can be put into server.address,

This is already the current definition and it should be intuitive for anyone using domain sockets. The filepath is on the same level as the IP for IP-based communication. Take a look at the POSIX sockaddr types here: https://man7.org/linux/man-pages/man3/sockaddr.3type.html

  • An inet address consists of IP + port.
  • A unix domain socket address consists of the path.

@pyohannes
Copy link
Contributor

Also, it seems a bit counter-intuitive that the "Unix domain socket name" can be put into server.address,

This is already the current definition and it should be intuitive for anyone using domain sockets.

You're right, I'll strike that from my original argument. However, to me the ECS approach still seems preferable.

@Oberon00
Copy link
Member Author

As long as server.address is no longer different from all other address attributes, I'm mostly happy 😃

@trask
Copy link
Member

trask commented Sep 1, 2023

As long as server.address is no longer different from all other address attributes, I'm mostly happy 😃

got it, thanks

@lmolkova let's discuss when you're back next week, thx

@@ -59,6 +59,8 @@ release.
([#252](https://github.com/open-telemetry/semantic-conventions/pull/252))
- Simplify HTTP metric briefs.
([#276](https://github.com/open-telemetry/semantic-conventions/pull/276))
- BREAKING: Split server.domain from server.address, no longer allow domain name
in address.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
in address.
in address.
([#290](https://github.com/open-telemetry/semantic-conventions/pull/290))

@@ -63,6 +63,10 @@ versions:
- jvm.buffer.usage
- jvm.buffer.limit
- jvm.buffer.count
- rename_attributes:
attribute_map:
net.host.name: server.domain # This is re-renamed from the 1.21.0 renaming to server.address
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is still a WIP? Also need to add the change to the affected metrics.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not intentionally a WIP. Any way to check the syntax of this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the validity you mean? I'm not aware of any such tool tbh. But I'm referring that we need to add attributes renaming for metrics https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/schemas/file_format_v1.1.0.md#rename_attributes-transformation-2.

The problem is that it's not really a rename, but a "split".. doesn't seem we have such a concept for attributes alone.

Also, if I got it right net.host.name was renamed in 1.21, so we already have that entry in the schema file. Does it make sense to have it it again? It's also not correct that server.address: server.domain because we have both now.. Not sure how to handle this. @tigrannajaryan maybe knows more?

@lmolkova
Copy link
Contributor

lmolkova commented Sep 5, 2023

As long as server.address is no longer different from all other address attributes, I'm mostly happy 😃

is there anything beyond symmetry that's driving this change? What's the benefit here?

Copy link
Contributor

@lmolkova lmolkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change we have server.address and server.socket.address both representing server IP and it's not clear which one would be used.

Clients would know at most one IP/socket address and won't need two attributes.
HTTP servers report server.address as opt-in and would be better using resource attributes anyway.

So I see how this introduces confusion and duplication, but most importantly, I don't understand the problem we're trying to solve here (asymmetry arguably is not a problem on it's own and is expected since client-server communication is asymmetrical)

@Oberon00
Copy link
Member Author

Oberon00 commented Sep 6, 2023

I'm fine with replacing this by #302. Keeping it open for now, it should be closed by merge of #302.

@arminru arminru closed this in #302 Sep 12, 2023
@Oberon00 Oberon00 deleted the server-addr-consistency branch September 12, 2023 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Should domain name be allowed in client.address?
6 participants