Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process semantic conventions are potentially unclear/inconsistent #129

Open
joaopgrassi opened this issue Jun 23, 2023 · 5 comments
Open
Assignees
Labels
invalid This doesn't seem right question Further information is requested

Comments

@joaopgrassi
Copy link
Member

joaopgrassi commented Jun 23, 2023

The PR "Add container semantic conventions resource" (#39) introduced the attributes container.command, container.command_line and container.command_args.

These attributes in the PR had a "note" saying

If using embedded credentials or sensitive data, it is recommended to remove them to prevent potential leakage

This fact triggered some discussions in the PR about the requirement level of such fields. It was pointed out that the process semconv define the same attributes, which are conditionally required, so the PR was adapted to follow it.

Upon looking at both things, I think we may have a few problems in the process semconv.

  1. I believe the concerns of such attributes having sensitive data is valid but in the process semconv file, nothing of the sort is mentioned
  2. The process semconv define those attributes are "conditionally required" but the "condition" is very vague and confusing - I marked them with A and B to be able to reference here
A) Additional attribute requirements: At least one of the following sets of attributes is required:

    process.executable.name
    process.executable.path
    process.command
    process.command_line
    process.command_args

B) Between `process.command_args` and `process.command_line`, usually `process.command_args` 
should be preferred. On Windows and other systems where the native
format of process commands is a single string, `process.command_line` can additionally (or instead) be used.

A): It lists all attributes and say at least one of them is required. Not sure if it's just me but I don't think this "condition" is good enough. What instrumentations should record? The first one it has? Is the list there in order of "priority" or "importance"

B:) It then gives actual directions, but only for process.command_args and process.command_line. The rest is still left without any guidance. Should they be added? Should they not?

In one of the discussions, it was brought up that they should probably be Opt-in (due to the potential sensitive data in such attributes), and I think I agree with this.

It would solve both the "sensitive information" issue (as users need to opt-in into them and they should know when it's safe or not) AND solves also the confusing "conditionally required" guidelines we have today. Basically all being opt-in allows the USER to pick which one they think it's more important/relevant for them.

CC @jsuereth @jamesmoessis @lmolkova @marcsanmi

@joaopgrassi joaopgrassi added invalid This doesn't seem right question Further information is requested labels Jun 23, 2023
@Oberon00
Copy link
Member

A): It lists all attributes and say at least one of them is required. Not sure if it's just me but I don't think this "condition" is good enough.

I think it is good enough and exactly as intended. You overlooked some attributes: There are some it does not list (e.g. process.pid). The intention here is to ensure we have some data about the executable, at least the base name of which will be in any of the listed attributes (not accounting for exotic cases where you override argv[0])

@Oberon00
Copy link
Member

The rest is still left without any guidance. Should they be added? Should they not?

This guidance would be a nice to have IMHO. I don't think the current conventions are inconsistent.

@joaopgrassi
Copy link
Member Author

joaopgrassi commented Jun 23, 2023

The intention here is to ensure we have some data about the executable, at least the base name of which will be in any of the listed attributes

Yes, that is clear, but shouldn't we have some sort of deterministic guidance on which one to report? I feel the way it is today, each instrumentation can report a different attribute and it's not going to be great. Like the guidance I called "B" is an example of something predictable.

@joaopgrassi
Copy link
Member Author

joaopgrassi commented Jun 23, 2023

I don't think the current conventions are inconsistent.

Sure, this one is not maybe just lacking a bit more guidance. But if the PR I linked is merged, we will have two command, command_args (Ok under different namespaces but still..) attributes that model pretty much the same thing but will be reported/recorded in different ways, with different guidelines. Maybe these should even be split out and shared like the HTTP ones?

@Oberon00
Copy link
Member

We are lacking the prefix concept from ECS, unfortunately, which could be helpful here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
invalid This doesn't seem right question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants