[receiver/hostmetrics/process] "error reading username for process ... error reading parent pid for process ... (pid 1): invalid pid 0" #14311
Comments
Pinging code owners: @dmitryax. See Adding Labels via Comments if you do not have permissions to add labels yourself.
Hi. The fact that per-process metrics are missing when using … Here is an example of metrics that we are currently scraping.
This is not possible anymore on recent versions.
Thank you for your time.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Yes, it does. The process scraper collects metrics per process running on the host, but, since you run the collector in a container, you have only one process. @MarioAlexis I'm trying to see what the problem is, and why you don't see metrics anymore. I see that you report different errors:
None of them should cause missing metrics, only missing resource attributes. Do you have any other reported errors, by any chance?
Hi @dmitryax, you're right. I must have discarded a portion of the output while formatting the text. I have updated the description of the issue with the correct output this time. Sorry about that. I will update the issue title by adding … This is still reproducible on version …
I was thinking that maybe the error was prompted by
resource_to_telemetry_conversion:
  enabled: false
Setting that to … But otelcol-contrib is still logging this error
Even by having …
The logged error is expected because you run the collector in a container that has only one process, without any parent process or username. This is not a typical usage of this scraper; it is expected to run on a Linux host, collecting metrics from all other processes. What's your use case? Do you need to get CPU/memory metrics of the collector? In that case, you can use the hostmetrics memory and cpu scrapers, since your host = one collector process container. Also, the collector exposes its own metrics, which also include cpu/memory AFAIR, and that can be configured under …
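For illustration, the suggestion above (the cpu and memory scrapers in place of the process scraper) might look like the fragment below. This is a sketch of the receiver section only, not a configuration taken from this thread; the `collection_interval` value is an arbitrary assumption, and the exporters and pipelines are left out.

```yaml
receivers:
  hostmetrics:
    # Assumed interval; pick whatever suits your deployment.
    collection_interval: 30s
    scrapers:
      # Host-level scrapers. In a single-process container these
      # report on the container as a whole, with no per-process
      # lookups of parent pid or username.
      cpu:
      memory:
```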
Thank you for clarifying the purpose of this scraper.
Is there another scraper more suitable for a container environment that will expose metrics per process? We don't want Otelcol to log an error when it isn't really an issue. It could be a warning log.
If you want to scrape metrics about other processes on an underlying host, you can configure it to fetch host metrics from the otel container as described in https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/hostmetricsreceiver/README.md#collecting-host-metrics-from-inside-a-container-linux-only. Or you can use other k8s specific receivers e.g. https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/kubeletstatsreceiver
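As a rough sketch of the first option, the README section linked above describes a `root_path` setting for the hostmetrics receiver. The fragment below assumes the host's root filesystem is bind-mounted into the collector container at `/hostfs`; that mount point is a placeholder, and the setting may not exist in older releases, so check the README for the version you run.

```yaml
receivers:
  hostmetrics:
    # Assumption: the host's / is mounted at /hostfs inside the
    # collector container (for example via a read-only bind mount).
    root_path: /hostfs
    collection_interval: 30s
    scrapers:
      # With the host's /proc visible, the process scraper can see
      # the host's processes rather than only the collector itself.
      process:
```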
Hi, I'm more interested in gathering metrics per process inside a container that has multiple processes running (including otelcol), without getting this "error reading username / reading parent pid (pid1)" error, than in looking at host metrics. Is that feasible? Is there any other receiver that is more suitable for that? I appreciate your time on this. Thank you.
This is happening in 0.60.0, too. Nothing special with the config, and it is just the collector running as a native process. I get lots of "error reading process name for pid #: readlink /proc/#/exe: no such file or directory" (where # is a given low numbered PID).
Now, having added "mute_process_name_error: true" to the config, I have "... unknown userid xxxxxx" messages instead, where xxxxxx is a number.
Heh. I've traced this all the way through the dependency tree. The underlying "os/user" package is failing a UID lookup by returning an error if the user does not exist in the "/etc/passwd" file. Should all process scrapes fail because a UID can't be resolved? Raised as separate issue #17187
This issue has been closed as inactive because it has been stale for 120 days with no activity.
…mute `error reading username for process` (open-telemetry#28661) **Description:** add configuration option (`mute_process_user_error`) to mute `error reading username for process` **Link to tracking Issue:** * open-telemetry#14311 * open-telemetry#17187 Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
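Putting the two options mentioned in this thread together, a process scraper configuration might look like the sketch below. `mute_process_name_error` is the option already discussed above, and `mute_process_user_error` is the one added by the change referenced above, so it is only available in releases that include that change. The interval is an assumed value.

```yaml
receivers:
  hostmetrics:
    # Assumed interval; not taken from the issue.
    collection_interval: 30s
    scrapers:
      process:
        # Silences "error reading process name for pid ..." messages.
        mute_process_name_error: true
        # Silences "error reading username for process ..." messages,
        # e.g. when a UID has no entry in /etc/passwd.
        mute_process_user_error: true
```

Note that neither option is described in this thread as silencing the "error reading parent pid" message.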
What happened?
Description
After installing the latest otel-collector version (0.60.0), receiver/hostmetrics/process complains about "invalid pid 0" in a CNF
environment. My lack of Go knowledge does not allow me to determine whether the call "parentPid, err := parentPid(handle, pid)"
blocks the "per process" metrics exposure that the documentation describes, even when we set the parameter
"mute_process_name_error" to true.
Does receiver/hostmetrics/process really expose metrics "per process"? I can only see …
Steps to Reproduce
Expected Result
No Error output
Actual Result
Collector version
v0.60.0
OpenTelemetry Collector configuration
Log output
Additional context
No response