process.uptime otel config feature failing on linux machine #31107

Closed
krantishetty opened this issue Feb 7, 2024 · 19 comments

Comments

@krantishetty

Component(s)

processor/metricsgeneration

Describe the issue you're reporting

We tried to enable the uptime metric under the process scraper, but it is failing. Could you please advise?

My config file:
process:
  metrics:
    process.uptime:
      enabled: true

krantishetty added the needs triage label Feb 7, 2024
github-actions bot added the processor/metricsgeneration label Feb 7, 2024
Contributor

github-actions bot commented Feb 7, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

crobert-1 added the question label Feb 7, 2024
@krantishetty
Author

process:
  metrics:
    process.uptime:
      enabled: true

Error details:

'metrics' has invalid keys: process.uptime. collector server run finished with error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

@krantishetty
Author

[Screenshot: process-uptime]

@krantishetty
Author

Can you help with this?

@krantishetty
Author

My goal is to add process attributes to the OTel configuration on Linux so that I can find the processes with the highest resource usage, along with their PIDs. How can I configure this using attributes? Please advise.

@crobert-1
Member

Hello @krantishetty, can you share the full contents of your configuration file? Also, can you share the full error output?

A good place to start may be to use the hostmetrics receiver, specifically the process source.
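
As a rough illustration of that suggestion, a minimal Collector configuration using the hostmetrics receiver with the process scraper might look like the sketch below. The 10s interval and the logging exporter are assumptions chosen for illustration (both also appear in the configuration posted later in this thread), and the two metrics shown are optional ones that need to be enabled explicitly.

```yaml
receivers:
  hostmetrics:
    collection_interval: 10s   # assumed interval, adjust as needed
    scrapers:
      process:
        metrics:
          # Optional process metrics; each "enabled" key is nested
          # one level under the metric name.
          process.cpu.utilization:
            enabled: true
          process.memory.utilization:
            enabled: true

exporters:
  logging:                     # assumed exporter, used only to print metrics locally

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      exporters: [logging]
```

Starting from a small pipeline like this and adding processors and exporters one at a time tends to make configuration errors easier to locate.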

crobert-1 added the receiver/hostmetrics label and removed the processor/metricsgeneration label Feb 9, 2024
Contributor

github-actions bot commented Feb 9, 2024

Pinging code owners for receiver/hostmetrics: @dmitryax @braydonk. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@krantishetty
Author

receivers:
  hostmetrics:
    collection_interval: 10s
    scrapers:
      cpu:
        metrics:
          system.cpu.utilization:
            enabled: true
          system.cpu.time:
            enabled: true
          system.cpu.physical.count:
            enabled: true
          system.cpu.logical.count:
            enabled: true

      process:
        metrics:
          process.cpu.utilization:
          enabled: true
          process.cpu.time:
          enabled: true
          process.disk.io:
          enabled: true
          process.memory.utilization:
          enabled: true
          process.disk.operations:
          enabled: true
          process.uptime:
          enabled: true
          process.pid:
          enabled: true

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 10000
    spike_limit_mib: 200

  resourcedetection:
    detectors: [env, system]
    system:
      hostname_sources: ["os"]
      resource_attributes:
        host.id:
          enabled: true
        host.name:
          enabled: true

exporter:
  otlphttp:
    endpoint: https://xxxxxxx/
    headers:
      Authorization: "xxxxxxxx"
  logging:
    loglevel: debug

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [memory_limiter, resourcedetection]
      exporters: [logging, otlhttp]
Error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

  • error decoding 'receivers': error reading configuration for "hostmetrics": error reading settings for scraper type "process": 1 error(s) decoding:

  • 'metrics' has invalid keys: process.pid, process.uptime

The OTel Collector version is 0.88.

@krantishetty
Author

[Screenshots: otel-1, otel-2]

@krantishetty
Author

Please advise

@krantishetty
Author

Hi @crobert-1, please advise on my query.

@krantishetty
Author

Hi @crobert-1, can you please advise?

@krantishetty
Author

@tmc @indrekj @dazuma, can someone advise on my query?

@braydonk
Contributor

braydonk commented Feb 19, 2024

It is not good etiquette to tag people who have nothing to do with the issue. I would advise against doing that in the future.

process.uptime is not a supported metric. You can see the full list at: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/hostmetricsreceiver/internal/scraper/processscraper/metadata.yaml

@krantishetty
Author

Apologies. We have barely received any response, and we are looking for help getting system-level process details and the process owner when high CPU or memory usage is triggered, so that we can identify the exact process causing the problem.

@braydonk
Contributor

braydonk commented Feb 19, 2024

A couple things to note.

Firstly, GitHub Issues are intended for reporting problems with reproduction steps, not for getting support on usage. I would recommend instead joining the CNCF Slack and asking these sorts of questions in the #otel-collector channel. https://communityinviter.com/apps/cloud-native/cncf

Secondly, when posting configuration files it is strongly recommended that you use GitHub's Markdown support to post the yaml in code blocks within your comments. yaml is indentation dependent, so it is important that the indentation is maintained. You can follow those instructions here: https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/creating-and-highlighting-code-blocks

I will guide you to getting a working config, after which point I think this issue can be considered closed (I don't have the ability to actually close it but someone who is able to may do so).


There are a few errors with the configuration you posted in the screenshot:

  • process.uptime is not a real supported metric.
  • process.pid is a Resource Attribute, not a metric. It goes under a resource_attributes setting, not metrics.
  • The indentation in the metrics section is off. enabled should be nested under the metric name.

So the process config you want is:

process:
  metrics:
    process.cpu.utilization:
      enabled: true
    process.cpu.time:
      enabled: true
    process.disk.io:
      enabled: true
    process.memory.utilization:
      enabled: true
    process.disk.operations:
      enabled: true
  resource_attributes:
    process.pid:
      enabled: true

A few things you have enabled are already enabled by default and can be removed, so you can simplify it further:

process:
  metrics:
    process.cpu.utilization:
      enabled: true
    process.memory.utilization:
      enabled: true
    process.disk.operations:
      enabled: true

With this process configuration you should have all the metrics you are looking for.
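
Beyond the process scraper section, the text configuration posted earlier in this thread appears to contain two further typos that would likely stop the Collector from starting: the top-level key is written `exporter` rather than `exporters`, and the pipeline lists `otlhttp` where the exporter is defined as `otlphttp`. The fragment below is a sketch of only those two sections with the spelling corrected. The endpoint and header values are the placeholders from the original post, and the receivers and processors sections are assumed to stay as posted (with the process changes above).

```yaml
# Fragment only: the receivers and processors sections are omitted.
exporters:                       # posted config has "exporter:" (singular)
  otlphttp:
    endpoint: https://xxxxxxx/   # placeholder from the original post
    headers:
      Authorization: "xxxxxxxx"  # placeholder from the original post
  logging:
    loglevel: debug              # kept as posted

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [memory_limiter, resourcedetection]
      # Every name here must match an exporter defined above;
      # the posted config has "otlhttp".
      exporters: [logging, otlphttp]
```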

@krantishetty
Author

Thank you very much, @braydonk. I will test it with the attributes config and share the results.

crobert-1 removed the needs triage label Mar 5, 2024
Contributor

github-actions bot commented May 6, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label May 6, 2024
Contributor

github-actions bot commented Jul 5, 2024

This issue has been closed as inactive because it has been stale for 120 days with no activity.

github-actions bot closed this as not planned Jul 5, 2024