Status | |
---|---|
Stability | development: logs, beta: metrics |
Distributions | core, contrib |
Issues | |
Code Owners | @dmitryax, @braydonk |
The Host Metrics receiver generates metrics about the host system, scraped from various sources, and emits host entity events as logs. It is intended to be used when the collector is deployed as an agent.
The collection interval, root path, and the categories of metrics to be scraped can be configured:
hostmetrics:
  collection_interval: <duration> # default = 1m
  initial_delay: <duration> # default = 1s
  root_path: <string>
  scrapers:
    <scraper1>:
    <scraper2>:
    ...
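As an illustration, a filled-in version of the template above might look like the following. The interval values and the choice of scrapers are arbitrary picks for the example, not recommendations.

```yaml
receivers:
  hostmetrics:
    collection_interval: 30s  # scrape every 30 seconds instead of the 1m default
    initial_delay: 5s         # wait 5 seconds before the first scrape (default is 1s)
    scrapers:
      # Scrapers with no extra settings are enabled by listing them with an empty body.
      cpu:
      memory:
      load:
```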
The available scrapers are:
Scraper | Supported OSs | Description |
---|---|---|
cpu | All except Mac[1] | CPU utilization metrics |
disk | All except Mac[1] | Disk I/O metrics |
load | All | CPU load metrics |
filesystem | All | File System utilization metrics |
memory | All | Memory utilization metrics |
network | All | Network interface I/O metrics & TCP connection metrics |
paging | All | Paging/Swap space utilization and I/O metrics |
processes | Linux, Mac | Process count metrics |
process | Linux, Windows, Mac | Per process CPU, Memory, and Disk I/O metrics |
[1] Not supported on Mac when compiled without cgo, which is the default.
Several scrapers support additional configuration:
disk:
  <include|exclude>:
    devices: [ <device name>, ... ]
    match_type: <strict|regexp>
filesystem:
  <include_devices|exclude_devices>:
    devices: [ <device name>, ... ]
    match_type: <strict|regexp>
  <include_fs_types|exclude_fs_types>:
    fs_types: [ <filesystem type>, ... ]
    match_type: <strict|regexp>
  <include_mount_points|exclude_mount_points>:
    mount_points: [ <mount point>, ... ]
    match_type: <strict|regexp>
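A sketch of how the disk and filesystem options above could be combined. The device names, filesystem types, and mount point pattern are hypothetical values chosen for illustration; substitute the ones present on your hosts.

```yaml
receivers:
  hostmetrics:
    scrapers:
      disk:
        include:
          devices: ["sda", "sdb"]   # hypothetical device names, matched exactly
          match_type: strict
      filesystem:
        exclude_fs_types:
          fs_types: ["tmpfs", "overlay"]   # hypothetical: skip in-memory and overlay filesystems
          match_type: strict
        exclude_mount_points:
          mount_points: ["^/var/lib/docker/.*"]   # hypothetical pattern, treated as a regular expression
          match_type: regexp
```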
cpu_average specifies whether to divide the average load by the reported number of logical CPUs (default: false).
load:
  cpu_average: <false|true>
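For instance, enabling the option could look like the sketch below. The numbers in the comment are an illustrative calculation based on the description above, not measured output.

```yaml
receivers:
  hostmetrics:
    scrapers:
      load:
        # With cpu_average enabled, each load average is divided by the
        # reported number of logical CPUs. For example, a load average of 4.0
        # on a host reporting 8 logical CPUs would be emitted as 4.0 / 8 = 0.5.
        cpu_average: true
```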
network:
  <include|exclude>:
    interfaces: [ <interface name>, ... ]
    match_type: <strict|regexp>
process:
  <include|exclude>:
    names: [ <process name>, ... ]
    match_type: <strict|regexp>
  mute_process_name_error: <true|false>
  mute_process_exe_error: <true|false>
  mute_process_io_error: <true|false>
  mute_process_user_error: <true|false>
  mute_process_cgroup_error: <true|false>
  scrape_process_delay: <time>
The following settings are optional:
- mute_process_name_error (default: false): mute the error encountered when trying to read a process name the collector does not have permission to read
- mute_process_io_error (default: false): mute the error encountered when trying to read IO metrics of a process the collector does not have permission to read
- mute_process_cgroup_error (default: false): mute the error encountered when trying to read the cgroup of a process the collector does not have permission to read
- mute_process_exe_error (default: false): mute the error encountered when trying to read the executable path of a process the collector does not have permission to read (Linux only)
- mute_process_user_error (default: false): mute the error encountered when trying to read a uid which doesn't exist on the system, e.g. when the process is owned by a user that only exists in a container
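A minimal sketch of a process scraper that restricts collection to a few processes and mutes permission-related errors. The process names are hypothetical, and whether muting is appropriate depends on how much visibility you want into permission problems.

```yaml
receivers:
  hostmetrics:
    scrapers:
      process:
        include:
          names: ["^nginx$", "^postgres.*"]   # hypothetical process name patterns
          match_type: regexp
        # Suppress errors for processes the collector is not permitted to inspect.
        mute_process_name_error: true
        mute_process_exe_error: true
        mute_process_io_error: true
        # Suppress errors for uids that do not exist on the host.
        mute_process_user_error: true
```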
If you are only interested in a subset of metrics from a particular source, it is recommended you use this receiver with the Filter Processor.
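As a rough sketch, dropping a single metric with the Filter Processor might look like the following. It assumes the processor's OTTL-based `metrics.metric` conditions and uses `system.paging.faults` as the metric to drop; the `filter/hostmetrics` name and the `debug` exporter are placeholders. Check the Filter Processor's own documentation for the syntax supported by your collector version.

```yaml
receivers:
  hostmetrics:
    scrapers:
      paging:

processors:
  filter/hostmetrics:
    error_mode: ignore
    metrics:
      metric:
        # Metrics matching a condition are dropped; everything else passes through.
        - 'name == "system.paging.faults"'

exporters:
  debug:   # placeholder exporter so the pipeline is complete

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [filter/hostmetrics]
      exporters: [debug]
```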
If you would like to scrape some metrics at a different frequency than others, you can configure multiple hostmetrics receivers with different collection_interval values. For example:
receivers:
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu:
      memory:
  hostmetrics/disk:
    collection_interval: 1m
    scrapers:
      disk:
      filesystem:
service:
  pipelines:
    metrics:
      receivers: [hostmetrics, hostmetrics/disk]
Host metrics are collected from the Linux system directories on the filesystem. When the collector runs inside a container, you likely want to collect metrics about the host system and not the container. This is achievable by following these steps:
The simplest configuration is to mount the entire host filesystem when running the container, e.g. docker run -v /:/hostfs ...
You can also choose which parts of the host filesystem to mount, if you know exactly what you'll need, e.g. docker run -v /proc:/hostfs/proc
Configure root_path so the hostmetrics receiver knows where the root filesystem is.
Note: if running multiple instances of the host metrics receiver, they must all have the same root_path.
Example:
receivers:
  hostmetrics:
    root_path: /hostfs
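If you start the container with Docker Compose rather than docker run, the same bind mount could be expressed roughly as below. This is a sketch under assumptions: the service name is hypothetical, the image is assumed to be the contrib distribution, the read-only flag is a choice made for this example, and mounting the collector configuration file is omitted.

```yaml
services:
  otel-collector:                                # hypothetical service name
    image: otel/opentelemetry-collector-contrib  # assumed image; pin a version in practice
    volumes:
      # Equivalent of `docker run -v /:/hostfs`, mounted read-only here.
      - /:/hostfs:ro
```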
Currently, the hostmetrics receiver does not set any Resource attributes on the exported metrics. However, if you want to set Resource attributes, you can provide them through environment variables using the resourcedetection processor. For example, you can add the following resource attributes to adhere to Resource Semantic Conventions:
export OTEL_RESOURCE_ATTRIBUTES="service.name=<the name of your service>,service.namespace=<the namespace of your service>,service.instance.id=<uuid of the instance>"
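A sketch of the matching collector configuration, assuming the resourcedetection processor's `env` detector, which reads OTEL_RESOURCE_ATTRIBUTES. The timeout and override values and the `debug` exporter are placeholders for illustration.

```yaml
receivers:
  hostmetrics:
    scrapers:
      cpu:

processors:
  resourcedetection:
    detectors: [env]   # pick up attributes from OTEL_RESOURCE_ATTRIBUTES
    timeout: 2s
    override: false    # keep any attribute values that are already set

exporters:
  debug:   # placeholder exporter so the pipeline is complete

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [resourcedetection]
      exporters: [debug]
```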
Entity Events as logs are experimental and might eventually be replaced by the result of the OTEP. For now, the hostmetrics receiver can send the host entity event as log records. By default, the hostmetrics receiver sends periodic EntityState events every 5 minutes. You can change that by setting metadata_collection_interval.
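A minimal sketch of changing the event interval and routing the entity events through a logs pipeline. The 10m value and the `debug` exporter are placeholders chosen for the example.

```yaml
receivers:
  hostmetrics:
    metadata_collection_interval: 10m   # send EntityState events every 10 minutes instead of 5
    scrapers:
      cpu:

exporters:
  debug:   # placeholder exporter so the pipelines are complete

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      exporters: [debug]
    logs:
      # Entity events are emitted as log records, so the receiver is also added here.
      receivers: [hostmetrics]
      exporters: [debug]
```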
See the Collector feature gates for an overview of feature gates in the collector.
When enabled, normalizes the process.cpu.utilization metric onto the interval [0, 1] by dividing the value by the number of logical processors. With this feature gate disabled, the value of the process.cpu.utilization metric may exceed 1.
For example, if you have 4 logical cores on your system, and a process is occupying 2 logical cores for an entire scrape interval, with this feature gate disabled a process.cpu.utilization metric will be emitted with a value of 2. If this feature gate is enabled in the same scenario, the value of the emitted metric will be 0.5.
The schedule for this feature gate is:
- Introduced in v0.97.0 (March 2024) as alpha - disabled by default.
- Moved to beta in v0.100.0 (May 2024) - enabled by default.
- Moved to stable in v0.102.0 (June 2024) - cannot be disabled.
- Removed three releases after stable.