3.1.3.2 Publishing system and cluster metrics using Netdata
This method of acquiring system and K8S metric values involves deploying one Netdata agent on every K8S cluster node. Netdata is open source software that collects metrics, displays them as charts, and also exposes them through a REST API. The default Nebulous application deployment scenario installs Netdata agents along with EMS on application clusters. EPAs periodically contact the REST API server of each Netdata agent and scrape the required metrics. To enable EPAs to scrape the Netdata agents, the application metric model must provide the needed configuration. For each raw metric that obtains its values using this method, it is necessary to define a sensor of “netdata” type and provide the corresponding configuration (including the scraping period).
In order to define a raw metric that takes its values from Netdata agents, the `netdata` type must be entered in the Sensor field in the Nebulous GUI.
This will instruct the EPA to use its K8S Netdata collector plugin for retrieving the values. Under the hood, the K8S Netdata collector plugin will build a URL of the form `http://<NODE_IP_ADDRESS>:<PORT>/<PATH>?<QUERY_PARAMS>&format=ssv` and attempt to retrieve the relevant JSON response. It will then extract the value(s) of the metric of interest (see below) and publish them as the raw metric's value(s) in the EPA broker. If needed, it will also aggregate multiple values into a single one.
In order to build the URL, the collector plugin will use the provided configuration settings, or the corresponding defaults.
If the metric of interest is a K8S metric (its name starts with `k8s.`), the collector plugin can take the pod name and namespace into consideration.
The metric(s) of interest must be provided in the configuration.
The configuration comprises a few settings used to guide the collector plugin, while the remaining ones are used to build the `QUERY_PARAMS` list of the URL.
The plugin-specific configuration settings, along with their respective defaults, are:
Plugin Setting | Type | Default value | Comments |
---|---|---|---|
endpoint | String | `/api/v2/data` | The `<PATH>` part of the URL. Only the v2 version has been tested. |
port | Port | `19999` | The `<PORT>` part of the URL. Allowed values: `1..65535`. |
components | String | component name | In case of a K8S metric of interest, specifies which pod(s) to pick. If left empty, all pods in the namespace are picked. If omitted, the name(s) of the component(s) the raw metric applies to are used. |
namespace | String | `default` | In case of a K8S metric of interest, specifies the pod namespace to use. If left empty, pods from all namespaces are picked. If omitted, it defaults to the `default` namespace. |
results-aggregation | Enum | no default | Allowed values: `SUM`, `AVERAGE`, `COUNT`, `MIN`, `MAX`, `NONE`. If omitted or set to `NONE`, an individual event will be published for each metric value. |
intervalPeriod | Positive Integer | `60` | How often to scrape the Netdata endpoint. |
intervalUnit | Enum | `SECONDS` | The time unit of `intervalPeriod`. Allowed values: `SECONDS`, `MINUTES`, `HOURS`, `DAYS`. |
If `intervalPeriod` is omitted, the scraping period is taken from the metric's Output interval and unit fields. If they are not specified either, it is assumed to be 60 seconds.
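The fallback order described above can be sketched as follows. This is only an illustration, not the plugin's actual code: the function name, the `output_interval` and `output_unit` parameters, and the assumption that the metric's output unit uses the same names as `intervalUnit` (with seconds assumed when no unit is given) are all hypothetical.

```python
from typing import Optional

# Seconds per unit, for the values allowed in intervalUnit.
UNIT_SECONDS = {"SECONDS": 1, "MINUTES": 60, "HOURS": 3600, "DAYS": 86400}


def scrape_period_seconds(config: dict,
                          output_interval: Optional[int] = None,
                          output_unit: Optional[str] = None) -> int:
    """Resolve the scraping period in seconds (illustrative sketch)."""
    period = config.get("intervalPeriod")
    if period is not None:
        # 1st choice: the sensor configuration itself.
        unit = config.get("intervalUnit", "SECONDS")
        return int(period) * UNIT_SECONDS[unit.upper()]
    if output_interval is not None:
        # 2nd choice: the metric's Output interval and unit fields
        # (assumption: same unit names; seconds when the unit is missing).
        unit = output_unit or "SECONDS"
        return int(output_interval) * UNIT_SECONDS[unit.upper()]
    # Last resort: 60 seconds.
    return 60


print(scrape_period_seconds({"intervalPeriod": 2, "intervalUnit": "MINUTES"}))
```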
The settings used to build the Netdata URL, along with their respective defaults, are:
Netdata Setting | Type | Default value | Comments |
---|---|---|---|
scope_contexts | String | no default | REQUIRED: The metric(s) of interest to extract. Can be a comma-separated list. |
context | String | no default | Can be used instead of `scope_contexts`. Check the Netdata documentation for details. |
dimension | String | `*` | The `scope_contexts` dimensions to use. |
after | Long | `-1` | Selects the measurements of the last second. |
time_group | Enum | `average` | Defines the method used to group multiple measurements. |
The settings in the table above are always added to the URL (using either the provided value or the default). Any additional settings (not listed in the tables above) will also be included in the query parameter list.
For a complete list of the supported query parameters, and their semantics, please consult the official Netdata documentation on the topic.
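To make the URL construction more concrete, here is a minimal Python sketch that follows the rules stated in this section: plugin settings supply the port and path, the Netdata settings (with their defaults) and any extra settings become query parameters, and `format=ssv` is added. The function name `build_netdata_url` and the way settings are passed (a plain dictionary) are assumptions made for illustration; the actual plugin may differ.

```python
from urllib.parse import urlencode

# Settings consumed by the plugin itself; they never reach the query string.
PLUGIN_KEYS = {"endpoint", "port", "components", "namespace",
               "results-aggregation", "intervalPeriod", "intervalUnit"}

# Netdata settings that are always sent, with the defaults from the table.
NETDATA_DEFAULTS = {"dimension": "*", "after": "-1", "time_group": "average"}


def build_netdata_url(node_ip: str, config: dict) -> str:
    """Build the scrape URL for one node (illustrative sketch)."""
    if "scope_contexts" not in config and "context" not in config:
        raise ValueError("scope_contexts (or context) must be provided")

    port = int(config.get("port", 19999))
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")

    # Normalise the path so that exactly one '/' follows the port.
    path = "/" + str(config.get("endpoint", "/api/v2/data")).lstrip("/")

    # Defaults first, then every non-plugin setting (provided values win).
    params = dict(NETDATA_DEFAULTS)
    params.update({k: str(v) for k, v in config.items()
                   if k not in PLUGIN_KEYS})
    params["format"] = "ssv"

    # Keep '*' and ',' readable instead of percent-encoding them.
    return f"http://{node_ip}:{port}{path}?{urlencode(params, safe='*,')}"


print(build_netdata_url("10.0.0.5", {"scope_contexts": "k8s.cgroup.cpu"}))
```

The IP address `10.0.0.5` is a placeholder; in practice the node address would come from the cluster.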
Netdata metrics pertaining to Kubernetes (pods, nodes, etc.) are named using the `k8s.` prefix, for instance `k8s.cgroup.cpu`.
When a metric can be measured per pod, there will be a measurement (value) for each pod in every Netdata response (e.g. CPU consumed by each pod).
The K8S Netdata collector plugin will filter these values in order to retain only those complying with the raw metric specification.
NOTE:
The K8S Netdata collector plugin will attempt to extract metric values from the response sections under `view.dimensions.ids` and `view.dimensions.values`. If the metric of interest is a K8S metric, the ids represent different pods (running in various namespaces). The plugin will filter ids (i.e. pods) based on the provided component name(s) and namespace.

IMPORTANT:
Pod names may differ from component names (as they appear in the metric model). For instance, Helm prepends the deployment name to each component name to generate pod names. In this case it is essential to set the `components` setting to a value that includes the Helm deployment prefix (or any other deviation).
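As a rough illustration of the extraction step, the sketch below pairs the entries of `view.dimensions.ids` with those of `view.dimensions.values`. It assumes that the two arrays are parallel (the i-th value belongs to the i-th id), which is not stated explicitly here; the function name and the sample response are hypothetical.

```python
def extract_measurements(response: dict) -> dict:
    """Map each dimension id to its value (illustrative sketch)."""
    dims = response.get("view", {}).get("dimensions", {})
    ids = dims.get("ids", [])
    values = dims.get("values", [])
    # Assumption: ids and values are parallel arrays of equal length.
    if len(ids) != len(values):
        raise ValueError("ids and values differ in length")
    return dict(zip(ids, values))


# Hypothetical, heavily trimmed response used only to exercise the function.
sample = {"view": {"dimensions": {"ids": ["pod-a", "pod-b"],
                                  "values": [0.25, 1.5]}}}
print(extract_measurements(sample))
```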
The following table details the outcome of each possible combination of the `components` and `namespace` configuration settings.
For each pod selected, the corresponding measurement (metric value) will be kept for further processing. The rest will be filtered out.
components | namespace | Pods selected |
---|---|---|
provided | provided | Pods with a name included in the `components` list, and running in the namespace specified in the `namespace` setting |
blank | provided | All pods running in the namespace specified in the `namespace` setting |
omitted | provided | Pods named exactly as one of the metric model components the raw metric applies to, and running in the namespace specified in the `namespace` setting |
provided | blank | Pods with a name included in the `components` list, running in any namespace |
blank | blank | All pods, running in any namespace |
omitted | blank | Pods named exactly as one of the metric model components the raw metric applies to, and running in any namespace |
provided | omitted | Pods with a name included in the `components` list, running in the namespace named `default` |
blank | omitted | All pods, running in the namespace named `default` |
omitted | omitted | Pods named exactly as one of the metric model components the raw metric applies to, and running in the namespace named `default` |
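The nine combinations in the table reduce to two independent filters, one on the pod name and one on the namespace. The following Python sketch illustrates this. It is not the plugin's code: the `Pod` type, the function name, the use of a missing dictionary key for "omitted" and an empty string for "blank", and the comma as list separator for `components` are all assumptions.

```python
from typing import Iterable, List, NamedTuple


class Pod(NamedTuple):
    name: str
    namespace: str


def select_pods(pods: Iterable[Pod], config: dict,
                model_components: Iterable[str]) -> List[Pod]:
    """Apply the components/namespace rules (illustrative sketch)."""
    if "components" not in config:
        # Omitted: fall back to the metric model component names.
        names = set(model_components)
    else:
        raw = config["components"].strip()
        # Blank: None means "any pod name".
        names = {c.strip() for c in raw.split(",") if c.strip()} if raw else None

    if "namespace" not in config:
        # Omitted: the namespace called "default".
        namespace = "default"
    else:
        # Blank: None means "any namespace".
        namespace = config["namespace"].strip() or None

    return [p for p in pods
            if (names is None or p.name in names)
            and (namespace is None or p.namespace == namespace)]


pods = [Pod("web", "default"), Pod("db", "default"),
        Pod("web", "prod"), Pod("cache", "prod")]
# Both settings omitted: model component "web" in namespace "default".
print(select_pods(pods, {}, ["web"]))
```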
The retained values (each one pertaining to one pod) can either:
- be aggregated according to the `results-aggregation` value, and then published as a raw metric value, or
- be published immediately as a raw metric value.
In the former case the event conveying the raw metric value will also have the property `destination-key` with the respective pod name and namespace.
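The two publishing paths can be illustrated with a short Python sketch that turns the retained per-pod values into the list of values to publish. The function name is hypothetical, and returning nothing when no values were retained is an assumption rather than documented behaviour.

```python
from statistics import mean
from typing import List, Optional

# One reducer per allowed aggregation value (NONE is handled separately).
AGGREGATORS = {"SUM": sum, "AVERAGE": mean, "COUNT": len,
               "MIN": min, "MAX": max}


def values_to_publish(values: List[float],
                      mode: Optional[str]) -> List[float]:
    """Return the value(s) to publish for one scrape (illustrative sketch)."""
    if not values:
        # Assumption: nothing is published when no pod matched.
        return []
    if mode is None or mode.upper() == "NONE":
        # No aggregation: one event per retained value.
        return list(values)
    # Aggregation: a single event carrying the reduced value.
    return [AGGREGATORS[mode.upper()](values)]


print(values_to_publish([1.0, 2.0, 3.0], "AVERAGE"))
print(values_to_publish([1.0, 2.0, 3.0], None))
```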
Sample list of Netdata metrics. It may vary between devices based on the hardware, architecture, OS, but also the installed software.
..........TBD
[1] Netdata site
[2] Netdata Queries/Lookup
Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the Directorate-General for Communications Networks, Content and Technology. Neither the European Union nor the granting authority can be held responsible for them.
© 2024 NEBULOUS. ALL RIGHTS RESERVED