
[k8sclusterreceiver] add k8s.node.info metric #24835

Closed
povilasv opened this issue Aug 3, 2023 · 7 comments
Labels: enhancement (New feature or request)

Comments

@povilasv (Contributor) commented Aug 3, 2023

Component(s)

receiver/k8scluster

Is your feature request related to a problem? Please describe.

I would like a metric similar to kube-state-metrics' kube_node_info, see https://github.com/kubernetes/kube-state-metrics/blob/main/internal/store/node.go#L110-L136

Example from kube-state-metrics:

kube_node_info{container_runtime_version="containerd://1.6.9", kernel_version="6.4.6-arch1-1", kubelet_version="v1.25.3", kubeproxy_version="v1.25.3", node="kind-control-plane", os_image="Ubuntu 22.04.1 LTS", os_type="linux", pod_cidr="10.244.0.0/24", provider_id="kind://docker/kind/kind-control-plane", system_uuid="21db1dab-995a-4e51-8315-560f24222dd4"}

This helps me correlate issues with nodes (kernel version, CRI version, etc.).

Describe the solution you'd like

Add a gauge k8s.node.info metric whose value is always 1.
I would add this information as resource attributes (a sketch of how the receiver might define them follows the list):

  • container.runtime.version
  • kubelet.version
  • kubeproxy.version
  • os.name
  • os.type
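A rough sketch of how some of these could be declared, assuming the receiver keeps using its mdatagen metadata.yaml conventions; the attribute names, descriptions, and enabled defaults below are illustrative assumptions, not a final design:

```yaml
# Hypothetical metadata.yaml sketch for the proposed metric and attributes
# (only a subset of the attributes shown); everything here is illustrative.
resource_attributes:
  container.runtime.version:
    description: Container runtime version reported by the node.
    type: string
    enabled: false
  kubelet.version:
    description: Kubelet version reported by the node.
    type: string
    enabled: false
  kubeproxy.version:
    description: Kube-proxy version reported by the node.
    type: string
    enabled: false

metrics:
  k8s.node.info:
    enabled: false
    description: Node information. The value of the gauge is always 1.
    unit: "1"
    gauge:
      value_type: int
```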

Describe alternatives you've considered

Additional context

No response

povilasv added the enhancement (New feature or request) and needs triage (New item requiring triage) labels on Aug 3, 2023
@github-actions bot (Contributor) commented Aug 3, 2023

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@dmitryax (Member)

I don't believe the OTel spec has any recommendation for carrying Prometheus info metrics over as OTel metrics. I believe this information should be added as optional resource attributes on existing metrics.

@jmichalek132 (Contributor) commented Aug 25, 2023

@dmitryax that would be a non-ideal implementation for backends such as Prometheus, where storing these attributes on a separate metric reduces the size of the index while still allowing joins with the other metrics.

@dmitryax (Member)

Accommodating all possible metric backend peculiarities is not OTel's goal. These info metrics exist in the Prometheus ecosystem because there is no other way to do it in Prometheus.

This particular use case can be resolved by a collector processor or another Prometheus exporter capability that provides an option to move resource attributes into a separate fake metric instead of emitting them as native OTel metrics.

@dmitryax (Member)

Apparently the prometheus exporter already exposes resource attributes in the target_info metric by default: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/prometheusexporter#setting-resource-attributes-as-metric-labels. If that's not enough, the configuration can be made more flexible to allow building metrics like this.

@povilasv (Contributor, Author) commented Sep 1, 2023

I've played around with adding a couple of resource attributes to node metrics and exporting them through the prometheus exporter: #26351

With resource_to_telemetry_conversion enabled, the resource attributes appear as labels on node-level metrics:

      prometheus:
        endpoint: "0.0.0.0:9090"
        resource_to_telemetry_conversion:
          enabled: true

Example metric:

k8s_node_allocatable_cpu{k8s_kubelet_version="v1.25.3",k8s_kubeproxy_version="v1.25.3",k8s_node_name="kind-control-plane",k8s_node_uid="09b55a47-87cb-4790-8834-2341d683999d",opencensus_resourcetype="k8s"} 8

While target_info looks like this:

target_info{http_scheme="http",instance="10.244.0.11:8888",job="opentelemetry-collector",net_host_name="10.244.0.11",net_host_port="8888"} 1

Without resource_to_telemetry_conversion, all of the resource attributes get lost:

The metrics look like this:

k8s_node_condition_ready 1

target_info looks like this:

target_info{http_scheme="http",instance="10.244.0.16:8888",job="opentelemetry-collector",net_host_name="10.244.0.16",net_host_port="8888"} 1

I think we need to figure out what to do about this in the prometheus exporter or the k8s cluster receiver. It would be great to get a target_info for each node, pod, etc. with its resource attributes.

@povilasv (Contributor, Author) commented Sep 4, 2023

I played around with creating a k8s.node.info metric using the transform processors, and it is possible to do it now, without any collector changes:

      metricstransform/nodeinfo:
        transforms:
          - include: k8s.node.condition_ready
            # match_type specifies whether the include name should be used as a strict match or regexp match, default = strict
            match_type: strict
            action: insert
            new_name: k8s.node.info

      transform/metrics:
        error_mode: ignore
        metric_statements:
          - context: datapoint
            statements:
              - set(value_int, 1) where metric.name == "k8s.node.info"
              - set(attributes["k8s.kubeproxy.version"], resource.attributes["k8s.kubeproxy.version"]) where metric.name == "k8s.node.info"
              - set(attributes["k8s.kubelet.version"], resource.attributes["k8s.kubelet.version"]) where metric.name == "k8s.node.info"
          - context: metric
            statements:
              - delete_key(resource.attributes, "k8s.kubelet.version")
              - delete_key(resource.attributes, "k8s.kubeproxy.version")

Gives you these metrics:

# TYPE k8s_node_allocatable_cpu gauge
k8s_node_allocatable_cpu{k8s_node_name="kind-control-plane",k8s_node_uid="84095e3b-ebed-408d-a651-a4f469f34d06",opencensus_resourcetype="k8s"} 8
# HELP k8s_node_allocatable_memory Amount of memory allocatable on the node
# TYPE k8s_node_allocatable_memory gauge
k8s_node_allocatable_memory{k8s_node_name="kind-control-plane",k8s_node_uid="84095e3b-ebed-408d-a651-a4f469f34d06",opencensus_resourcetype="k8s"} 1.6405581824e+10
# HELP k8s_node_condition_memory_pressure MemoryPressure condition status of the node (true=1, false=0, unknown=-1)
# TYPE k8s_node_condition_memory_pressure gauge
k8s_node_condition_memory_pressure{k8s_node_name="kind-control-plane",k8s_node_uid="84095e3b-ebed-408d-a651-a4f469f34d06",opencensus_resourcetype="k8s"} 0
# HELP k8s_node_condition_ready Ready condition status of the node (true=1, false=0, unknown=-1)
# TYPE k8s_node_condition_ready gauge
k8s_node_condition_ready{k8s_node_name="kind-control-plane",k8s_node_uid="84095e3b-ebed-408d-a651-a4f469f34d06",opencensus_resourcetype="k8s"} 1
# HELP k8s_node_info Ready condition status of the node (true=1, false=0, unknown=-1)
# TYPE k8s_node_info gauge
k8s_node_info{k8s_kubelet_version="v1.25.3",k8s_kubeproxy_version="v1.25.3",k8s_node_name="kind-control-plane",k8s_node_uid="84095e3b-ebed-408d-a651-a4f469f34d06",opencensus_resourcetype="k8s"} 1

Only the new k8s.node.info metric will have the version labels in Prometheus.
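For reference, a minimal sketch of how the two processors above could be wired into a metrics pipeline; the service section was not part of the comment, so the receiver/exporter names and processor order here are assumptions (metricstransform/nodeinfo has to run before transform/metrics so that the copied k8s.node.info metric exists when its datapoints are rewritten):

```yaml
# Hypothetical pipeline wiring for the snippets above; component names are assumed.
service:
  pipelines:
    metrics:
      receivers: [k8s_cluster]
      processors: [metricstransform/nodeinfo, transform/metrics]
      exporters: [prometheus]
```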

TylerHelmuth pushed a commit that referenced this issue Sep 11, 2023
**Description:**

Add optional k8s.kubelet.version, k8s.kubeproxy.version node resource
attributes

I did some manual testing with kind, using the k8s_cluster receiver and the prometheus exporter:

```yaml
k8s_cluster:
  node_conditions_to_report: [Ready, MemoryPressure]
  allocatable_types_to_report: [cpu, memory]
  resource_attributes:
    k8s.kubelet.version:
      enabled: true
    k8s.kubeproxy.version:
      enabled: true
```
and prometheus exporter:
```yaml
prometheus:
  resource_to_telemetry_conversion:
    enabled: true
  endpoint: 0.0.0.0:9090
```

Example metric:

```
k8s_node_allocatable_cpu{k8s_kubelet_version="v1.25.3",k8s_kubeproxy_version="v1.25.3",k8s_node_name="kind-control-plane",k8s_node_uid="09b55a47-87cb-4790-8834-2341d683999d",opencensus_resourcetype="k8s"} 8
```

**Link to tracking Issue:** #24835

**Testing:**

- added unit tests
- manual test with kind

**Documentation:**
- generated