Enhance kubeletstatsreceiver to scrape non-standard endpoints #26719

Open
asweet-confluent opened this issue Sep 16, 2023 · 17 comments

Comments

@asweet-confluent

asweet-confluent commented Sep 16, 2023

Component(s)

receiver/kubeletstats

Is your feature request related to a problem? Please describe.

kubeletstatsreceiver collects metrics from the kubelet's /stats/summary endpoint, but the kubelet also exports Prometheus metrics at /metrics and at non-standard endpoints such as:

  • /metrics/cadvisor
  • /metrics/resource
  • /metrics/probes

Describe the solution you'd like

kubeletstatsreceiver should be enhanced to scrape those other endpoints as well. Datadog's kubelet integration already does this; see the config here.

I've compiled a list of metrics from the source code as well as direct queries to the endpoints. Note that this may not be an exhaustive list:

Metric List
# Log metrics
kubelet_container_log_filesystem_used_bytes

# Resource metrics
node_cpu_usage_seconds_total
node_memory_working_set_bytes
container_cpu_usage_seconds_total
container_memory_working_set_bytes
pod_cpu_usage_seconds_total
pod_memory_working_set_bytes
scrape_error
container_start_time_seconds

# Volume metrics
volume_stats_capacity_bytes
volume_stats_available_bytes
volume_stats_used_bytes
volume_stats_inodes
volume_stats_inodes_free
volume_stats_inodes_used
volume_stats_health_status_abnormal


node_startup_pre_kubelet_duration_seconds
node_startup_pre_registration_duration_seconds
node_startup_registration_duration_seconds
node_startup_post_registration_duration_seconds
node_startup_duration_seconds
pod_worker_duration_seconds
pod_start_duration_seconds
pod_start_sli_duration_seconds
cgroup_manager_duration_seconds
pod_worker_start_duration_seconds
pod_status_sync_duration_seconds
pleg_relist_duration_seconds
pleg_discard_events
pleg_relist_interval_seconds
pleg_last_seen_seconds
evented_pleg_connection_error_count
evented_pleg_connection_success_count
evented_pleg_connection_latency_seconds
evictions
eviction_stats_age_seconds
preemptions
running_pods
running_containers
desired_pods
active_pods
mirror_pods
working_pods
orphaned_runtime_pods_total
restarted_pods_total

# Metrics keys of remote runtime operations
runtime_operations_total
runtime_operations_duration_seconds
runtime_operations_errors_total
# Metrics keys of device plugin operations
device_plugin_registration_total
device_plugin_alloc_duration_seconds

# Metrics keys of pod resources operations
pod_resources_endpoint_requests_total
pod_resources_endpoint_requests_list
pod_resources_endpoint_requests_get_allocatable
pod_resources_endpoint_errors_list
pod_resources_endpoint_errors_get_allocatable
pod_resources_endpoint_requests_get
pod_resources_endpoint_errors_get

# Metrics keys for RuntimeClass
run_podsandbox_duration_seconds
run_podsandbox_errors_total

# Metrics to keep track of total number of Pods and Containers started
started_pods_total
started_pods_errors_total
started_containers_total
started_containers_errors_total

# Metrics to track HostProcess container usage by this kubelet
started_host_process_containers_total
started_host_process_containers_errors_total

# Metrics to track ephemeral container usage by this kubelet
managed_ephemeral_containers

# Metrics to track the CPU manager behavior
cpu_manager_pinning_requests_total
cpu_manager_pinning_errors_total

# Metrics to track the Topology manager behavior
topology_manager_admission_requests_total
topology_manager_admission_errors_total
topology_manager_admission_duration_ms

# Metrics to track orphan pod cleanup
orphan_pod_cleaned_volumes
orphan_pod_cleaned_volumes_errors

# Metric list directly from /metrics/cadvisor
cadvisor_version_info
container_cpu_cfs_periods_total
container_cpu_cfs_throttled_periods_total
container_cpu_cfs_throttled_seconds_total
container_cpu_load_average_10s
container_cpu_system_seconds_total
container_cpu_usage_seconds_total
container_cpu_user_seconds_total
container_file_descriptors
container_fs_inodes_free
container_fs_inodes_total
container_fs_io_current
container_fs_io_time_seconds_total
container_fs_io_time_weighted_seconds_total
container_fs_limit_bytes
container_fs_read_seconds_total
container_fs_reads_merged_total
container_fs_reads_total
container_fs_sector_reads_total
container_fs_sector_writes_total
container_fs_usage_bytes
container_fs_write_seconds_total
container_fs_writes_merged_total
container_fs_writes_total
container_last_seen
container_memory_cache
container_memory_failcnt
container_memory_failures_total
container_memory_mapped_file
container_memory_max_usage_bytes
container_memory_rss
container_memory_swap
container_memory_usage_bytes
container_memory_working_set_bytes
container_network_receive_bytes_total
container_network_receive_errors_total
container_network_receive_packets_dropped_total
container_network_receive_packets_total
container_network_transmit_bytes_total
container_network_transmit_errors_total
container_network_transmit_packets_dropped_total
container_network_transmit_packets_total
container_oom_events_total
container_processes
container_sockets
container_spec_cpu_period
container_spec_cpu_quota
container_spec_cpu_shares
container_spec_memory_limit_bytes
container_spec_memory_reservation_limit_bytes
container_spec_memory_swap_limit_bytes
container_start_time_seconds
container_tasks_state
container_threads
container_threads_max
container_ulimits_soft
machine_cpu_cores
machine_cpu_physical_cores
machine_cpu_sockets
machine_memory_bytes
machine_nvm_avg_power_budget_watts
machine_nvm_capacity


# Metric list directly from /metrics/probes
prober_probe_duration_seconds_bucket
prober_probe_duration_seconds_count
prober_probe_duration_seconds_sum
prober_probe_total
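A list like the one above can be rebuilt from a raw scrape by pulling the sample names out of the Prometheus text exposition format. The following is only a minimal sketch: the `metric_names` helper and the `SAMPLE` payload are made up for illustration and are not part of the receiver or of any kubelet output quoted in this issue.

```python
import re

# Matches the metric name at the start of a sample line in the
# Prometheus text exposition format.
_SAMPLE_NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")


def metric_names(exposition_text: str) -> list[str]:
    """Return the sorted, de-duplicated sample names found in the text."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_NAME.match(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


# Illustrative payload only; label values and sample values are invented.
SAMPLE = """\
# TYPE prober_probe_total counter
prober_probe_total{namespace="default",pod="web-0",probe_type="Liveness"} 12
prober_probe_total{namespace="default",pod="web-0",probe_type="Readiness"} 12
# TYPE prober_probe_duration_seconds histogram
prober_probe_duration_seconds_bucket{le="0.005"} 3
prober_probe_duration_seconds_sum 0.01
prober_probe_duration_seconds_count 3
"""

for name in metric_names(SAMPLE):
    print(name)
```

The input text could come from something like `kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics/probes"`, assuming the caller has permission to use the node proxy API.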

Describe alternatives you've considered

As a workaround, you can configure Prometheus scrape jobs to hit those endpoints. This is not ideal because kubeletstatsreceiver renames the default metric attributes, e.g. namespace becomes k8s.namespace.name. Mixing kubeletstatsreceiver and Prometheus scrape jobs would create disjointed label sets unless you add a separate processing step that renames them.
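The extra renaming step amounts to a label-to-attribute mapping. Here is a rough sketch of that idea, not the receiver's actual logic: only the namespace to k8s.namespace.name pair comes from this issue, and the other entries in the hypothetical `LABEL_TO_ATTRIBUTE` table are assumed from the OpenTelemetry Kubernetes semantic conventions.

```python
# Hypothetical mapping from Prometheus label names to OTel attribute names.
# Only "namespace" -> "k8s.namespace.name" is stated in this issue; the
# remaining pairs are assumptions based on the OTel k8s conventions.
LABEL_TO_ATTRIBUTE = {
    "namespace": "k8s.namespace.name",
    "pod": "k8s.pod.name",
    "container": "k8s.container.name",
}


def rename_labels(labels: dict[str, str]) -> dict[str, str]:
    """Rename known labels and pass unknown labels through unchanged."""
    return {LABEL_TO_ATTRIBUTE.get(key, key): value for key, value in labels.items()}


print(rename_labels({"namespace": "default", "pod": "web-0", "image": "nginx"}))
```

Without a step like this, series from the two sources carry different keys for the same Kubernetes object and cannot be joined directly.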

Additional context

No response

@asweet-confluent asweet-confluent added the enhancement and needs triage labels Sep 16, 2023
@github-actions
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@TylerHelmuth
Member

@asweet-confluent sounds like a reasonable idea to me. Can you provide in this issue the metrics we'd be collecting? Are there any important differences between those endpoints and the stats/summary data we collect today?

@asweet-confluent
Author

Can you provide in this issue the metrics we'd be collecting?

I updated the issue description with the raw metric names. Presumably kubeletstatsreceiver would rename them to be in line with the k8s. metric naming scheme.

Are there any important differences between those endpoints and the stats/summary data we collect today?

As noted in the K8S docs:

Those metrics do not have the same lifecycle.

I think the cadvisor metrics come directly from cadvisor itself so that makes sense.

Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Dec 25, 2023
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned Feb 23, 2024
@cmergenthaler
Contributor

@asweet-confluent Can you please reopen the issue? I think the missing metrics are necessary to make the receiver complete.

@diranged

Agreed - can this get re-opened @asweet-confluent?

@ChrsMark
Member

Aren't some of the metrics in the issue description's Metric List already provided by the receiver?

If I remember correctly, the kubelet's /stats/summary endpoint provides metrics that partially come from cAdvisor. There is also the option to collect directly from the CRI, but it's still behind a feature flag: https://kubernetes.io/docs/reference/instrumentation/cri-pod-container-metrics/

We can consider getting additional metrics from other endpoints, but I believe we should be selective and add only metrics that are actually important. Once we have this specific list of metrics, we could gradually start discussing them as part of open-telemetry/semantic-conventions#1032 as well.

On a slightly different note, there have been several discussions around these endpoints over the past years, so we would need to verify that we are aligned with the most recent update. Some refs:

/cc @dashpole

@dashpole
Contributor

I don't think we should support scraping prometheus endpoints in the kubelet stats receiver.

You can see the proposal behind the CRI-direct feature here: https://github.com/kubernetes/enhancements/tree/6f648005d3b10d9c24984d139f96077f720726f7/keps/sig-node/2371-cri-pod-container-stats

That would be a good option to consider after it graduates to beta.

@diranged

I don't think we should support scraping prometheus endpoints in the kubelet stats receiver.

Can you elaborate on why? I like the CRI active approach for sure, but I see that as unrelated to fully supporting kubelet stats. The kubelet stats approach is generic and easier to implement on the operator side (fewer permissions and less volume mount configuration).

@dashpole
Contributor

The prometheus receiver already supports the endpoints in question. Given how large the Prometheus ecosystem is, it doesn't seem sustainable to have specific receivers to translate from prometheus conventions to OTel conventions for each source of Prometheus metrics.

@diranged

The prometheus receiver already supports the endpoints in question. Given how large the Prometheus ecosystem is, it doesn't seem sustainable to have specific receivers to translate from prometheus conventions to OTel conventions for each source of Prometheus metrics.

Given that - I might argue that the kubelet receiver then should be deprecated. I think it's worse to have half a solution than no dedicated solution at all.

I do like the idea of using the kubelet receiver, though, because it's simpler to configure and it standardizes the exported metric names into something OTel-specific.

@alexgenon

Hi everyone,
While it's indeed possible to scrape those metrics using the Prometheus receiver (and that's how we are currently scraping them), we're not fully satisfied with this approach. We end up with time series in Prometheus/OpenMetrics format converted to OpenTelemetry (which is not straightforward), and we miss the opportunity to have native OpenTelemetry metrics where we can better structure the Resource Attributes (instead of relying only on the target_info metric) and enforce the semantic conventions.

Receivers such as kubeletstats or k8scluster are a perfect fit for robust collection of k8s metrics.

I agree with @diranged that the kubeletstats receiver (and also the k8scluster receiver) is limiting in its current state. Deprecating them might be too radical, as they provide an easy way to get started with k8s o11y. But we should at least document their limitations and recommend Prometheus scraping if more metrics are required.
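To illustrate the Resource Attributes point, a scraped sample's labels can be split into resource-level attributes (which identify the pod or container) and data-point attributes (everything else). This is only a sketch of the idea under assumptions: the `RESOURCE_LABELS` table and the `group_by_resource` helper are hypothetical and are not how any existing receiver is implemented.

```python
from collections import defaultdict

# Assumed set of labels that identify a Kubernetes resource, mapped to
# attribute names from the OTel k8s semantic conventions.
RESOURCE_LABELS = {
    "namespace": "k8s.namespace.name",
    "pod": "k8s.pod.name",
    "container": "k8s.container.name",
}


def group_by_resource(samples):
    """Group (name, labels, value) samples under their resource attributes."""
    grouped = defaultdict(list)
    for name, labels, value in samples:
        # Identifying labels become the (hashable) resource key.
        resource = tuple(sorted(
            (RESOURCE_LABELS[key], val)
            for key, val in labels.items()
            if key in RESOURCE_LABELS
        ))
        # Everything else stays on the individual data point.
        point_attributes = {
            key: val for key, val in labels.items() if key not in RESOURCE_LABELS
        }
        grouped[resource].append((name, point_attributes, value))
    return dict(grouped)


# Invented sample values, for illustration only.
samples = [
    ("container_cpu_usage_seconds_total", {"namespace": "default", "pod": "web-0", "cpu": "total"}, 12.5),
    ("container_memory_rss", {"namespace": "default", "pod": "web-0"}, 1024.0),
    ("container_memory_rss", {"namespace": "kube-system", "pod": "dns-0"}, 2048.0),
]

for resource, points in group_by_resource(samples).items():
    print(dict(resource), len(points))
```

With this shape, every metric for a pod shares one resource instead of repeating the identifying labels on each series.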

@ChrsMark
Member

ChrsMark commented Aug 6, 2024

My question from #26719 (comment) is still valid here:

I think we are still missing a well-defined proposal that lists the specific metrics not provided by the kubeletstats receiver (which scrapes the /stats/summary endpoint).

In addition, I think I'd agree with what @dashpole mentioned at #26719 (comment). The kubeletstats receiver scrapes a specific endpoint offering a selective set of metrics today. I'm not sure if expanding to scrape one or more additional endpoints is a good choice here. I'm not sure if it's done in other receivers, but it can be problematic when it comes to maintenance, deprecation handling, etc. So if we really need to collect these metrics, maybe we need to find a way to separate their collection, either in a separate, standalone receiver component or by splitting into multiple scrapers, as is done in the hostmetricsreceiver.

Last but not least, standardizing a Prometheus-based input on top of the prometheus receiver sounds like a good example for open-telemetry/opentelemetry-collector#8372.
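For discussion purposes, the multiple-scrapers option could look roughly like an opt-in registry with one scraper per endpoint. This is a hypothetical sketch only: the scraper names and the `resolve_scrapers` helper are invented here, and only the endpoint paths come from this thread.

```python
# Hypothetical scraper names mapped to the kubelet endpoints discussed in
# this issue. The names are invented; only the paths come from the thread.
SCRAPER_PATHS = {
    "summary": "/stats/summary",
    "cadvisor": "/metrics/cadvisor",
    "resource": "/metrics/resource",
    "probes": "/metrics/probes",
}


def resolve_scrapers(enabled: list[str]) -> dict[str, str]:
    """Return the endpoint path for each enabled scraper, rejecting unknown names."""
    unknown = [name for name in enabled if name not in SCRAPER_PATHS]
    if unknown:
        raise ValueError(f"unknown scrapers: {', '.join(unknown)}")
    return {name: SCRAPER_PATHS[name] for name in enabled}


# A user who only wants today's behavior plus probe metrics.
print(resolve_scrapers(["summary", "probes"]))
```

Keeping each endpoint behind its own opt-in scraper would let one endpoint be deprecated or changed without touching the others.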

@alexgenon

Sorry for my late reply.
Thanks @ChrsMark for the detailed answer. I'll try to give a naïve user's point of view.

I agree with you that we should start by listing which metrics are missing from the kubeletstats and k8scluster receivers and make sure they are part of the semantic conventions. We'll do this exercise within my team and maybe post them as a comment on open-telemetry/semantic-conventions#1032. What do you think?

As for the way it should be implemented, multiple scrapers on the same receiver would make sense.
We're also using the hostmetricsreceiver and we're happy with the way it works.

Our objective is to use OpenTelemetry as much as possible within our observability pipeline to avoid any conversion issues. Scraping the Prometheus endpoints and having the collector do the conversion to OTLP can be cumbersome. In the current situation, some metrics come via the kubeletstats receiver and others via scraping the Prometheus endpoints on kubelet and cadvisor. This hybrid solution requires extra effort when setting up a monitoring solution.

@ChrsMark
Member

We'll do this exercise within my team and maybe post them as a comment on open-telemetry/semantic-conventions#1032. What do you think?

That would make sense. You could also create a standalone issue to propose this new batch of metrics and link back to open-telemetry/semantic-conventions#1032 (we can use that issue as a meta issue).

TBH though, regarding the implementation, we would need to think the details through thoroughly. As I mentioned already, maybe the work on supporting templates in open-telemetry/opentelemetry-collector#8372 can help here.

Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Oct 14, 2024
7 participants