
Errors in agent logs #697

Closed
KalebHawkins opened this issue Mar 16, 2023 · 8 comments
Labels
bug Something isn't working

Comments

@KalebHawkins

Hello, I am trying to use the collector on an OpenShift 4.11 cluster to forward logs to Splunk. I am using chart version 0.70 with a pretty basic configuration file:

splunk:
  clusterName: "<REDACTED>"
  splunkPlatform:
    endpoint: "<REDACTED>"
    token: "<REDACTED>"
    index: "<REDACTED>"
    metricsIndex: "<REDACTED>"
    tracesIndex: "<REDACTED>"
    logsEnabled: true
    metricsEnabled: true
    tracesEnabled: false
  logsEngine: otel
  distribution: "openshift"
  environment: <REDACTED>

  agent:
    enabled: true
    resources:
      limits:
        #cpu: 1000m
        memory: 4Gi
        
  clusterReceiver:
    enabled: true
    resources:
      limits:
        #cpu: 1000m
        memory: 8Gi

  logsCollection:
    containers:
      enabled: true
      containerRuntime: "cri-o"

The agents log the following error messages. I am wondering if this is an OpenShift thing or if I have something misconfigured.

2023-03-16T15:53:22.447Z error prometheusexporter/prometheus.go:139 Could not get prometheus metrics {"kind": "receiver", "name": "receiver_creator", "pipeline": "metrics", "name": "smartagent/kubernetes-scheduler/receiver_creator{endpoint=\"xxx.xxx.xxx.xxx\"}/k8s_observer/ebe4c92e-8b5d-4be3-9945-cd439c5825c8", "monitorID": "smartagentkubernetesschedulerreceiver_creatorendpoint2071306682k8s_observerebe4c92e8b5d4be39945cd439c5825c8", "error": "Get \"http://xxx.xxx.xxx.xxx:10251/metrics\": dial tcp xxx.xxx.xxx.xxx:10251: connect: connection refused", "monitorType": "kubernetes-scheduler"}
2023-03-16T15:53:22.550Z error prometheusexporter/prometheus.go:139 Could not get prometheus metrics {"kind": "receiver", "name": "receiver_creator", "pipeline": "metrics", "name": "smartagent/kubernetes-proxy/receiver_creator{endpoint=\"xxx.xxx.xxx.xxx\"}/k8s_observer/d742b91e-685e-4c35-8197-0b56ebc88e39", "monitorID": "smartagentkubernetesproxyreceiver_creatorendpoint2071306682k8s_observerd742b91e685e4c3581970b56ebc88e39", "monitorType": "kubernetes-proxy", "error": "Get \"http://xxx.xxx.xxx.xxx:29101/metrics\": dial tcp xxx.xxx.xxx.xxx:29101: connect: connection refused"}
@jvoravong
Contributor

We need to update the receiver configurations to match any changes the OpenShift kube-scheduler and kube-proxy may have recently received.

You can disable the affected receivers as a temporary solution with these values.

agent.controlPlaneMetrics.proxy.enabled: false
agent.controlPlaneMetrics.scheduler.enabled: false
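
For reference, the same override in values.yaml form. This is only a sketch and assumes the dotted paths above map onto the chart's nested agent.controlPlaneMetrics values:

agent:
  controlPlaneMetrics:
    # Temporary workaround: stop scraping the control-plane proxy and scheduler
    # until the chart's receiver configurations are updated.
    proxy:
      enabled: false
    scheduler:
      enabled: false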

@jvoravong jvoravong added the bug Something isn't working label Mar 16, 2023
@KalebHawkins
Author

I just tested that configuration. It did get rid of the errors. Thanks.

@aligthart

aligthart commented Mar 20, 2023

We noticed the same (chart version 0.72, k8s 1.23 installed with kops).

The Helm chart hardcodes port 10251; however, this port has been deprecated.

From: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.17.md#cluster-lifecycle-1

Kubeadm: enable the usage of the secure kube-scheduler and kube-controller-manager ports for health checks. 
For kube-scheduler was 10251, becomes 10259. 
For kube-controller-manager was 10252, becomes 10257. 
(https://github.com/kubernetes/kubernetes/pull/85043, [@neolit123](https://github.com/neolit123))
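
In the meantime, an interim workaround could be to point the scheduler scrape at the secure port yourself. The snippet below is only a sketch: it assumes the chart merges agent.config over its generated configuration and that the Smart Agent kubernetes-scheduler monitor accepts port/useHTTPS/useServiceAccount/skipVerify options; the exact keys may differ from what the chart actually renders.

agent:
  config:
    receivers:
      receiver_creator:
        receivers:
          smartagent/kubernetes-scheduler:
            config:
              port: 10259              # secure port since k8s 1.17 (was 10251)
              useHTTPS: true           # the secure port only serves HTTPS
              useServiceAccount: true  # authenticate with the pod's service account token
              skipVerify: true         # assumption: the serving cert may not be trusted in-cluster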

@jvoravong jvoravong self-assigned this Mar 20, 2023
@kishah-lilly

@jvoravong do you have a timeline when this will be fixed?

@atoulme
Contributor

atoulme commented Mar 23, 2023

We typically do not communicate timelines or commit to resolution times on GitHub. Please open a support case if you are encountering this issue, so we can best help you.

@atoulme
Contributor

atoulme commented Apr 5, 2023

Closing this as fixed.

@atoulme atoulme closed this as completed Apr 5, 2023
@kishah-lilly

@jvoravong @atoulme
The change implemented in https://github.com/signalfx/splunk-otel-collector-chart/pull/711/files fixes the kubernetes-scheduler issue; however, it does not fix the kubernetes-proxy issue.

From original post:

2023-03-16T15:53:22.550Z error prometheusexporter/prometheus.go:139 Could not get prometheus metrics {"kind": "receiver", "name": "receiver_creator", "pipeline": "metrics", "name": "smartagent/kubernetes-proxy/receiver_creator{endpoint=\"xxx.xxx.xxx.xxx\"}/k8s_observer/d742b91e-685e-4c35-8197-0b56ebc88e39", "monitorID": "smartagentkubernetesproxyreceiver_creatorendpoint2071306682k8s_observerd742b91e685e4c3581970b56ebc88e39", "monitorType": "kubernetes-proxy", "error": "Get \"http://xxx.xxx.xxx.xxx:29101/metrics\": dial tcp xxx.xxx.xxx.xxx:29101: connect: connection refused"}
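
Until that is addressed, the earlier workaround can be narrowed so that only the still-broken proxy scrape is disabled (again a values.yaml sketch, assuming the same nested key layout):

agent:
  controlPlaneMetrics:
    proxy:
      enabled: false   # kube-proxy scrape on port 29101 still fails; scheduler can stay enabled after #711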

@atoulme
Contributor

atoulme commented Apr 27, 2023

Moved to a separate issue, #758
