[BUG] AKS managed prometheus ama-metrics-node randomly fails to resolve address #4672

Open
PAKalucki opened this issue Nov 26, 2024 · 0 comments

Describe the bug
The prometheus-collector container in ama-metrics-node intermittently fails to resolve an address, which eventually results in a container restart. I am using the default, out-of-the-box settings that come with enabling Managed Prometheus on an AKS cluster. I see this behaviour intermittently on multiple clusters in separate environments. I am not sure which address fails to resolve, but I have verified with kubectl debug that I can resolve popular addresses such as google.com from this node.
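
A minimal sketch of a resolution check that could narrow down which name fails. If the `kubectl debug` check ran against the node, it used the node's network namespace and resolver, so a successful lookup there does not necessarily mean the same lookup succeeds from inside the pod. The sketch assumes a Python interpreter is available in the pod's network namespace (for example through an ephemeral debug container); `check_resolution` and the host list are hypothetical and should be replaced with the endpoints the agent is expected to reach.

```python
import socket


def check_resolution(hosts, resolver=socket.getaddrinfo):
    """Resolve each host and report its addresses or the resolver error.

    `resolver` is injectable so the function can be exercised without
    network access; by default it uses the system resolver.
    """
    results = {}
    for host in hosts:
        try:
            infos = resolver(host, 443)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the resolved IP address.
            results[host] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            results[host] = f"FAILED: {exc}"
    return results
```

Calling `check_resolution(["google.com", "<suspected-endpoint>"])` in a loop, with the placeholder replaced by a real hostname, would show whether failures are specific to one name or affect all lookups from the pod.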

Expected behavior
ama-metrics-node should not fail or restart because of intermittent address resolution errors.

Environment (please complete the following information):

  • Kubernetes version 1.29.10

Additional context
Logs:

Starting inotify for watching config map update
MODE=advanced
CONTROLLER_TYPE=DaemonSet
CLUSTER=/subscriptions/xyz/resourceGroups/dev-xyz-eastus-rg/providers/Microsoft.ContainerService/managedClusters/dev-xyz-xyz-cluster
Setting telemetry output to the default azurepubliccloud instance
http_proxy=
HTTP_PROXY=
https_proxy=
HTTPS_PROXY=
NO_PROXY=,ama-metrics-operator-targets.kube-system.svc.cluster.local
no_proxy=,ama-metrics-operator-targets.kube-system.svc.cluster.local
HTTP_PROXY_ENABLED=false
AZMON_AGENT_CFG_FILE_VERSION=ver1
AZMON_AGENT_CFG_SCHEMA_VERSION=v1
***********************Start Processing - parseSettingsForPodAnnotations***********************
Using configmap namespace regex for podannotations: .*
Writing configuration to file: /opt/microsoft/configmapparser/config_def_pod_annotation_based_scraping
Writing to file: AZMON_PROMETHEUS_POD_ANNOTATION_NAMESPACES_REGEX='.*'
AZMON_PROMETHEUS_POD_ANNOTATION_SCRAPING_ENABLED=true
Configuration written to file successfully.
***********************End Processing - parseSettingsForPodAnnotations***********************
***********************Start Processing - parsePrometheusCollectorConfig***********************
Configure:Print the value of AZMON_AGENT_CFG_SCHEMA_VERSION: v1
Got configmap setting for cluster_alias: ""
After replacing non-alpha-numeric characters with '_': 
AZMON_CLUSTER_ALIAS: ''
AZMON_CLUSTER_LABEL: dev-xyz-xyz-cluster
***********************End Processing - parsePrometheusCollectorConfig***********************
***********************Start Processing - parseDefaultScrapeSettings***********************
Start prometheus-collector-settings Processing
using configmap for default scrape settings...
config::Using scrape settings for kubelet: true
config::Using scrape settings for coredns: false
config::Using scrape settings for cadvisor: true
config::Using scrape settings for kubeproxy: false
config::Using scrape settings for apiserver: false
config::Using scrape settings for kubestate: true
config::Using scrape settings for nodeexporter: true
config::Using scrape settings for prometheuscollectorhealth: false
config::Using scrape settings for windowsexporter: true
config::Using scrape settings for windowskubeproxy: true
config::Using scrape settings for kappiebasic: true
config::Using scrape settings for networkobservabilityRetina: true
config::Using scrape settings for networkobservabilityHubble: true
config::Using scrape settings for networkobservabilityCilium: true
After replacing non-alpha-numeric characters with '_': dev_xyz_xyz_cluster
End prometheus-collector-settings Processing
***********************End Processing - parseDefaultScrapeSettings***********************
***********************Start Processing - parseDebugModeSettings***********************
Using configmap setting for debug mode: false
Setting debug mode environment variable: false
***********************End Processing - parseDebugModeSettings***********************
***********************Start Processing - tomlparserTargetsMetricsKeepList***********************
Parsed config map for default-targets-metrics-keep-list: map[apiserver: cadvisor: coredns: kappiebasic: kubelet: kubeproxy: kubestate: minimalingestionprofile:true networkobservabilityCilium: networkobservabilityHubble: networkobservabilityRetina: nodeexporter: podannotations: windowsexporter: windowskubeproxy:]
***********************End Processing - tomlparserTargetsMetricsKeepList***********************
***********************Start Processing - tomlparserScrapeInterval***********************
***********************End Processing - tomlparserScrapeInterval***********************
***********************Start Processing - prometheusConfigMerger***********************
Custom prometheus config does not exist, using only default scrape targets if they are enabled
Updating scrape interval config for kubeletDefaultDs.yml
scrapeInterval 30s
Adding keep list regex or minimal ingestion regex for kubeletDefaultDs.yml
Updating scrape interval config for cadvisorDefaultDs.yml
scrapeInterval 30s
Adding keep list regex or minimal ingestion regex for cadvisorDefaultDs.yml
Updating scrape interval config for nodeexporterDefaultDs.yml
scrapeInterval 30s
Adding keep list regex or minimal ingestion regex for nodeexporterDefaultDs.yml
Updating scrape interval config for kappieBasicDefaultDs.yml
scrapeInterval 30s
Adding keep list regex or minimal ingestion regex for kappieBasicDefaultDs.yml
Updating scrape interval config for networkobservabilityRetinaDefaultDs.yml
scrapeInterval 30s
Adding keep list regex or minimal ingestion regex for networkobservabilityRetinaDefaultDs.yml
Updating scrape interval config for networkobservabilityHubbleDefaultDs.yml
scrapeInterval 30s
Adding keep list regex or minimal ingestion regex for networkobservabilityHubbleDefaultDs.yml
Updating scrape interval config for networkobservabilityCiliumDefaultDs.yml
scrapeInterval 30s
Done merging 7 default prometheus config(s)
Starting to merge default prometheus config values in collector template as backup
***********************End Processing - prometheusConfigMerger, Done Writing Default Prometheus Config***********************
AZMON_INVALID_CUSTOM_PROMETHEUS_CONFIG=false
CONFIG_VALIDATOR_RUNNING_IN_AGENT=true
prom-config-validator::No custom prometheus config found. Only using default scrape configs
prom-config-validator::Config file provided - /opt/defaultsMergedConfig.yml
prom-config-validator::Successfully generated otel config
prom-config-validator::Loading configuration...
prom-config-validator::Successfully loaded and validated prometheus config
prom-config-validator::Prometheus default scrape config validation succeeded, using this as collector config
AZMON_USE_DEFAULT_PROMETHEUS_CONFIG=true
prom-config-validator::Use default prometheus config: true
meConfigFile: /usr/sbin/me_ds.config
fluentBitConfigFile: /opt/fluent-bit/fluent-bit.conf
2024/11/26 13:04:18 checking health of token adapter after 1 secs
2024/11/26 13:04:18 found token adapter to be healthy after 1 secs
2024/11/26 13:04:18 export tokenadapterHealthyAfterSecs=1
ME_CONFIG_FILE=/usr/sbin/me_ds.config
customResourceId=/subscriptions/xyz/resourceGroups/dev-xyz-eastus-rg/providers/Microsoft.ContainerService/managedClusters/dev-xyz-xyz-cluster
customRegion=eastus
Waiting for 10s for token adapter sidecar to be up and running so that it can start serving IMDS requests
Starting MDSD
1.30.3
MDSD_VERSION=""
Waiting for 30s for MDSD to get the config and put them in place for ME
Starting Metrics Extension with config overrides
Starting otelcollector with only default scrape configs enabled
startCommand otelcollector
ME_VERSION="2.2024.823.1539-1.cm2
"
GOLANG_VERSION="go version go1.21.13 linux/amd64
"
OTELCOLLECTOR_VERSION="custom-collector-distro version 0.99.0
"
PROMETHEUS_VERSION="2.51.2
"
Starting fluent-bit
FLUENT_BIT_VERSION=Fluent Bit v2.1.10
Git commit: 

Starting Telegraf
TELEGRAF_VERSION=1.28.5-5.cm2

Starting inotify for watching mdsd config update
AZMON_CONTAINER_START_TIME=1732626301
AZMON_CONTAINER_START_TIME_READABLE="2024-11-26T13:05:01Z"
File Doesnt Exist. Creating file...
Fluent Bit v2.1.10
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

2024-11-26T13:05:04Z I! Loading config: /opt/telegraf/telegraf-prometheus-collector-ds.conf
{"time":1732626326.891959,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:05:26.8914390Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: abb2ff20-4534-4e77-947e-d73e5b2627b2"}
{"time":1732626386.90361,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:06:26.9033250Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: acda5f66-6386-4409-b33f-949a0e0f2041"}
{"time":1732626418.501357,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:06:58.5012590Z: no ExtensionInfo object found in cache for pid=281"}
{"time":1732626446.902178,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:07:26.8995760Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 42d81ca7-0ecb-4047-8fcb-f28808e0a039"}
{"time":1732626506.894271,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:08:26.8940080Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 7d5f8a9c-a62c-46e3-8a0d-f8e51cbf4546"}
{"time":1732626566.883804,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:09:26.8836220Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 58e79352-5ca3-478d-87ae-09e99709001d"}
{"time":1732626626.888222,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:10:26.8880390Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: ac1eb812-e75c-4166-a065-90d8126c9ce8"}
{"time":1732626686.883075,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:11:26.8827120Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: f56ec8ca-558c-439c-97d8-aa345c011619"}
{"time":1732626746.883008,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:12:26.8828520Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 81da2d30-67f8-4e1b-ac3d-9d0c504b9f82"}
{"time":1732626806.892248,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:13:26.8920690Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: f21357e3-2aae-4205-9d13-166028accd6b"}
{"time":1732626866.883172,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:14:26.8829690Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: a614de2b-33f1-4461-a44f-0181d97b1bc4"}
{"time":1732626926.904568,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:15:26.9043650Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 62cc57a5-17de-4e4e-92a6-af32a886bcb2"}
{"time":1732626986.891314,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:16:26.8911260Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: c614a1ac-9543-4bfc-952a-695295fa6ed2"}
{"time":1732627046.885535,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:17:26.8851100Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: a78b6c28-2e7b-4ac2-b2af-094d3890dc0a"}
{"time":1732627106.896653,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:18:26.8963930Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 59fb6d93-136c-4976-bf43-b1e5605b1cf7"}
{"time":1732627166.891286,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:19:26.8907940Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 88666249-5613-4905-afdd-d295543fb8b6"}
{"time":1732627226.893481,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:20:26.8932170Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 255e690b-b367-45f6-a737-fdb2237e7e54"}
{"time":1732627286.896024,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:21:26.8959120Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 75bdd6bc-3ae9-4629-b228-2cd8f0b49e61"}
{"time":1732627346.892746,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:22:26.8922800Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 057b6d66-216c-477b-836e-b1b1804db790"}
{"time":1732627406.904086,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:23:26.9033460Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 6c822582-9560-4c14-b365-e0c4ddabc7f5"}
{"time":1732627466.881943,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:24:26.8816290Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: ab40ca78-92f8-4ba7-9d7a-5eb8aae7f153"}
{"time":1732627526.882745,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:25:26.8826000Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: afd2a892-7c0c-431d-9eb2-cfdd71bff1d5"}
{"time":1732627586.882771,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:26:26.8826290Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 9a9df7ae-6358-4695-9c6b-7ce66a9ac241"}
{"time":1732627646.884296,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:27:26.8841080Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 243803b0-c220-44c0-8fba-7901a3fdcf03"}
{"time":1732627706.88587,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:28:26.8856810Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: af5ea323-f897-4850-b298-05d43868c623"}
{"time":1732627766.88686,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:29:26.8867160Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 37cb3428-54a1-4e17-a618-b446ee122e0d"}
{"time":1732627826.88448,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:30:26.8843210Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 38fac5d4-586a-4bf9-84d9-e8e29e40e6e0"}
{"time":1732627886.890133,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:31:26.8899470Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 1574933d-f768-480b-a89d-4e418b587b2e"}
{"time":1732627946.908764,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:32:26.9085420Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 8f51f0ce-3412-4ec7-be5d-50f3692d67c1"}
{"time":1732628006.893424,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:33:26.8932040Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 04e95588-44ea-47e3-ba0e-a626e89bb17c"}
{"time":1732628067.098458,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:34:27.0982250Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 46932b58-919b-4ab4-8122-210345589565"}
{"time":1732628126.892672,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:35:26.8924610Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: e9333f3b-3d90-453b-9345-c1ec16d3b3a2"}
{"time":1732628186.885615,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:36:26.8854080Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 52567d5b-ed00-4d43-a13d-9ab017670f37"}
{"time":1732628246.88451,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:37:26.8841520Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 26878eb5-467a-42a3-a361-7c832aa11d46"}
{"time":1732628306.881584,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:38:26.8814430Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 0364d09d-30f6-425b-b95c-b898e041b892"}
{"time":1732628366.884354,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:39:26.8840040Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: a65cfc48-8b74-450c-ac4d-d8665cfc5455"}
{"time":1732628426.884613,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:40:26.8844110Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: a229dd36-ea51-4004-b3cc-009994bba200"}
{"time":1732628486.884094,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:41:26.8839510Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: b1b5c751-9887-417d-a4cd-4a9c4040d72b"}
{"time":1732628546.885899,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:42:26.8856840Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 32e2acf0-cbb5-4b55-9204-94a19682f7c4"}
{"time":1732628606.885875,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:43:26.8855100Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 2c8e0319-0dc1-41a5-9232-3466ed0c9e11"}
{"time":1732628666.886256,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:44:26.8859960Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: e13e0ffc-13fe-4d2b-8c75-85281c3d45ce"}
{"time":1732628726.88345,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:45:26.8830640Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: db4cd111-0189-492a-8d79-b989e267fc15"}
{"time":1732628786.893921,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:46:26.8934210Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 2e4de180-185d-45e5-883a-e9853d4eb48d"}
{"time":1732628846.893493,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:47:26.8930820Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: e4f9a871-4951-4e9d-99a2-8b5ee95bae11"}
{"time":1732628906.884589,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:48:26.8844090Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 51d86416-2cf9-4808-9a56-86d7e3ec5fb2"}
{"time":1732628966.884893,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:49:26.8847280Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 2f3f09b5-4d45-4611-a917-53cd6037d8ec"}
{"time":1732629026.886722,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:50:26.8864400Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 49357380-eb49-4570-8ff5-ac5a88aff068"}
{"time":1732629086.889751,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:51:26.8895620Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 9bf2e304-a001-4f1f-b57b-ccc8e4fa8df7"}
{"time":1732629146.883114,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:52:26.8829870Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 48af3d37-ce4b-4141-9422-c742c335f91f"}
{"time":1732629206.886343,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:53:26.8861730Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: dcef7b6e-d828-44da-9ac2-4a63460e5b74"}
{"time":1732629266.886873,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:54:26.8865150Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 5d62c4f6-9d94-4241-a9e8-6bb0c31c630a"}
{"time":1732629326.88633,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:55:26.8858990Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 8529d7bb-c1c2-4be8-89b7-167c4c448dfa"}
{"time":1732629386.889344,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:56:26.8891490Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: ff9d3f02-4602-4b09-a0ce-81d6bfd02bdc"}
{"time":1732629446.891555,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:57:26.8913260Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 4022edce-b635-49f0-b919-80eab2b4cd48"}
{"time":1732629506.893076,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:58:26.8917860Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 33eaea97-2011-4215-900b-a98598500e07"}
{"time":1732629566.893134,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T13:59:26.8925730Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 45bdf399-ceb8-4e92-941a-b0472c929ead"}
{"time":1732629626.894311,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T14:00:26.8938380Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 0b85bf32-671a-41c9-b6a3-f9a7c22f04b8"}
{"time":1732629686.888637,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T14:01:26.8884580Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 07fb7372-81cb-4817-988a-2cc291816d2a"}
{"time":1732629746.892878,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T14:02:26.8926450Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: ed54fe22-23f9-47a5-a588-f81b5f29ad65"}
{"time":1732629806.906515,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T14:03:26.9063930Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 62c61e3d-bf23-42a7-a2a6-904a14b82f26"}
{"time":1732629866.882074,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T14:04:26.8817570Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: 7d3264c5-56f9-46de-9d58-e0533d414ef1"}
{"time":1732629926.885382,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T14:05:26.8851620Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: b28baeb8-af61-4a6a-b151-b97f48dab5d4"}
{"time":1732629986.884085,"filepath":"/opt/microsoft/linuxmonagent/mdsd.err","log":"2024-11-26T14:06:26.8839470Z: Failed to upload to ODS: Error resolving address, Datatype: HEALTH_ASSESSMENT_BLOB, RequestId: cbcaee2d-9107-422c-a014-2df4d60ba629"}
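
The mdsd.err entries above recur roughly once a minute, all with `Datatype: HEALTH_ASSESSMENT_BLOB`. A minimal sketch for summarizing such lines from a saved copy of the container log is shown here; it assumes the JSON shape seen above (`time`, `filepath`, `log`), and the `summarize` helper is hypothetical, not part of the agent.

```python
import json
import re
from collections import Counter

# Matches the failure text seen in the mdsd.err entries above.
FAILURE = re.compile(
    r"Failed to upload to ODS: (?P<reason>[^,]+), Datatype: (?P<datatype>\w+)"
)


def summarize(lines):
    """Count ODS upload failures by (reason, datatype) and track their time span."""
    counts = Counter()
    times = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip the plain-text startup output
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        match = FAILURE.search(record.get("log", ""))
        if match is None:
            continue  # other mdsd.err messages, e.g. the ExtensionInfo one
        counts[(match["reason"], match["datatype"])] += 1
        times.append(float(record["time"]))
    return {
        "counts": dict(counts),
        "first": min(times) if times else None,  # epoch seconds
        "last": max(times) if times else None,
    }
```

Passing an open file handle of the saved log to `summarize` gives the number of failures per datatype and the epoch range they cover, which can then be compared with the container restart times in the pod description.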

Describe pod output:

Name:                 ama-metrics-node-xdkbb
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Service Account:      ama-metrics-serviceaccount
Node:                 aks-linux-05026455-vmss00003c/10.210.0.10
Start Time:           Sat, 09 Nov 2024 02:26:29 +0100
Labels:               controller-revision-hash=577c599c9
                      dsName=ama-metrics-node
                      kubernetes.azure.com/managedby=aks
                      pod-template-generation=6
Annotations:          agentVersion: 0.0.0.1
                      schema-versions: v1
Status:               Running
IP:                   10.210.0.152
IPs:
  IP:           10.210.0.152
Controlled By:  DaemonSet/ama-metrics-node
Containers:
  prometheus-collector:
    Container ID:   containerd://2a8c4d7976e79696c3d6ee74f7ec578343556a89f81fde2adf854959082a967a
    Image:          mcr.microsoft.com/azuremonitor/containerinsights/ciprod/prometheus-collector/images:6.10.0-main-09-16-2024-85a71678
    Image ID:       sha256:81ba48e19ff47fba73c9485ec60b28888563f769b6399cc74dc9a79f931bf5c3
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Tue, 26 Nov 2024 14:04:08 +0100
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Tue, 19 Nov 2024 16:15:37 +0100
      Finished:     Tue, 26 Nov 2024 14:04:07 +0100
    Ready:          True
    Restart Count:  2
    Limits:
      cpu:     200m
      memory:  1Gi
    Requests:
      cpu:     50m
      memory:  150Mi
    Liveness:  http-get http://:8080/health delay=60s timeout=5s period=15s #success=1 #failure=3
    Environment:
      KUBERNETES_SERVICE_HOST:                     xyz.hcp.eastus.azmk8s.io
      KUBERNETES_PORT:                             tcp://xyz.hcp.eastus.azmk8s.io:443
      KUBERNETES_PORT_443_TCP:                     tcp://xyz.hcp.eastus.azmk8s.io:443
      KUBERNETES_PORT_443_TCP_ADDR:                xyz.hcp.eastus.azmk8s.io
      CLUSTER:                                     /subscriptions/xyz/resourceGroups/dev-xyz-eastus-rg/providers/Microsoft.ContainerService/managedClusters/dev-xyz-xyz-cluster
      AKSREGION:                                   eastus
      MAC:                                         true
      AZMON_COLLECT_ENV:                           false
      customEnvironment:                           azurepubliccloud
      OMS_TLD:                                     opinsights.azure.com
      CONTROLLER_TYPE:                             DaemonSet
      NODE_IP:                                      (v1:status.hostIP)
      NODE_NAME:                                    (v1:spec.nodeName)
      POD_NAME:                                    ama-metrics-node-xdkbb (v1:metadata.name)
      POD_NAMESPACE:                               kube-system (v1:metadata.namespace)
      CONTAINER_CPU_LIMIT:                         200 (limits.cpu)
      CONTAINER_MEMORY_LIMIT:                      1024 (limits.memory)
      KUBE_STATE_NAME:                             ama-metrics-ksm
      NODE_EXPORTER_NAME:                          
      NODE_EXPORTER_TARGETPORT:                    19100
      KUBE_STATE_VERSION:                          mcr.microsoft.com/oss/kubernetes/kube-state-metrics:v2.9.2
      NODE_EXPORTER_VERSION:                       
      AGENT_VERSION:                               6.10.0-main-09-16-2024-85a71678
      MODE:                                        advanced
      WINMODE:                                     advanced
      MINIMAL_INGESTION_PROFILE:                   true
      APPMONITORING_AUTOINSTRUMENTATION_ENABLED:   false
      APPMONITORING_OPENTELEMETRYMETRICS_ENABLED:  false
      APPMONITORING_OPENTELEMETRYMETRICS_PORT:     28333
    Mounts:
      /anchors/mariner from anchors-mariner (ro)
      /anchors/ubuntu from anchors-ubuntu (ro)
      /etc/config/settings from settings-vol-config (ro)
      /etc/config/settings/prometheus from prometheus-config-vol (ro)
      /etc/prometheus/certs from ama-metrics-tls-secret-volume (ro)
      /var/log/containers from host-log-containers (ro)
      /var/log/pods from host-log-pods (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fggfm (ro)
  addon-token-adapter:
    Container ID:  containerd://e54b5839c7d6d2a7cde7a032f9c2ea39360d4ccaf66c53b65ae5caf0af8b0df3
    Image:         mcr.microsoft.com/aks/msi/addon-token-adapter:master.240912.1
    Image ID:      mcr.microsoft.com/aks/msi/addon-token-adapter@sha256:7749ede5c74bf82b1481d23a156e93b199d0649ac7ca6f055a79b048af3bccfc
    Port:          <none>
    Host Port:     <none>
    Command:
      /addon-token-adapter
    Args:
      --secret-namespace=kube-system
      --secret-name=aad-msi-auth-token
      --token-server-listening-port=7777
      --health-server-listening-port=9999
      --restart-pod-waiting-minutes-on-broken-connection=240
    State:          Running
      Started:      Sat, 09 Nov 2024 02:27:00 +0100
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     500m
      memory:  500Mi
    Requests:
      cpu:     20m
      memory:  30Mi
    Liveness:  http-get http://:9999/healthz delay=10s timeout=1s period=60s #success=1 #failure=3
    Environment:
      KUBERNETES_SERVICE_HOST:       xyz.hcp.eastus.azmk8s.io
      KUBERNETES_PORT:               tcp://xyz.hcp.eastus.azmk8s.io:443
      KUBERNETES_PORT_443_TCP:       tcp://xyz.hcp.eastus.azmk8s.io:443
      KUBERNETES_PORT_443_TCP_ADDR:  xyz.hcp.eastus.azmk8s.io
      AZMON_COLLECT_ENV:             false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fggfm (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True 
  Initialized                 True 
  Ready                       True 
  ContainersReady             True 
  PodScheduled                True 
Volumes:
  settings-vol-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ama-metrics-settings-configmap
    Optional:  true
  ama-metrics-tls-secret-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ama-metrics-mtls-secret
    Optional:    true
  prometheus-config-vol:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ama-metrics-prometheus-config-node
    Optional:  true
  host-log-containers:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/containers
    HostPathType:  
  host-log-pods:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/pods
    HostPathType:  
  anchors-mariner:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/pki/ca-trust/anchors/
    HostPathType:  DirectoryOrCreate
  anchors-ubuntu:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/local/share/ca-certificates/
    HostPathType:  DirectoryOrCreate
  kube-api-access-fggfm:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 :NoExecute op=Exists
                             :NoSchedule op=Exists
                             CriticalAddonsOnly op=Exists
Events:                      <none>

@PAKalucki PAKalucki added the bug label Nov 26, 2024