[Bug] Cost analyzer pod in CrashLoopBackOff after enabling the readonly feature #3433

mariojuarezc · 2024-05-21T22:53:04Z

Kubecost Helm Chart Version

2.2.5

Kubernetes Version

1.27

Kubernetes Platform

AKS

Description

I installed Kubecost from the Helm chart with the readonly feature enabled. The aggregator container in the kubecost-cost-analyzer pod then started failing, leaving the pod in CrashLoopBackOff status and making the Kubecost UI inaccessible.

Steps to reproduce

Install Kubecost with the readonly feature enabled:

helm install kubecost cost-analyzer \
--repo https://kubecost.github.io/cost-analyzer/ \
--namespace kubecost --create-namespace \
--set kubecostToken="bWFyaW9qdWFyZXpjQGdtYWlsLmNvbQ==xm343yadf98" \
--set readonly=true

Watch the aggregator logs (the pod name will differ in your cluster):
kubectl -n kubecost logs kubecost-cost-analyzer-5dcbc54c48-hd2s7 -c aggregator -f
The logs end with the following error: Error: listen tcp :9004: bind: address already in use
Watch the Kubecost pods. The kubecost-cost-analyzer pod shows CrashLoopBackOff status with many restarts:

kubectl -n kubecost get pods -w
NAME                                          READY   STATUS             RESTARTS       AGE
kubecost-cost-analyzer-5dcbc54c48-hd2s7       3/4     CrashLoopBackOff   7 (5m1s ago)   16m
kubecost-forecasting-86c455686d-bpj2s         1/1     Running            0              16m
kubecost-grafana-8d47b4c64-klzw8              2/2     Running            0              16m
kubecost-prometheus-server-7474d45899-ch7xq   1/1     Running            0              16m

Expected behavior

Kubecost should run as usual, with updates from the frontend UI and via POST requests disabled, as described in the values.yaml file:

cost-analyzer-helm-chart/cost-analyzer/values.yaml

Line 3156 in c11a9bb

# readonly: false

Impact

No response

Screenshots

No response

Logs

Aggregator container logs from the kubecost-cost-analyzer pod:

kubectl -n kubecost logs kubecost-cost-analyzer-5dcbc54c48-hd2s7  -c aggregator -f

2024/05/21 22:24:18 maxprocs: Leaving GOMAXPROCS=8: CPU quota undefined
2024-05-21T22:24:18.321191206Z ??? Log level set to info
2024-05-21T22:24:18.321223005Z INF tracing disabled
2024-05-21T22:24:18.34644253Z INF Starting Kubecost Aggregator version kcm-0f623d1ed0_core-c3cb2218df_oc-67e81e89ca (0f623d1e)
WARN: bun: 2024/05/21 22:24:18 query "TRUE" has [] args, but no placeholders
2024-05-21T22:24:18.347090025Z INF NAMESPACE: kubecost-readonly
WARN: bun: 2024/05/21 22:24:18 query "TRUE" has [] args, but no placeholders
2024-05-21T22:24:18.490151229Z INF Using default file store as data source
2024-05-21T22:24:18.490302428Z ERR entering state: run_ingestor, err: error creating static tables: %!s(<nil>)
2024-05-21T22:24:18.490329227Z ERR after event, current state: run_ingestor, err: error creating static tables: %!s(<nil>)
2024-05-21T22:24:18.490345727Z ERR error submitting event: error creating static tables: %!s(<nil>)
2024-05-21T22:24:18.490382627Z INF Thanos Pipeline: Stopped
2024-05-21T22:24:18.490418827Z INF NetworkInsight: Ingestor: Stopped
2024-05-21T22:24:18.490402527Z INF CloudCost: Ingestor: Stopped
2024-05-21T22:24:18.490455027Z INF Asset: Ingestor: Stopped
2024-05-21T22:24:18.490448227Z INF CustomCost: Ingestor: Stopped
2024-05-21T22:24:18.490447627Z INF AllocationIngestor: Stopped
WARN: bun: 2024/05/21 22:24:18 query "TRUE" has [] args, but no placeholders
2024-05-21T22:24:18.548086725Z INF Done waiting
2024-05-21T22:24:18.548641121Z INF Starting *v1.Namespace controller
2024-05-21T22:24:18.548904019Z INF Starting *v1.Node controller
2024-05-21T22:24:18.549084318Z INF Starting *v1.Pod controller
2024-05-21T22:24:18.549198017Z INF Starting *v1.Deployment controller
2024-05-21T22:24:18.549222117Z INF Starting *v1.DaemonSet controller
2024-05-21T22:24:18.549286117Z INF Starting *v1.StatefulSet controller
2024-05-21T22:24:18.549333116Z INF Starting *v1.Job controller
2024-05-21T22:24:18.549333116Z INF Starting *v1.Service controller
2024-05-21T22:24:18.549378216Z INF Starting *v1.PersistentVolume controller
2024-05-21T22:24:18.549415316Z INF Starting *v1.ConfigMap controller
2024-05-21T22:24:18.549424216Z INF Starting *v1.PersistentVolumeClaim controller
2024-05-21T22:24:18.549462616Z INF Starting *v1.StorageClass controller
2024-05-21T22:24:18.549470415Z INF Starting *v1.ReplicationController controller
2024-05-21T22:24:18.549159018Z INF Starting *v1.ReplicaSet controller
2024-05-21T22:24:18.549503315Z INF Starting *v1.PodDisruptionBudget controller
2024-05-21T22:24:18.553382488Z INF No product-configs configmap found at install time, using existing configs: configmaps "product-configs" not found
2024-05-21T22:24:18.558352854Z INF No saved-report-configs configmap found at install time, using existing configs: configmaps "saved-report-configs" not found
2024-05-21T22:24:18.56322332Z INF No asset-report-configs configmap found at install time, using existing configs: configmaps "asset-report-configs" not found
2024-05-21T22:24:18.606345719Z INF Skipping derivation because there is no new data to derive
2024-05-21T22:24:18.623271401Z ERR savings: cluster sizing: failed to get cluster properties: could not get properties for any cluster: 
2024-05-21T22:24:18.623320701Z WRN got error failed to get cluster properties: could not get properties for any cluster:  for metric clusterSizing%%Development, not adding to cache
WARN: bun: 2024/05/21 22:24:18 query "TRUE" has [] args, but no placeholders
WARN: bun: 2024/05/21 22:24:18 query "TRUE" has [] args, but no placeholders
WARN: bun: 2024/05/21 22:24:18 query "TRUE" has [] args, but no placeholders
2024-05-21T22:24:18.713741071Z ERR savings: cluster sizing: failed to get cluster properties: could not get properties for any cluster: 
2024-05-21T22:24:18.713785871Z WRN got error failed to get cluster properties: could not get properties for any cluster:  for metric clusterSizing%%Production, not adding to cache
WARN: bun: 2024/05/21 22:24:18 query "TRUE" has [] args, but no placeholders
WARN: bun: 2024/05/21 22:24:18 query "TRUE" has [] args, but no placeholders
2024-05-21T22:24:18.75399779Z INF No cloud-cost-report-configs configmap found at install time, using existing configs: configmaps "cloud-cost-report-configs" not found
WARN: bun: 2024/05/21 22:24:18 query "TRUE" has [] args, but no placeholders
2024-05-21T22:24:18.802644649Z ERR savings: cluster sizing: failed to get cluster properties: could not get properties for any cluster: 
2024-05-21T22:24:18.802693149Z WRN got error failed to get cluster properties: could not get properties for any cluster:  for metric clusterSizing%%High-Availability, not adding to cache
2024-05-21T22:24:18.953107695Z INF No recurring-budget-rule-configs configmap found at install time, using existing configs: configmaps "recurring-budget-rule-configs" not found
2024-05-21T22:24:19.153687489Z INF No budget-configs configmap found at install time, using existing configs: configmaps "budget-configs" not found
2024-05-21T22:24:19.353348391Z INF No account-mapping configmap found at install time, using existing configs: configmaps "account-mapping" not found
2024-05-21T22:24:19.554261083Z INF No group-filters configmap found at install time, using existing configs: configmaps "group-filters" not found
Error: listen tcp :9004: bind: address already in use
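
The final `bind: address already in use` error suggests that something sharing the pod's network namespace already holds port 9004. A minimal sketch for narrowing this down, assuming the conflict is with another container in the same pod (this is not confirmed anywhere in this issue): list the ports each container declares. `container_ports` is a hypothetical helper name, and declared `containerPort` values are only a hint, since a process can bind ports it never declares.

```shell
# Hypothetical helper: print "<container><TAB><declared ports>" for each
# container in a pod. Declared ports are informational only, so a conflict
# may exist even if port 9004 appears once (or not at all) in this output.
container_ports() {
  ns="$1"
  pod="$2"
  kubectl -n "$ns" get pod "$pod" \
    -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.ports[*].containerPort}{"\n"}{end}'
}
```

For the pod in this report, the call would be `container_ports kubecost kubecost-cost-analyzer-5dcbc54c48-hd2s7`. Adding `--previous` to the `kubectl logs` command above shows the output of the last crashed aggregator instance rather than the current one.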

Slack discussion

No response

Troubleshooting

I have read and followed the issue guidelines and this is a bug impacting only the Helm chart.
I have searched other issues in this repository and mine is not recorded.


chipzoller · 2024-05-22T13:35:38Z

Confirmed on 2.2.5.

chipzoller · 2024-05-22T13:37:34Z

cc @jessegoodier

jessegoodier · 2024-05-22T15:25:53Z

Thanks.
Triage: readonly works with the statefulset deployMethod, but not with singlePod. I will log this, but I can't commit to timing.

kubecostAggregator:
  deployMethod: statefulset

internal issue: https://kubecost.atlassian.net/browse/BURNDOWN-434
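
The triage note above can be turned into a values override. A minimal sketch, assuming the `readonly` and `kubecostAggregator.deployMethod` keys behave as described in this thread. The file name `readonly-values.yaml` is a hypothetical choice, and switching the aggregator to the statefulset deployMethod may have other implications that this issue does not cover.

```shell
# Write a values override (hypothetical file name) that keeps readonly enabled
# while running the aggregator with the statefulset deployMethod.
cat > readonly-values.yaml <<'EOF'
readonly: true
kubecostAggregator:
  deployMethod: statefulset
EOF
```

The file can then be passed to the install command from the report by using `-f readonly-values.yaml` in place of `--set readonly=true`.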

mariojuarezc added the bug and needs-triage labels May 21, 2024

chipzoller removed the needs-triage label May 22, 2024
