
[kube-prometheus-stack] : Support for GKE Autopilot #4833

Open
nistiwar opened this issue Sep 4, 2024 · 3 comments
Labels
enhancement New feature or request lifecycle/stale

Comments


nistiwar commented Sep 4, 2024

Is your feature request related to a problem?

I am trying to install the kube-prometheus-stack Helm chart on a GKE Autopilot cluster with allowlisted workloads, but the installation fails. Error details:

    $ helm install kube-prometheus-stack . -n monitoring
    Error: INSTALLATION FAILED: 6 errors occurred:
    * services is forbidden: User "nishith*****.com" cannot create resource "services" in API group "" in the namespace "kube-system": GKEAutopilot authz: the namespace "kube-system" is managed and the request's verb "create" is denied
    * services is forbidden: User "nishith*****.com" cannot create resource "services" in API group "" in the namespace "kube-system": GKEAutopilot authz: the namespace "kube-system" is managed and the request's verb "create" is denied
    * services is forbidden: User "nishith*****.com" cannot create resource "services" in API group "" in the namespace "kube-system": GKEAutopilot authz: the namespace "kube-system" is managed and the request's verb "create" is denied
    * services is forbidden: User "nishith*****.com" cannot create resource "services" in API group "" in the namespace "kube-system": GKEAutopilot authz: the namespace "kube-system" is managed and the request's verb "create" is denied
    * services is forbidden: User "nishith*****.com" cannot create resource "services" in API group "" in the namespace "kube-system": GKEAutopilot authz: the namespace "kube-system" is managed and the request's verb "create" is denied
    * admission webhook "gkepolicy.common-webhooks.networking.gke.io" denied the request: GKE Policy Controller rejected the request because it violates one or more policies: {"[denied by autogke-disallow-hostnamespaces]":["enabling hostPID is not allowed in Autopilot. Requested by user: 'nishith*************.com', groups: 'system:authenticated'.","enabling hostNetwork is not allowed in Autopilot. Requested by user: 'nishith*************.com', groups: 'system:authenticated'."],"[denied by autogke-no-host-port]":["container node-exporter specifies a host port; disallowed in Autopilot. Requested by user: 'nishith*************.com', groups: 'system:authenticated'."],"[denied by gec-hostpath]":["hostPath volume proc used in container node-exporter uses path /proc in read mode which is not allowed. Allowed read path prefixes for hostPath volumes are: [/dev/hugepages /dev/infiniband /dev/vfio /dev/char /sys/devices]. Requested by user: 'nishith*************.com', groups: 'system:authenticated'.","hostPath volume sys used in container node-exporter uses path /sys in read mode which is not allowed. Allowed read path prefixes for hostPath volumes are: [/dev/hugepages /dev/infiniband /dev/vfio /dev/char /sys/devices]. Requested by user: 'nishith*************.com', groups: 'system:authenticated'.","hostPath volume root used in container node-exporter uses path / in read mode which is not allowed. Allowed read path prefixes for hostPath volumes are: [/dev/hugepages /dev/infiniband /dev/vfio /dev/char /sys/devices]. Requested by user: 'nishith*************.com', groups: 'system:authenticated'."]}

node-exporter-ds.txt
Allowlistedworkloads.txt

Describe the solution you'd like.

There should be:
1- A documented way to deploy node-exporter on a GKE Autopilot cluster.
2- A way to avoid creating services in the kube-system namespace while still being able to scrape components such as the scheduler and the kubelet.
3- A node label on the metrics collected by node-exporter, alongside the instance label.
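
For item 3, one possible approach is a relabeling on the node-exporter ServiceMonitor. This is only a minimal sketch: it assumes the bundled prometheus-node-exporter subchart exposes `prometheus.monitor.relabelings`, and the target label name `node` is an assumption about what the dashboards expect. Neither is confirmed in this issue.

```yaml
# Hypothetical values override, not taken from this thread.
prometheus-node-exporter:
  prometheus:
    monitor:
      relabelings:
        # Copy the name of the node running the exporter pod
        # into a "node" label, next to the default "instance" label.
        - sourceLabels: [__meta_kubernetes_pod_node_name]
          targetLabel: node
          action: replace
```

This only affects how scraped series are labelled. It does not change whether Autopilot admits the node-exporter DaemonSet.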

Describe alternatives you've considered.

I have tried:
1- Disabling the scheduler, DNS, kubelet, and similar components.
2- Tweaking node-exporter to comply with GKE Autopilot, i.e. removing the permissions from node-exporter that GKE Autopilot restricts.
3- Using the Allowlistedworkload CRD to escalate privileges for node-exporter.

Options 1 and 2 worked, but left us with limited metrics, and most dashboard panels no longer work. Option 3 should allow node-exporter to run with elevated privileges in GKE Autopilot, but I could not get it to work.
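
For reference, option 2 might look roughly like the values below. This is a sketch under assumptions: it presumes the prometheus-node-exporter subchart exposes `hostNetwork`, `hostPID`, and `hostRootFsMount.enabled`, and it only targets the host namespace and root filesystem denials. The `/proc` and `/sys` hostPath mounts reported in the error above are not covered here and may still be rejected.

```yaml
# Hypothetical values override, not the exact change used in this issue.
prometheus-node-exporter:
  # Autopilot denies hostNetwork and hostPID (autogke-disallow-hostnamespaces).
  hostNetwork: false
  hostPID: false
  # Skip the hostPath mount of "/" flagged by gec-hostpath.
  hostRootFsMount:
    enabled: false
```

With these restrictions node-exporter sees less of the host, which matches the limited metrics described above.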

Additional context.

No response

@nistiwar nistiwar added the enhancement New feature or request label Sep 4, 2024
@philippemnoel

Any luck? I'm facing the same issue


trejo08 commented Dec 27, 2024

> Any luck? I'm facing the same issue

@philippemnoel I got the installation to work by disabling some components, since they live in kube-system, which is protected by GKE. With this configuration you can still get usage metrics for your workloads:

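    # Components that live in the GKE-managed kube-system namespace:
    # disabled so the chart does not try to create Services there.
    kubeProxy:
      enabled: false
    kubeScheduler:
      enabled: false
    kubeControllerManager:
      enabled: false
    kubeEtcd:
      enabled: false
    coreDns:
      enabled: false
    # kube-state-metrics stays enabled.
    kubeStateMetrics:
      enabled: true
    prometheus:
      prometheusSpec:
        additionalScrapeConfigs:
          # Scrape cAdvisor metrics from every node for workload usage.
          - job_name: 'kubernetes-pods-cadvisor'
            scheme: https
            metrics_path: /metrics/cadvisor
            kubernetes_sd_configs:
              - role: node
            # Authenticate with the Prometheus pod's service account token.
            bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
            tls_config:
              # Skips TLS certificate verification for the scrape target.
              insecure_skip_verify: true
            relabel_configs:
              # Copy Kubernetes node labels onto the scraped series.
              - action: labelmap
                regex: __meta_kubernetes_node_label_(.+)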
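
The values above do not mention node-exporter, which is the workload the admission webhook rejected in the original report. If it is still denied on your cluster, one fallback is to turn it off entirely. This is a sketch that assumes the chart's top-level `nodeExporter.enabled` switch; it is not part of the configuration posted above.

```yaml
# Hypothetical addition to the values above.
nodeExporter:
  # Do not deploy the node-exporter DaemonSet at all.
  enabled: false
```

Host-level panels in the bundled dashboards will stay empty with this setting. The cAdvisor job above still provides per-container usage.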


stale bot commented Feb 1, 2025

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Feb 1, 2025