
Linkerd CNI pods not aware of the OIDC signing key auto-rotation by AKS #12573

Closed
Peeyush1989 opened this issue May 8, 2024 · 8 comments · Fixed by linkerd/linkerd2-proxy-init#440
Labels
bug env/aks Microsoft AKS

Comments

@Peeyush1989

What is the issue?

We are using a private AKS cluster on version 1.26.x, with linkerd stable version 2.14.2 configured and linkerd-cni enabled.

The AKS cluster has OIDC enabled, which auto-rotates the signing keys periodically.

After the OIDC keys were auto-rotated, all new pods got stuck with the following error:

```
FailedCreatePodSandBox (x556 over ) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "3756782430d4016076288c700b871e4325ca2d5d6bdd7a422697c7d3b54d23e6": plugin type="linkerd-cni" name="linkerd-cni" failed (add): Unauthorized
```

  • We found that the issue started after an automatic RotateServiceAccountSigningKeys operation.
  • We tried reconciling the cluster by running `az aks update`, but the issue persisted.
  • We tried creating a new token for the default service account in the default namespace and creating a new pod with it, but the issue persisted.
  • We then tried running `az aks oidc-issuer rotate-signing-keys` twice, but the issue persisted.
  • Lastly, since the new pods were failing with an unauthorized linkerd error, we reasoned that the issue originated in the linkerd pods. We therefore deleted the linkerd-cni daemonset pods, which gave the new pods a fresh token and resolved the issue (see the restart sketch below).
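
A minimal sketch of that daemonset restart, assuming the default `linkerd-cni` namespace and daemonset name from a standard install-cni setup:

```
# Restart linkerd-cni so each pod re-renders its kubeconfig with a fresh
# serviceaccount token (namespace and name assume the default install).
kubectl -n linkerd-cni rollout restart daemonset/linkerd-cni
kubectl -n linkerd-cni rollout status daemonset/linkerd-cni
```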

After restarting the linkerd-cni daemonset we were able to deploy new pods, but the existing pods in the linkerd-meshed namespaces started giving invalid certificate errors and inter-pod communication was impacted.

We checked the issuer certificate and it was valid. We had to redeploy linkerd to get rid of this issue.

We need your help troubleshooting linkerd issues with OIDC.

How can it be reproduced?

We need to manually rotate the OIDC signing keys on new infrastructure to reproduce this issue.
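
A minimal sketch of that manual rotation with the Azure CLI (resource group and cluster names are placeholders):

```
# Manually rotate the OIDC issuer signing keys.
az aks oidc-issuer rotate-signing-keys \
  --resource-group myResourceGroup \
  --name myAKSCluster
```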

Logs, error output, etc

Linkerd control plane

```
[ 0.105506s] WARN ThreadId(01) watch{port=8086}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 0.306969s] WARN ThreadId(01) watch{port=8086}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 0.710647s] WARN ThreadId(01) watch{port=8086}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 1.211775s] WARN ThreadId(01) watch{port=8086}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 1.713047s] WARN ThreadId(01) watch{port=8086}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 2.215585s] WARN ThreadId(01) watch{port=8086}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 2.716391s] WARN ThreadId(01) watch{port=8086}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 3.217705s] WARN ThreadId(01) watch{port=8086}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
```

Output of `linkerd check -o short`

N/A

Environment

  • Kubernetes version: 1.26
  • Environment: AKS

Possible solution

No response

Additional context

No response

Would you like to work on fixing this bug?

yes

@Peeyush1989 Peeyush1989 added the bug label May 8, 2024
@olix0r olix0r added the env/aks Microsoft AKS label May 16, 2024

stale bot commented Aug 17, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Aug 17, 2024
@rootik

rootik commented Aug 20, 2024

We are experiencing the same issue on AKS 1.30.3 running linkerd stable-2.14.9.
The linkerd-cni tokens have started to expire every hour, so new pods are unable to start with the error mentioned above.
We've found out that Istio had the same issue and it was eventually addressed.
We've applied a temporary workaround of restarting the linkerd-cni daemonset every 50 minutes.
There's probably a better workaround, but I agree it's a bug and it has to be fixed.
It turned out that another workaround is to create a daemonset that mounts /host/etc/cni/net.d/ as a volume and then does the simplest thing, `sed -i 's/info/warn/g' /host/etc/cni/net.d/10-azure.conflist`, every 50 minutes and then sleeps.
Here's the result of the POC:

```
[2024-08-21 14:37:42] Detected change in /host/etc/cni/net.d/: MOVED_TO 10-azure.conflist
[2024-08-21 14:37:42] New file [10-azure.conflist] detected; re-installing
```
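
A minimal sketch of that workaround container command, assuming the hostPath mount described above; alternating the info/warn flip is an assumption here, so the file keeps changing on every pass:

```
# Periodically modify the Azure CNI conflist so linkerd-cni's file watcher
# re-installs its config and regenerates the kubeconfig with a fresh token.
# Assumes /etc/cni/net.d on the host is mounted at /host/etc/cni/net.d.
CONF=/host/etc/cni/net.d/10-azure.conflist
while true; do
  if grep -q 'info' "$CONF"; then
    sed -i 's/info/warn/g' "$CONF"
  else
    sed -i 's/warn/info/g' "$CONF"
  fi
  sleep 3000  # ~50 minutes, comfortably under the 1h token lifetime
done
```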

@stale stale bot removed the wontfix label Aug 20, 2024
@linkessgit

linkessgit commented Aug 23, 2024

We also have this issue on AKS clusters upgraded to Kubernetes version >=1.30 with the OIDC feature enabled. We contacted Microsoft and they confirmed that the service account token lifetime for such clusters is set to a 1-hour expiration. They also said that the 1-year token behavior is legacy, and that the 1-hour behavior is recommended and will be applied to all clusters, even without the OIDC feature enabled.
The problem, as described by @rootik and @Peeyush1989, is that the install-cni pod does not monitor service account token changes, so the new token is not reflected in the cni config, causing:

```
Unknown desc = failed to setup network for sandbox "3756782430d4016076288c700b871e4325ca2d5d6bdd7a422697c7d3b54d23e6": plugin type="linkerd-cni" name="linkerd-cni" failed (add): Unauthorized
```
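
One way to confirm the short lifetime on an affected cluster is to inspect the projected token volume in the linkerd-cni pod spec; a sketch, assuming the default namespace and the `k8s-app=linkerd-cni` pod label:

```
# Show the projected serviceaccount token source, including expirationSeconds
# (~3607 on clusters with the 1h behavior).
kubectl -n linkerd-cni get pod -l k8s-app=linkerd-cni -o yaml \
  | grep -B1 -A2 serviceAccountToken
```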

@amedinagar

Experiencing the same issue on AKS 1.30.3 running linkerd stable-2.14.9.
The install-cni pod does not monitor service account token changes, which causes the `Unknown desc ... Unauthorized` error described above.

@rootik

rootik commented Sep 5, 2024

A pull request with a possible fix was raised: linkerd/linkerd2-proxy-init#416

@oskarm93

Reproduced on linkerd edge-24.8.2 (stable-2.16) and AKS 1.30.4.

Added a daemonset with busybox running a script from this suggestion:

> It turned out that another workaround is to create a daemonset that mounts /host/etc/cni/net.d/ as a volume and then does the simplest thing, `sed -i 's/info/warn/g' /host/etc/cni/net.d/10-azure.conflist`, every 50 minutes and then sleeps. Here's the result of the POC:
>
> ```
> [2024-08-21 14:37:42] Detected change in /host/etc/cni/net.d/: MOVED_TO 10-azure.conflist
> [2024-08-21 14:37:42] New file [10-azure.conflist] detected; re-installing
> ```

https://gist.github.com/oskarm93/a6941dafc0a2af52f35af794c939f20f
https://gist.github.com/oskarm93/0f7a901f5ec5db5ae4e00e42176c98a4

```
[2024-11-13 10:22:12] Created CNI config /host/etc/cni/net.d/99-cni-fix.conflist
Setting up watches.
Watches established.
[2024-11-13 10:28:10] Detected change in /host/etc/cni/net.d/: MODIFY 99-cni-fix.conflist
[2024-11-13 10:28:10] New file [99-cni-fix.conflist] detected; re-installing
[2024-11-13 10:28:10] Using CNI config template from CNI_NETWORK_CONFIG environment variable.
      "k8s_api_root": "https://__KUBERNETES_SERVICE_HOST__:__KUBERNETES_SERVICE_PORT__",
      "k8s_api_root": "https://10.0.0.1:__KUBERNETES_SERVICE_PORT__",
[2024-11-13 10:28:10] CNI config: {
  "name": "linkerd-cni",
  "type": "linkerd-cni",
  "log_level": "info",
  "policy": {
      "type": "k8s",
      "k8s_api_root": "https://10.0.0.1:443",
      "k8s_auth_token": "__SERVICEACCOUNT_TOKEN__"
  },
  "kubernetes": {
      "kubeconfig": "/etc/cni/net.d/ZZZ-linkerd-cni-kubeconfig"
  },
  "linkerd": {
    "incoming-proxy-port": 4143,
    "outgoing-proxy-port": 4140,
    "proxy-uid": 2102,
    "ports-to-redirect": [],
    "inbound-ports-to-ignore": ["4191","4190"],
    "simulate": false,
    "use-wait-flag": false,
    "iptables-mode": "legacy",
    "ipv6": false
  }
}
[2024-11-13 10:28:10] Created CNI config /host/etc/cni/net.d/99-cni-fix.conflist
[2024-11-13 10:28:10] Detected change in /host/etc/cni/net.d/: MODIFY 99-cni-fix.conflist
[2024-11-13 10:28:10] Ignoring event: MODIFY /host/etc/cni/net.d/99-cni-fix.conflist; no real changes detected
[2024-11-13 10:28:10] Detected change in /host/etc/cni/net.d/: DELETE 99-cni-fix.conflist
[2024-11-13 10:28:10] Detected change in /host/etc/cni/net.d/: CREATE 99-cni-fix.conflist
[2024-11-13 10:28:10] Ignoring event: CREATE /host/etc/cni/net.d/99-cni-fix.conflist; no real changes detected
[2024-11-13 10:28:10] Detected change in /host/etc/cni/net.d/: MODIFY 99-cni-fix.conflist
[2024-11-13 10:28:10] Ignoring event: MODIFY /host/etc/cni/net.d/99-cni-fix.conflist; no real changes detected
```

alpeb added a commit that referenced this issue Nov 29, 2024
This change removes the `policy` entry from the cni config template,
which isn't used. That contained a `__SERVICEACCOUNT_TOKEN__`
placeholder, which was coupling this config file with the
`ZZZ-linkerd-cni-kubeconfig` file generated by linkerd-cni. An upcoming
PR will add support for detecting changes in the mounted serviceaccount
token file (see #12573), and the current change will facilitate that
effort.
alpeb added a commit to linkerd/linkerd2-proxy-init that referenced this issue Nov 29, 2024
Fixes linkerd/linkerd2#12573

## Problem

When deployed, the linkerd-cni pod gets its service account token mounted automatically by k8s:
```yaml
  - name: kube-api-access-729gv
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
```
According to this, the token is set to expire after an hour.
When the linkerd-cni pod starts, it deploys the file `ZZZ-linkerd-cni-kubeconfig` into the **host** file system.
That config contains the token sourced from `/var/run/secrets/kubernetes.io/serviceaccount` (mounted by the pod).
When the token gets rotated after an hour, the token file is updated but `ZZZ-linkerd-cni-kubeconfig` is not.
The `linkerd-cni` binary uses that token to connect to the kube-api, so an outdated token should prevent it from functioning properly, which would manifest as new pods in the data plane not being able to acquire a proper network config.
However, that failure isn't usually observed, except for the cases pointed out in linkerd/linkerd2#12573. The reason is that the token's actual lifetime is one year, due to kube-api's `--service-account-extend-token-expiration` [flag](https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/#options), which is usually set to `true` to avoid breaking instances not yet adapted to use tokens with short expirations:

> Turns on projected service account expiration extension during token generation, which helps safe transition from legacy token to bound service account token feature. If this flag is enabled, admission injected tokens would be extended up to 1 year to prevent unexpected failure during transition, ignoring value of service-account-max-token-expiration.

## Repro

### AKS

The issue currently affects AKS clusters using OIDC keys. To reproduce, create a new cluster in AKS, making sure "Enable OIDC" and "Workload Identity" are ticked in the UI.
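
Equivalently from the CLI, a sketch with placeholder names (the two feature flags are the Azure CLI equivalents of those checkboxes):

```
# Create a test cluster with the OIDC issuer and workload identity enabled.
az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-oidc-issuer \
  --enable-workload-identity \
  --generate-ssh-keys
```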

Then install the linkerd-cni plugin, labelling the linkerd-cni DaemonSet so that its ServiceAccount token is provided via OIDC:
```
linkerd install-cni --set-string "podLabels.azure\.workload\.identity/use"="true" | kubectl apply -f -
```

Then install linkerd with cni enabled, along with an injected instance of emojivoto.
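
A sketch of those two steps, assuming the standard CLI flags and the public emojivoto manifest:

```
# Install the control plane in CNI mode, then an injected emojivoto instance.
linkerd install --crds | kubectl apply -f -
linkerd install --linkerd-cni-enabled | kubectl apply -f -
curl -sSfL https://run.linkerd.io/emojivoto.yml | linkerd inject - | kubectl apply -f -
```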

The secret token is rotated after an hour, but the old one remains valid for 24h. Manually rotating the key as detailed in the [docs](https://learn.microsoft.com/en-us/azure/aks/use-oidc-issuer#rotate-the-oidc-key) should invalidate the old key.

After that, bouncing any emojivoto pod will prove unsuccessful with the following event being raised:

```
Warning  FailedCreatePodSandBox  15s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "8121291446642b272cea9ee5f083958a37bab0dd7060c4d9c06bb05fecf911d2": plugin type="linkerd-cni" name="linkerd-cni" failed (add): Unauthorized
```

## Fix

This change adds a new function `monitor_service_account_token()` that monitors the rollout of the token file, which is a symlink whose target changes as a new token is deployed. When it detects a new token file, this function calls the new `create_kubeconfig()` function.

This change also removes the existing logic around the DELETE event, which is a leftover from previous changes and is now a no-op.

Also, as detailed in linkerd/linkerd2#13407, the ServiceAccount token has been removed from the cni config template because it's not used, simplifying things: we can regenerate the kubeconfig file without having to touch the cni config file.

Finally, the file `linkerd-cni.conf.default` has been removed as it is not used.
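
A minimal sketch of that monitoring pattern (not the actual linkerd-cni implementation; the function name follows the description above). Projected token rotation swaps the `..data` symlink in the mount directory, which surfaces as directory events:

```
# Watch the serviceaccount mount; when the ..data symlink is swapped in for a
# new token, regenerate the kubeconfig. Requires inotify-tools.
TOKEN_DIR=/var/run/secrets/kubernetes.io/serviceaccount
inotifywait -m -e create,moved_to "$TOKEN_DIR" |
while read -r _dir _event file; do
  if [ "$file" = "..data" ]; then
    echo "token rotated; regenerating kubeconfig"
    create_kubeconfig  # hypothetical helper that re-renders ZZZ-linkerd-cni-kubeconfig
  fi
done
```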

## Test

Same as with the repro above, but use the cni-plugin image that contains the fix:

```
linkerd install-cni --set-string "podLabels.azure\.workload\.identity/use"="true" --set image.name="ghcr.io/alpeb/cni-plugin" --set image.version="v1.5.3" | kubectl apply -f -
```

After an hour, when the token gets rotated, you should see the event in the linkerd-cni pod logs.
alpeb added a commit to linkerd/linkerd2-proxy-init that referenced this issue Nov 29, 2024
alpeb added a commit to linkerd/linkerd2-proxy-init that referenced this issue Dec 6, 2024
alpeb added a commit that referenced this issue Dec 10, 2024
This change removes the `policy` entry from the cni config template, which isn't used. That contained a `__SERVICEACCOUNT_TOKEN__` placeholder, which was coupling this config file with the `ZZZ-linkerd-cni-kubeconfig` file generated by linkerd-cni. In linkerd/linkerd2-proxy-init#440 we add support for detecting changes in the mounted serviceaccount token file (see #12573), and the current change facilitates that effort.

Co-authored-by: Oliver Gould <ver@buoyant.io>
@alpeb
Member

alpeb commented Dec 12, 2024

Hi folks, thank you all for the continued feedback!
We finally released a new linkerd-cni version fixing this issue:
https://github.com/linkerd/linkerd2-proxy-init/releases/tag/cni-plugin%2Fv1.6.0
It's not yet the default version in the linkerd2-cni chart, but it'd be great if you could give it a try, setting `image.version: v1.6.0` in the values.yaml. Let me know how it goes! 🙂
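
For example, via the Helm chart (release and namespace names are placeholders; the value path follows the comment above):

```
# Point the linkerd2-cni chart at the fixed plugin image version.
helm upgrade --install linkerd2-cni linkerd/linkerd2-cni \
  --namespace linkerd-cni \
  --set image.version=v1.6.0
```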

@pdefreitas

pdefreitas commented Dec 16, 2024

@alpeb I've tried this with linkerd-cni chart 24.11.8 and the issue still occurs:

```
Image:          acrmecremote.azurecr.io/third-party/linkerd/cni-plugin:v1.6.0
Warning  FailedCreatePodSandBox  12s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "": plugin type="linkerd-cni" name="linkerd-cni" failed (add): Unauthorized
```

AKS 1.30.5 running Azure CNI Node Subnet.

EDIT: I had to re-image all cluster nodes; it seems the cluster got into an impaired network state. I will retest the scenario above and update this comment.

EDIT 2: Just confirmed the issue still persists when the OIDC token expires.

New pods fail to launch:

```
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "***": plugin type="linkerd-cni" name="linkerd-cni" failed (add): Unauthorized
```

The CNI pod also fails to start on the node where the pod fails to launch:

```
linkerd-cni-5lngg 0/1 ContainerCreating 0 4m27s
```

Describing the CNI pod shows the same error:

```
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "***": plugin type="linkerd-cni" name="linkerd-cni" failed (add): Unauthorized
```
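
For reference, re-imaging the nodes as mentioned in the first edit can be done per node pool with the Azure CLI; a sketch with placeholder names:

```
# Re-image a node pool to the latest node image without a Kubernetes upgrade.
az aks nodepool upgrade \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name nodepool1 \
  --node-image-only
```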
