Crashes read/write/backend using Azure blob with AAD workload identity. #9952

Closed
monaka opened this issue Jul 17, 2023 · 6 comments · Fixed by #13195

monaka commented Jul 17, 2023

Describe the bug

I tried to use Azure Blob Storage with AAD workload identity.
The read, write, and backend components all crash on startup.

To Reproduce

  1. Deploy Loki with an ArgoCD Application using these parameters. (It should also reproduce without ArgoCD.)
project: default
source:
  repoURL: 'https://grafana.github.io/helm-charts'
  targetRevision: 5.8.9
  helm:
    parameters:
      - name: backend.persistence.enableStatefulSetAutoDeletePVC
        value: 'false'
      - name: loki.podLabels.azure\.workload\.identity/use
        value: 'true'
        forceString: true
      - name: loki.storage.type
        value: azure
      - name: loki.storage.azure.accountName
        value: {{snip}}
      - name: loki.storage.azure.useFederatedToken
        value: 'true'
      - name: minio.enabled
        value: 'false'
      - name: monitoring.selfMonitoring.grafanaAgent.installOperator
        value: 'false'
      - name: read.persistence.enableStatefulSetAutoDeletePVC
        value: 'false'
  chart: loki
destination:
  server: 'https://kubernetes.default.svc'
  namespace: loki
syncPolicy:
  automated:
    prune: true
    selfHeal: true
  syncOptions:
    - CreateNamespace=true
  2. Watch the pods start.
  3. The components end up in CrashLoopBackOff.

Expected behavior

All components start up and reach the Running state.

Environment:

  • Infrastructure: Kubernetes
  • Deployment tool: helm

Screenshots, Promtail config, or terminal output

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1782a2d]

goroutine 1 [running]:
github.com/Azure/go-autorest/autorest/adal.(*ServicePrincipalToken).SetCustomRefreshFunc(...)
        /src/loki/vendor/github.com/Azure/go-autorest/autorest/adal/token.go:411
github.com/grafana/loki/pkg/storage/chunk/client/azure.(*BlobStorage).getServicePrincipalToken(0xc0006a8800, {0x25c8db8?, 0x25c8dc0?})
        /src/loki/pkg/storage/chunk/client/azure/blob_storage_client.go:414 +0x36d
github.com/grafana/loki/pkg/storage/chunk/client/azure.(*BlobStorage).getOAuthToken(0xc0006a8800)
        /src/loki/pkg/storage/chunk/client/azure/blob_storage_client.go:359 +0x105
github.com/grafana/loki/pkg/storage/chunk/client/azure.(*BlobStorage).newPipeline(0xc0006a8800, {0xee6b280, 0x3, 0x14}, 0x0)
        /src/loki/pkg/storage/chunk/client/azure/blob_storage_client.go:343 +0x25b
github.com/grafana/loki/pkg/storage/chunk/client/azure.NewBlobStorage(0xc00027ad20, {0xc0004bc348?, {0x29fe7e0?, 0xc0004b76e0?}}, {0x0?, 0x0?, 0x0?})
        /src/loki/pkg/storage/chunk/client/azure/blob_storage_client.go:197 +0x14c
github.com/grafana/loki/pkg/storage.NewObjectClient({_, _}, {{{0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}}, ...}, ...)
        /src/loki/pkg/storage/factory.go:515 +0x985
github.com/grafana/loki/pkg/storage.NewChunkClient({_, _}, {{{0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}}, ...}, ...)
        /src/loki/pkg/storage/factory.go:340 +0x6b4
github.com/grafana/loki/pkg/storage.(*store).chunkClientForPeriod(0xc000314600, {{0x17e466f3400}, {0xc000a30cd0, 0xe}, {0xc000a30ca8, 0x5}, {0xc000a30cc0, 0x3}, {{0xc000a30c80, 0xb}, ...}, ...})
        /src/loki/pkg/storage/store.go:185 +0x27c
github.com/grafana/loki/pkg/storage.(*store).init(0xc000314600)
        /src/loki/pkg/storage/store.go:155 +0xf8
github.com/grafana/loki/pkg/storage.NewStore({{{0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}}, {{{0x0}, 0x4000000000000000, ...}, ...}, ...}, ...)
        /src/loki/pkg/storage/store.go:147 +0xa3b
github.com/grafana/loki/pkg/loki.(*Loki).initStore(0xc000948000)
        /src/loki/pkg/loki/modules.go:655 +0x598
github.com/grafana/dskit/modules.(*Manager).initModule(0xc0004b9080, {0x7ffeef0c550b, 0x5}, 0x1?, 0xc000635c20?)
        /src/loki/vendor/github.com/grafana/dskit/modules/modules.go:120 +0x20a
github.com/grafana/dskit/modules.(*Manager).InitModuleServices(0x856d54?, {0xc0008e4670, 0x1, 0xc0008e4940?})
        /src/loki/vendor/github.com/grafana/dskit/modules/modules.go:92 +0xf8
github.com/grafana/loki/pkg/loki.(*Loki).Run(0xc000948000, {0xc0008e8980?})
        /src/loki/pkg/loki/loki.go:457 +0x56
main.main()
        /src/loki/cmd/loki/main.go:110 +0xe65
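
The top frames show the panic inside `SetCustomRefreshFunc`, called from `getServicePrincipalToken`. That is consistent with the method being invoked on a nil `*ServicePrincipalToken`, for example when token creation failed and the error was not checked first. This is an assumption about the cause, not a confirmed reading of the Loki source. A minimal sketch of that pattern and of a guard that returns a readable error instead follows; `token`, `newFederatedToken`, `unguarded`, and `guarded` are hypothetical stand-ins, not Loki or go-autorest APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// token is a hypothetical stand-in for a service principal token type.
type token struct {
	refresh func() error
}

// setRefresh writes through its receiver, so calling it on a nil *token panics.
func (t *token) setRefresh(f func() error) { t.refresh = f }

// newFederatedToken is a hypothetical constructor that fails on an empty client ID.
func newFederatedToken(clientID string) (*token, error) {
	if clientID == "" {
		return nil, errors.New("client ID is empty")
	}
	return &token{}, nil
}

// unguarded uses the token before looking at the error (the suspected pattern).
func unguarded(clientID string) (*token, error) {
	t, err := newFederatedToken(clientID)
	t.setRefresh(func() error { return nil }) // nil pointer dereference when t == nil
	return t, err
}

// guarded checks the error first and wraps it with context.
func guarded(clientID string) (*token, error) {
	t, err := newFederatedToken(clientID)
	if err != nil {
		return nil, fmt.Errorf("creating federated token: %w", err)
	}
	t.setRefresh(func() error { return nil })
	return t, nil
}

// panics reports whether f panicked.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	fmt.Println("unguarded panics:", panics(func() { _, _ = unguarded("") }))
	_, err := guarded("")
	fmt.Println("guarded error:", err)
}
```

With a guard like this, a misconfigured identity would surface as a startup error naming the failing step rather than as a SIGSEGV.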

monaka commented Jul 17, 2023

Additional info:

A cert-manager deployment using AAD workload identity runs in the same AKS cluster.
It works without trouble, so I believe the base setup is correct.


mikbonda commented Aug 8, 2023

I have no issues writing to an Azure storage account with AAD Workload Identity using the configuration below. Are the annotations configured on the serviceAccount? I would also double-check that the federated credentials and role assignments are properly configured on Azure.

loki:
  serviceAccount:
    annotations:
      azure.workload.identity/tenant-id: "<tenant-id-for-azure-account>"
      azure.workload.identity/client-id:  "<client-id-for-managed-identity>"
  loki:
    storage:
      type: azure
      azure:
        accountName: <storage-account-name>
        accountKey: null
        useManagedIdentity: false
        useFederatedToken: true


KuDuG commented Aug 10, 2023

I have the same issue with one of my subscriptions.
As soon as I enable useFederatedToken, this issue appears.
I have two subscriptions: a private one and a company one.
My private setup is far less complex than the company one. When I configure this in my private subscription, it works correctly.

For some reason, in the company subscription I get a runtime error.
I use workload identity with other workloads without any issue, so I believe my configuration is correct. Still, I have to admit the issue is related to the configuration: if I enable useFederatedToken without configuring it properly, the same error also appears in my private subscription, where it works once it is configured properly.
However, a nil pointer dereference with a segmentation violation is clearly an application-side issue, and it is hard to find the configuration problem without a proper error message, so it would be nice if someone could have a look at this.

Environment: Kubernetes 1.26.6, Loki app version 2.8.2.
I believe you can reproduce this by enabling useFederatedToken with the azure storage type, without doing any of the other configuration this feature requires.
Thank you in advance for your help.

`panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1782a2d]

goroutine 1 [running]:
github.com/Azure/go-autorest/autorest/adal.(*ServicePrincipalToken).SetCustomRefreshFunc(...)
/src/loki/vendor/github.com/Azure/go-autorest/autorest/adal/token.go:411
github.com/grafana/loki/pkg/storage/chunk/client/azure.(*BlobStorage).getServicePrincipalToken(0xc0000c2700, {0x25c8db8?, 0x25c8dc0?})
/src/loki/pkg/storage/chunk/client/azure/blob_storage_client.go:414 +0x36d
github.com/grafana/loki/pkg/storage/chunk/client/azure.(*BlobStorage).getOAuthToken(0xc0000c2700)
/src/loki/pkg/storage/chunk/client/azure/blob_storage_client.go:359 +0x105
github.com/grafana/loki/pkg/storage/chunk/client/azure.(*BlobStorage).newPipeline(0xc0000c2700, {0xee6b280, 0x3, 0x14}, 0x0)
/src/loki/pkg/storage/chunk/client/azure/blob_storage_client.go:343 +0x25b
github.com/grafana/loki/pkg/storage/chunk/client/azure.NewBlobStorage(0xc0009a60f0, {0xc00011a6f0?, {0x29fe7e0?, 0xc0003d9f80?}}, {0x34630b8a000?, 0x8bb2c97000?, 0xdf8475800?})
/src/loki/pkg/storage/chunk/client/azure/blob_storage_client.go:197 +0x14c
github.com/grafana/loki/pkg/storage.NewObjectClient({_, _}, {{{0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}}, ...}, ...)
/src/loki/pkg/storage/factory.go:515 +0x985
github.com/grafana/loki/pkg/loki.(*Loki).initUsageReport(0xc0005d6800)
/src/loki/pkg/loki/modules.go:1182 +0x247
github.com/grafana/dskit/modules.(*Manager).initModule(0xc000117458, {0x7ffde5172c40, 0x7}, 0x1?, 0xc000567ce0?)
/src/loki/vendor/github.com/grafana/dskit/modules/modules.go:120 +0x20a
github.com/grafana/dskit/modules.(*Manager).InitModuleServices(0x856d54?, {0xc00074e080, 0x1, 0xc00074f8e0?})
/src/loki/vendor/github.com/grafana/dskit/modules/modules.go:92 +0xf8
github.com/grafana/loki/pkg/loki.(*Loki).Run(0xc0005d6800, {0xc00065e740?})
/src/loki/pkg/loki/loki.go:457 +0x56
main.main()
/src/loki/cmd/loki/main.go:110 +0xe65`


c3JpbmkK commented Aug 28, 2023

I noticed the same issue today after updating to the latest version of the loki-distributed chart, because we wanted to use workload identity instead of pod identity. In our case it was quickly solved: I noticed that AZURE_CLIENT_ID on the pod was empty (the other injected environment variables were valid) and fixed it by making sure the service account had the right client-id annotation. After that there were no issues, because the permissions were already configured for the managed identity, so it was an easy switch from pod identity to workload identity.
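
The check described above can be sketched as a small preflight program run inside the pod. Only `AZURE_CLIENT_ID` is mentioned in this thread; the other three names are the variables the Azure Workload Identity webhook is documented to inject, and `requiredVars` and `missingVars` are hypothetical helpers written for this illustration, not part of Loki.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// requiredVars lists the environment variables the workload identity
// webhook is expected to inject into a labelled pod (assumption: all four
// are needed for federated token auth).
var requiredVars = []string{
	"AZURE_CLIENT_ID",
	"AZURE_TENANT_ID",
	"AZURE_FEDERATED_TOKEN_FILE",
	"AZURE_AUTHORITY_HOST",
}

// missingVars returns the names whose value is unset, empty, or whitespace.
// The lookup function is injected so the check can be tested without
// touching the real environment.
func missingVars(lookup func(string) string) []string {
	var missing []string
	for _, name := range requiredVars {
		if strings.TrimSpace(lookup(name)) == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	if missing := missingVars(os.Getenv); len(missing) > 0 {
		fmt.Fprintf(os.Stderr,
			"workload identity is not fully configured; empty or unset: %s\n",
			strings.Join(missing, ", "))
		os.Exit(1)
	}
	fmt.Println("all workload identity variables are set")
}
```

An empty `AZURE_CLIENT_ID` usually points back at a missing or wrong `azure.workload.identity/client-id` annotation on the service account, as in the case above.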

@RenePinnow

I have the same problem. @monaka did you find any solution?


kamikaze commented Aug 7, 2024

same here
