Existing AccessPolicyTokens gets deleted and stuck on provider-grafana pod restart #178

Open
DMarby opened this issue Sep 6, 2024 · 2 comments

DMarby commented Sep 6, 2024

When the provider-grafana pod is restarted, the provider deletes the existing tokens of any AccessPolicyToken resources that are present, and then gets stuck attempting to recreate them.

Running Crossplane version 1.16.0, and provider-grafana 1.8.0.

Provider logs:

2024/09/06 12:08:09 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:08:09 [DEBUG] DELETE https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:08:10 [DEBUG] POST https://grafana.com/api/v1/tokens?region=
2024/09/06 12:08:11 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:08:13 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:08:17 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:08:25 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:08:42 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:09:14 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:10:14 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:11:14 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
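
The notable line in this log is the POST, which carries an empty `region` query parameter while every other request has `region=us`. As a minimal sketch for spotting this in longer provider logs, the snippet below flags requests whose `region` parameter is missing or empty. The function name `requests_missing_region` and the regular expression are illustrative assumptions, not part of provider-grafana; the sample lines are copied from the log above.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Matches the "[DEBUG] <METHOD> <URL>" shape seen in the provider logs above.
LOG_LINE = re.compile(r"\[DEBUG\] (?P<method>[A-Z]+) (?P<url>\S+)")


def requests_missing_region(log_lines):
    """Return (method, url) pairs whose `region` query parameter is absent or empty."""
    flagged = []
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # not a request line
        # keep_blank_values=True so that "region=" is reported as an empty value.
        query = parse_qs(urlsplit(match["url"]).query, keep_blank_values=True)
        region = query.get("region", [""])[0]
        if not region:
            flagged.append((match["method"], match["url"]))
    return flagged


logs = [
    "2024/09/06 12:08:09 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us",
    "2024/09/06 12:08:09 [DEBUG] DELETE https://grafana.com/api/v1/tokens/<token-id>?region=us",
    "2024/09/06 12:08:10 [DEBUG] POST https://grafana.com/api/v1/tokens?region=",
]
print(requests_missing_region(logs))
```

On the three sample lines, only the POST is flagged, which lines up with the `Field is required: region` error in the events below.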

Events from the AccessPolicyToken resource:

Events:
  Type     Reason                        Age    From                                                                  Message
  ----     ------                        ----   ----                                                                  -------
  Warning  CannotUpdateExternalResource  6m54s  managed/cloud.grafana.crossplane.io/v1alpha1, kind=accesspolicytoken  failed to update the resource: [{0 409 Conflict 409 Conflict
{
  "code": "InvalidArgument",
  "message": "Field is required: region",
  "requestId": "9024b1f2-e154-4d74-a647-3a21182e8219"
} []}]
  Warning  CannotObserveExternalResource  49s (x11 over 6m53s)  managed/cloud.grafana.crossplane.io/v1alpha1, kind=accesspolicytoken  failed to observe the resource: [{0 error reading policy token with ID`us:d37c1ccd-8f60-4ae6-883f-33971463637a`: 404 Not Found  []}]
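
The ID in the observe error, `us:d37c1ccd-8f60-4ae6-883f-33971463637a`, looks like a region and a token ID joined by a colon, which matches the `/tokens/<token-id>?region=us` shape of the GET requests in the log. The sketch below only illustrates that reading. It is an assumption about the ID format, not the provider's actual code, and `split_token_id` and `token_url` are hypothetical helper names.

```python
def split_token_id(resource_id):
    """Split a 'region:token-id' string; region is '' when there is no prefix."""
    region, sep, token_id = resource_id.partition(":")
    if not sep:
        # No colon found: treat the whole string as the token ID.
        return "", resource_id
    return region, token_id


def token_url(resource_id, base="https://grafana.com/api/v1/tokens"):
    """Build the request URL shape seen in the logs from a composite ID."""
    region, token_id = split_token_id(resource_id)
    return f"{base}/{token_id}?region={region}"


print(token_url("us:d37c1ccd-8f60-4ae6-883f-33971463637a"))
```

Under this assumption, an ID without the region prefix would produce a URL ending in `region=`, the same shape as the failing POST.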

Example existing AccessPolicyToken object (as part of a composition):

      name: grafana-api-key
      base:
        apiVersion: cloud.grafana.crossplane.io/v1alpha1
        kind: AccessPolicyToken
        spec:
          forProvider:
            region: us
            name: foo
            displayName: foo
            accessPolicyId: foo
            providerConfigRef:
              name: default
          writeConnectionSecretToRef:
            name: foo
            namespace: crossplane-system

This seems to have been introduced by #135.

Duologic (Member) commented Nov 5, 2024

We're seeing this internally too; see grafana/terraform-provider-grafana#1886 for an attempt to fix it.

fe-ax commented Jan 22, 2025

With the latest version I'm still seeing this (note the empty region in the second log line):

2025/01/22 19:17:42 [DEBUG] DELETE https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:17:42 [DEBUG] POST https://grafana.com/api/v1/tokens?region=
2025/01/22 19:17:44 [DEBUG] GET https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:17:49 [DEBUG] GET https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:17:57 [DEBUG] GET https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:18:14 [DEBUG] GET https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:18:14 [DEBUG] POST https://grafana.com/api/v1/tokens?region=prod-eu-west-2
2025/01/22 19:18:14 [DEBUG] GET https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:18:15 [DEBUG] GET https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:18:47 [DEBUG] GET https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:18:47 [DEBUG] POST https://grafana.com/api/v1/tokens?region=prod-eu-west-2
2025/01/22 19:18:48 [DEBUG] GET https://grafana.com/api/v1/tokens/bf879GUID5997?region=prod-eu-west-2
2025/01/22 19:18:48 [DEBUG] GET https://grafana.com/api/v1/tokens/bf879GUID5997?region=prod-eu-west-2
2025/01/22 19:18:49 [DEBUG] GET https://grafana.com/api/v1/tokens/bf879GUID5997?region=prod-eu-west-2
2025/01/22 19:20:43 [DEBUG] GET https://grafana.com/api/v1/tokens/bf879GUID5997?region=prod-eu-west-2
2025/01/22 19:20:44 [DEBUG] GET https://grafana.com/api/v1/tokens/bf879GUID5997?region=prod-eu-west-2

apiVersion: cloud.grafana.crossplane.io/v1alpha1
kind: AccessPolicyToken
metadata:
  labels:
    testing.upbound.io/example-name: test
  name: prometheus-access-policy-token
spec:
  providerConfigRef:
    name: grafana-cloud-provider
  forProvider:
    accessPolicySelector:
      matchLabels:
        test.it/grafana-access-policy: prometheus
    displayName: Prometheus Access Policy Token
    name: prometheus-access-policy-token
    region: prod-eu-west-2
  writeConnectionSecretToRef:
    name: prometheus-access-policy-token
    namespace: grafana-cloud
---
apiVersion: cloud.grafana.crossplane.io/v1alpha1
kind: AccessPolicy
metadata:
  labels:
    test.it/grafana-access-policy: prometheus
  name: prometheus-access-policy
spec:
  providerConfigRef:
    name: grafana-cloud-provider
  forProvider:
    displayName: Prometheus Access Policy
    name: prometheus-access-policy
    realm:
    - identifier: "0000000" # Changed for github post
      type: stack
    region: prod-eu-west-2
    scopes:
      - logs:write

The Loki pods give:

ts=2025-01-22T18:49:04.582609283Z level=error msg="final error sending batch" component_path=/ component_id=loki.write.hostedlogs component=client host=logs-prod-012.grafana.net status=401 tenant="" error="server returned HTTP status 401 Unauthorized (401): {"status":"error","error":"authentication error: legacy auth cannot be upgraded because the host is not found"}"
