Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Helm template - issues with how integration and oncall secrets are handled. #824

Open
ksa-real opened this issue Nov 10, 2022 · 9 comments
Labels
bug Something isn't working needs triage type/docs Grafana Labs internal. Docs Squad label across all Grafana Labs repos

Comments

@ksa-real
Copy link
Contributor

ksa-real commented Nov 10, 2022

Helm chart: 1.0.10
App: 1.0.51

The ask is to provide a better way to supply Slack and Telegram-related env variables into Pod. This cause the following issues. At this moment FEATURE_SLACK_INTEGRATION_ENABLED is set regardless of oncall.slack.enabled value (i.e. either True or False, no option to skip that). Also, it is not possible to have a duplicate env variable name. Also, this variable cannot be set from Grafana UI. So if we want Slack integration in Grafana OnCall, we MUST set oncall.slack.enabled: true. Setting that also sets SLACK_SLASH_COMMAND_NAME, SLACK_CLIENT_OAUTH_ID, SLACK_CLIENT_OAUTH_SECRET, SLACK_SIGNING_SECRET, and SLACK_INSTALL_RETURN_REDIRECT_HOST. SLACK_CLIENT_OAUTH_SECRET and SLACK_SIGNING_SECRET are secrets and should not be openly present in custom values.yaml file. Also, it is not possible to override env values with .Values.env (to use valueFrom.secretKeyRef) as this causes env duplicate. This sets TELEGRAM_WEBHOOK_HOST to empty string and removes a reasonable default of {base_url}.

Possible options:

  • Skip env variable, if its value from .Values is null. This includes FEATURE_SLACK_INTEGRATION_ENABLED. This allows setting the secret variables from .Values.env in any way a user wants.
  • (Better) Allow to specify variable from oncall.slack.* values and similar. E.g. by allow specifying map:
    oncall:
      slack:
        enabled: true
        clientSecret:
          valueFrom:
            secretKeyRef:
              name: oncall-engine
              key: slack-client-id
        signingSecret: # similarly value from secret
      telegram:
        enabled: true
        token: # similarly value from secret
        # webhookUrl: leaving this null should either remove the TELEGRAM_WEBHOOK_HOST variable,
        # so that Grafana OnCall plugin use the default base_url, or default to base_url in the Helm's _env.yaml
        # similarly to SLACK_INSTALL_RETURN_REDIRECT_HOST. The latter is more consistent IMO.
    

Another issue is that oncall secret with MIRAGE_CIPHER_IV, MIRAGE_SECRET_KEY, SECRET_KEY is re-created on every upgrade. This causes unnecessary rendered chart changes. Previously it invalidated grafana/oncall-engine connection and required new invite token. I guess, this was fixed (keeping fixed IV?). The solution is to add .Values.oncall.existingSecret and create oncall secret only if oncall.existingSecret is not set.

@ksa-real
Copy link
Contributor Author

BTW for context. I was beating my head about why integration with slack stopped working after the upgrade. I dug into the oncall engine code and figured out there is a FEATURE_SLACK_INTEGRATION_ENABLED variable. Then from Helm chart I figured out it is false by default. Then I looked through the git history and checked it was added later and was backward incompatible change. Also, the integration setup docs say nothing about this variable, which added confusion. The change was introduced pretty long ago, so probably most people already figured this out, but still, there may be someone like me who won't dig into the code. It may make sense to document feature-enabling variables in the docs.

@Matvey-Kuk Matvey-Kuk added bug Something isn't working type/docs Grafana Labs internal. Docs Squad label across all Grafana Labs repos labels Nov 10, 2022
@Matvey-Kuk
Copy link
Contributor

Thank you for this bug report! @alyssawada could you please collaborate with @vadimkerr about documenting ENV variables?

@ksa-real
Copy link
Contributor Author

ksa-real commented Nov 15, 2022

Can you please take a look at the PR?

@atmaniak
Copy link

I have the same problem here

@alyssawada
Copy link
Contributor

@Matvey-Kuk - Catching up on Github notifications post-offsite. Prioritizing this and will sync with @vadimkerr!

@alyssawada alyssawada self-assigned this Nov 22, 2022
@ksa-real
Copy link
Contributor Author

ksa-real commented Jan 12, 2023

@Matvey-Kuk @alyssawada
The original PR has been closed in favor of #1016. The new PR contains some parts of the original one and more. It is open for almost a month, so I encourage you to review it. It takes some effort to keep it actual.

@ksa-real
Copy link
Contributor Author

@Matvey-Kuk @alyssawada Ping again. Almost two months passed since the PR. Please, take a look. This is a useful change people may benefit from.

@ksa-real
Copy link
Contributor Author

Also, label is incorrect, this is not just docs.

@ksa-real
Copy link
Contributor Author

@Matvey-Kuk Can you please set the correct labels? It is helm-chart, not docs. Should I create another issue for #1016 and close the current one? I don't understand how to make code owners review the PR.

@alyssawada alyssawada removed their assignment Mar 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs triage type/docs Grafana Labs internal. Docs Squad label across all Grafana Labs repos
Projects
None yet
Development

No branches or pull requests

5 participants