Drift checks are failing #101

Open
braaar opened this issue Jan 9, 2024 · 21 comments

Comments

@braaar
Member

braaar commented Jan 9, 2024

The issue so far:

  • The drift checks are failing because of missing or invalid github access tokens
  • We want to start using pulumi ESC environments

@braaar
Member Author

braaar commented Jan 9, 2024

We have set up a pulumi environment for secrets in which we can store all the access tokens we use for pulumi workflows. This way there is a single source of truth, and we won't have to update tokens all over the place all the time.

@braaar
Member Author

braaar commented Jan 9, 2024

#99 and bjerkio/bot#43 are meant to phase out usage of github tokens in pulumi config files and read them from the environment instead.

It seems that these stacks are able to read from the environment, but they are still getting 401 replies from the github API. Strangely, though, the same access token from the environment works with the github CLI: we are able to run these queries against the github API (such as gh api orgs/bjerkio/actions/secrets/BJERKBOT_GITHUB_TOKEN). Could it be that the tokens aren't being read correctly after all?
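
One way to test whether the stack and the CLI really see the same token, without printing the secret, is to log a non-sensitive summary of it on both sides and compare. A minimal sketch in plain TypeScript: summariseToken and its fields are hypothetical names, not part of pulumi or gh. The only outside facts it relies on are that classic github tokens start with ghp_ and fine-grained ones with github_pat_.

```typescript
// Hypothetical helper (not part of pulumi or the gh CLI): describe a token
// without revealing it, so two sources of the "same" token can be compared.
type TokenKind = "classic" | "fine-grained" | "unknown";

interface TokenSummary {
  length: number;
  kind: TokenKind;
  hasWhitespace: boolean; // e.g. a trailing newline picked up when the secret was stored
}

function startsWithPrefix(value: string, prefix: string): boolean {
  return value.slice(0, prefix.length) === prefix;
}

function summariseToken(token: string): TokenSummary {
  let kind: TokenKind = "unknown";
  if (startsWithPrefix(token, "ghp_")) kind = "classic";
  else if (startsWithPrefix(token, "github_pat_")) kind = "fine-grained";
  return { length: token.length, kind, hasWhitespace: /\s/.test(token) };
}
```

If the two summaries differ in length or whitespace, the token is probably being altered somewhere between the environment and the stack.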

@braaar
Member Author

braaar commented Jan 9, 2024

It appears there's no difference between using a fine-grained token or a classic token.

We should try to make pulumi throw an error if the github token is undefined so that we don't end up querying the github API without a token.

@braaar
Member Author

braaar commented Jan 9, 2024

We are using requireSecret, so it will throw if the secret is missing.
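
For reference, the fail-fast behaviour we want can be sketched in plain TypeScript. This is not pulumi's implementation of requireSecret; requireToken and ConfigValues are hypothetical stand-ins, and rejecting empty or whitespace-only values is an extra check I am assuming we would want on top of a plain "missing key" check.

```typescript
// Hypothetical stand-in for a config lookup; this is NOT the pulumi API.
type ConfigValues = Record<string, string | undefined>;

// Return the value for `key`, or throw so that we never call the
// github API with a missing or blank token.
function requireToken(config: ConfigValues, key: string): string {
  const value = config[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required configuration value '${key}'`);
  }
  return value;
}
```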

I have a new hypothesis: it could be that the pulumi stack is trying to read the access token and only sees the encrypted value. The documentation doesn't specifically mention a use case where pulumi stacks read secrets directly from pulumi ESC, but rather talks about how developers can read the secrets by using the CLI. Perhaps you are not meant to access secrets programmatically like we are trying to do.

This would explain why we are not seeing any errors from requireSecret. We are seeing a value, but it could be encrypted in such a way that the stack can't decrypt it.

@braaar
Member Author

braaar commented Jan 9, 2024

It seems that this hypothesis is false. We can read secrets from pulumi ESC.

@braaar
Member Author

braaar commented Jan 9, 2024

We discovered that getToken from get-pulumi-secret was throwing an error, but the error did not show up when running the preview action in github actions. Could it be that the preview action is somehow not behaving correctly because of this?

Side note: Perhaps the refresh option of the preview github action could be causing misbehaviour?

@braaar
Member Author

braaar commented Jan 9, 2024

refresh: true was the culprit.

Here is the evidence: pulumi refresh is run before applying the update, and it will then use expired credentials to do so.

@braaar
Member Author

braaar commented Jan 9, 2024

We are still getting an error when running github actions. This was likely caused by me getting admin permissions in pulumi cloud, which downgraded the permissions of @simenandre's access token.

I have created a new access token and overwritten the PULUMI_ACCESS_TOKEN secret in bjerkio/conf and bjerkio/bot

@braaar
Member Author

braaar commented Jan 9, 2024

Refreshing before applying changes is problematic when tokens have already expired. At that point the refresh is doomed to fail, since it runs with outdated config values. We must either avoid refreshing under such circumstances or prevent such circumstances from occurring in the first place. We have turned off refreshing on the pull request and push actions, and left it on in the drift check actions for now.

We should discuss whether we want to revert this change once everything is green again and we can (hopefully) have continuous and smooth operation (as ensured by the get-pulumi-secrets 10 day expiration warning).
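
To make the idea of the expiration warning concrete, here is a minimal sketch of such a check in plain TypeScript. It is not the actual get-pulumi-secrets code, which I have not reproduced here; daysUntilExpiry, expiryStatus and the status names are hypothetical, and only the 10 day threshold comes from the comment above.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days remaining until expiry (rounded down); negative once expired.
function daysUntilExpiry(expiresAt: Date, now: Date): number {
  return Math.floor((expiresAt.getTime() - now.getTime()) / MS_PER_DAY);
}

type ExpiryStatus = "ok" | "warn" | "expired";

// "warn" when fewer than `warnDays` whole days remain, "expired" once the
// expiry time has passed, otherwise "ok".
function expiryStatus(expiresAt: Date, now: Date, warnDays = 10): ExpiryStatus {
  if (expiresAt.getTime() <= now.getTime()) return "expired";
  return daysUntilExpiry(expiresAt, now) < warnDays ? "warn" : "ok";
}
```

A workflow could fail or post a notice on "warn", so that tokens get rotated before a refresh ever runs with expired credentials.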

@braaar
Member Author

braaar commented Jan 9, 2024

This action is failing because the github access token we have set up in pulumi ESC only has access to the org bjerkio. We must either use a classic token or have separate tokens for each org

@braaar
Member Author

braaar commented Jan 10, 2024

bjerkio/bot#45 adds a separate token for getbranches

@braaar
Member Author

braaar commented Jan 10, 2024

#99 is ready to land now. It now retrieves github tokens from the config (which come from the secrets environment) individually for each org: getbranches, flexisoftorg, bjerkio
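
The per-org lookup can be sketched roughly like this in plain TypeScript. The key naming scheme (github-token-<org>) and the function names are assumptions for illustration only; the real keys used in #99 may differ.

```typescript
type Org = "bjerkio" | "flexisoftorg" | "getbranches";

// Hypothetical key naming; the actual config keys may be different.
function tokenKeyFor(org: Org): string {
  return `github-token-${org}`;
}

// Look up the token for one org and fail loudly if it is not configured,
// instead of silently falling back to another org's token.
function tokenForOrg(config: Record<string, string | undefined>, org: Org): string {
  const key = tokenKeyFor(org);
  const token = config[key];
  if (!token) {
    throw new Error(`No github token configured for org '${org}' (expected key '${key}')`);
  }
  return token;
}
```

Keeping one token per org means a fine-grained token that only has access to a single org is never used against another one.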

@braaar
Member Author

braaar commented Jan 10, 2024

Drift check should succeed today. We'll know in two hours 😄

@braaar
Member Author

braaar commented Jan 10, 2024

Drift checks here, on bjerkio/bot and on getbranches/conf are still failing. The preview reports some changes, which I can't quite explain, since no changes show up when I run the same preview command in a terminal.

@braaar
Member Author

braaar commented Jan 10, 2024

In the case of getbranches, the problem may simply be that the pulumi access token configured in getbranches/conf is no longer valid

@braaar
Member Author

braaar commented Jan 10, 2024

bjerkio/bot#48 fixed the failing drift check in bjerkio/bot.

@braaar
Member Author

braaar commented Jan 10, 2024

#104 fixes some issues with using incorrect github tokens on bjerkio/conf

@braaar
Member Author

braaar commented Jan 10, 2024

Fumbled around with some changes on bjerkio/conf, but I'm seeing red checks across the board right now. I'm seeing 403 errors from the github API for getbranches and flexisoftorg.
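
Since we have now seen both 401 and 403 in this thread, a small sketch of how I read the two may help when triaging. The function and type names are hypothetical. The mapping reflects the usual meaning of these HTTP status codes; note that the github API can also answer 403 when a rate limit is hit, so this is a first guess rather than a diagnosis.

```typescript
type AuthDiagnosis = "bad-credentials" | "insufficient-access" | "other";

// 401: the token is missing, invalid or expired.
// 403: the token was accepted but is not allowed to do this
//      (for example a token scoped to a different org), or a rate limit was hit.
function diagnoseStatus(status: number): AuthDiagnosis {
  if (status === 401) return "bad-credentials";
  if (status === 403) return "insufficient-access";
  return "other";
}

function nextStep(diagnosis: AuthDiagnosis): string {
  switch (diagnosis) {
    case "bad-credentials":
      return "Check that the token exists, is read correctly and has not expired.";
    case "insufficient-access":
      return "Check that the token has access to this org and the required permissions.";
    default:
      return "Not an authentication problem as far as the status code tells us.";
  }
}
```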

@braaar
Member Author

braaar commented Jan 10, 2024

Perhaps the access tokens on getbranches and flexisoftorg have been overwritten with incorrect values?

@braaar
Member Author

braaar commented Jan 11, 2024

I added a suffix to the github provider names in #107, which also deletes some old dangling github providers that existed in the pulumi state

@braaar
Member Author

braaar commented Jan 11, 2024

bjerkio/bot#51 adds the option of running a refresh and pulumi up action manually through github actions. It's necessary to run this action after generating new access tokens which affect a stack, otherwise the drift check will get upset that things have changed unexpectedly. The pulumi environment is an external dependency which can cause these "unexpected" changes to the resources.
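
To illustrate why a rotated token shows up as drift: the drift check effectively asks whether a preview plans anything other than "no change". A minimal sketch in plain TypeScript, assuming a hypothetical summary object that maps each planned operation to a resource count (the exact shape of pulumi's preview output is not reproduced here):

```typescript
// Hypothetical shape: planned operation name -> number of resources.
type ChangeSummary = Record<string, number>;

// Drift means at least one resource has a planned operation other than "same".
function hasDrift(summary: ChangeSummary): boolean {
  return Object.keys(summary).some(
    (operation) => operation !== "same" && (summary[operation] ?? 0) > 0,
  );
}
```

A new token in the pulumi environment changes the inputs of any resource that depends on it, so the preview would report an update and a check like this would flag drift until the manual refresh and pulumi up action has been run.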
