-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(ssh): enable OS Login for GCP test instances #5602
Conversation
Sounds interesting, hope it works!
How will we test that this works before merging?
Can you document the infrastructure changes somewhere? |
Yes, running this PR multiple times is only thing I can think we have available at our disposal right now.
I'll be translating the manual changes to a terraform config |
Seems like the authentication issue I've been having is related to google-github-actions/setup-gcloud#586 Which was fixed 12 hours ago. |
Previous behavior: `gcloud` commands have been running without an appropiate authentication as the `auth` auction was sucessfully executed, but the actual gcloud CLI being used in further jobs was not using the correct configuration nor credentials Expected behavior: All `gcloud` commands should be properly configured and authenticated. Solution: Add the `google-github-actions/setup-gcloud` action after each `google-github-actions/auth` invocation, and before running any `gcloud` command. Remove the need of an OAuth Access token when not required by following steps
That's really unfortunate, it makes it tricky to test and merge. How can we make sure that SSH is more reliable with this change? Did you want to delay this change to the start of next week, so you and @dconnolly can fix any issues quickly? |
I started testing this with os-login enabled project wide. Attempt #1: ✅ Failed, but wasn't an SSH error (https://github.com/ZcashFoundation/zebra/actions/runs/3440425995/jobs/5754296381#step:6:65) Attempt #1.5: ✅ Success |
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
I removed several SSH keys with the following command for i in $(gcloud compute os-login ssh-keys list --format="table[no-heading](value.fingerprint)") --impersonate-service-account=github-service-account@zealous-zebra.iam.gserviceaccount.com; do
echo "$i";
gcloud compute os-login ssh-keys remove --key "$i" --impersonate-service-account=github-service-account@zealous-zebra.iam.gserviceaccount.com || true;
done And re-running. I'll update this command later once this run finishes. I might also increase |
Once added, should this PR move out of draft and be priortized to merge? We're seeing the 32KB limit being hit on these jobs in multiple places as the switch got flipped on the gcloud side |
Now that you say this, maybe those other jobs are the ones generating all the SSH-key "garbage" in GCP, even though I applied the policy on GCP side. I think we can merge this, delete all keys and update all branches. Worst case scenario is reverting this, but it's hard to test with other jobs running and colliding with this. I'll move it out of draft and let you approve if you think this is a sane approach. |
This was my first thought when seeing it happen in the one other PR that has the |
@Mergifyio refresh |
✅ Pull request refreshed |
Motivation
We've been dealing with several different issues related to SSH connections. So far one of the solutions was to have a fixed SSH-key to connect from GitHub Actions, but this also reduces flexibility and maintainability, and goes against our security-wise decision of not having long-lived artifacts (JSON tokens, SSH keys) without expiration, which could be used in any point in time.
This PR implements OS Login as workaround (and possible root cause solution)
Note: Not all changes to make this work are being done in GitHub Actions, as some tasks are tied to changes in the infrastructure
Fixes: #5494
Fixes: #5069
Specifications
https://cloud.google.com/compute/docs/oslogin#benefits_of_os_login
Designs
Solution
busybox
instead of failingdebian-11
container image ci: compute engine VMs have not been using the expected OS (debian-11) #5618docker
commandReview
Anyone from DevOps after running this PR successfully ~5+ times
Reviewer Checklist
Follow Up Work
Configure OS-login project-wide after this PR has been merged.