Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ssh): enable OS Login for GCP test instances #5602

Merged
merged 21 commits into from
Nov 16, 2022
Merged
Changes from 20 commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
dbea5c5
feat(ssh): enable OS Login for GCP test instances
gustavovalverde Nov 9, 2022
d8e8184
fix(ssh): force service account impersonation for OS Login
gustavovalverde Nov 9, 2022
f77bedb
Merge branch 'main' into use-fixed-ssh-key
gustavovalverde Nov 9, 2022
c0037b4
debug: show actual user trying to impersonate SA
gustavovalverde Nov 9, 2022
059346e
fix(glcloud): configure gcloud before running commands
gustavovalverde Nov 9, 2022
3ab4279
Merge branch 'main' into use-fixed-ssh-key
gustavovalverde Nov 10, 2022
6c70e10
fix(ssh): add VM zone to ssh command
gustavovalverde Nov 10, 2022
fcaa824
fix(auth): bringing changes from #5614
gustavovalverde Nov 10, 2022
89df8af
fix(auth): impersonation is working as expected now
gustavovalverde Nov 10, 2022
3cac65d
Merge branch 'main' into use-fixed-ssh-key
gustavovalverde Nov 10, 2022
1950514
fix(gcloud): setup the GCP CLI after authenticating (#5606)
gustavovalverde Nov 10, 2022
e318a20
fix(auth): revert to latest version
gustavovalverde Nov 10, 2022
6405fd1
fix: wrong replace
gustavovalverde Nov 10, 2022
1125f7c
fix(ci): use a specific debian image for VM containers
gustavovalverde Nov 10, 2022
552bcaf
fix(ssh): delete generated SSH keys by CI after 30 seconds
gustavovalverde Nov 10, 2022
d3a9e2d
Merge branch 'main' into use-fixed-ssh-key
gustavovalverde Nov 10, 2022
78cec48
debug: remove debug commands
gustavovalverde Nov 10, 2022
5206283
fix(compute): use a lightweight container image
gustavovalverde Nov 10, 2022
d12328a
fix(ci): add missing sudo to docker command
gustavovalverde Nov 10, 2022
eb73226
Update .github/workflows/deploy-gcp-tests.yml
gustavovalverde Nov 12, 2022
ef541ee
fix(ssh): delete ssh-keys for the specific GHA service account
gustavovalverde Nov 15, 2022
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
102 changes: 49 additions & 53 deletions .github/workflows/deploy-gcp-tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -145,11 +145,11 @@ jobs:
--boot-disk-size 300GB \
--boot-disk-type pd-ssd \
--create-disk name="${{ inputs.test_id }}-${{ env.GITHUB_SHA_SHORT }}",device-name="${{ inputs.test_id }}-${{ env.GITHUB_SHA_SHORT }}",size=300GB,type=pd-ssd \
--container-image debian-11 \
--container-image gcr.io/google-containers/busybox \
--container-restart-policy=never \
--machine-type ${{ env.MACHINE_TYPE }} \
--scopes cloud-platform \
--metadata=google-monitoring-enabled=true,google-logging-enabled=true \
--metadata=google-monitoring-enabled=TRUE,google-logging-enabled=TRUE,enable-oslogin=TRUE \
--metadata-from-file=startup-script=.github/workflows/scripts/gcp-vm-startup-script.sh \
--tags ${{ inputs.app_name }} \
--zone ${{ env.ZONE }}
Expand All @@ -160,10 +160,9 @@ jobs:
# SSH into the just created VM, and create a docker volume with the newly created disk.
- name: Create ${{ inputs.test_id }} Docker volume
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -211,10 +210,9 @@ jobs:
# Launch the test without any cached state
- name: Launch ${{ inputs.test_id }} test
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -366,11 +364,11 @@ jobs:
--boot-disk-size 300GB \
--boot-disk-type pd-ssd \
--create-disk image=${{ env.CACHED_DISK_NAME }},name="${{ inputs.test_id }}-${{ env.GITHUB_SHA_SHORT }}",device-name="${{ inputs.test_id }}-${{ env.GITHUB_SHA_SHORT }}",size=300GB,type=pd-ssd \
--container-image debian-11 \
--container-image gcr.io/google-containers/busybox \
--container-restart-policy=never \
--machine-type ${{ env.MACHINE_TYPE }} \
--scopes cloud-platform \
--metadata=google-monitoring-enabled=true,google-logging-enabled=true \
--metadata=google-monitoring-enabled=TRUE,google-logging-enabled=TRUE,enable-oslogin=TRUE \
--metadata-from-file=startup-script=.github/workflows/scripts/gcp-vm-startup-script.sh \
--tags ${{ inputs.app_name }} \
--zone ${{ env.ZONE }}
Expand All @@ -383,10 +381,9 @@ jobs:
# but the cached state can be smaller if we just increased the disk size.)
- name: Create ${{ inputs.test_id }} Docker volume
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -452,10 +449,9 @@ jobs:
# TODO: we should find a better logic for this use cases
if: ${{ (inputs.needs_zebra_state && !inputs.needs_lwd_state) && inputs.test_id != 'lwd-full-sync' }}
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -502,10 +498,9 @@ jobs:
# TODO: we should find a better logic for this use cases
if: ${{ (inputs.needs_zebra_state && inputs.needs_lwd_state) || inputs.test_id == 'lwd-full-sync' }}
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -568,10 +563,9 @@ jobs:
# Errors in the tests are caught by the final test status job.
- name: Show logs for ${{ inputs.test_id }} test (sprout)
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -627,10 +621,9 @@ jobs:
# Show recent logs, following until Canopy activation (or the test finishes)
- name: Show logs for ${{ inputs.test_id }} test (heartwood)
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -683,10 +676,9 @@ jobs:
# Show recent logs, following until NU5 activation (or the test finishes)
- name: Show logs for ${{ inputs.test_id }} test (canopy)
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -741,10 +733,9 @@ jobs:
# Show recent logs, following until block 1,740,000 (or the test finishes)
- name: Show logs for ${{ inputs.test_id }} test (1740k)
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -801,10 +792,9 @@ jobs:
# Show recent logs, following until block 1,760,000 (or the test finishes)
- name: Show logs for ${{ inputs.test_id }} test (1760k)
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -861,10 +851,9 @@ jobs:
# Show recent logs, following until block 1,780,000 (or the test finishes)
- name: Show logs for ${{ inputs.test_id }} test (1780k)
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -922,10 +911,9 @@ jobs:
# Show recent logs, following until block 1,800,000 (or the test finishes)
- name: Show logs for ${{ inputs.test_id }} test (1800k)
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -982,10 +970,9 @@ jobs:
# Show recent logs, following until block 1,820,000 (or the test finishes)
- name: Show logs for ${{ inputs.test_id }} test (1820k)
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -1041,10 +1028,9 @@ jobs:
# TODO: when doing obtain/extend tips, log the verifier in use, and check for full verification here
- name: Show logs for ${{ inputs.test_id }} test (checkpoint)
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -1111,10 +1097,9 @@ jobs:
# (`docker wait` can also wait for multiple containers, but we only ever wait for a single container.)
- name: Result of ${{ inputs.test_id }} test
run: |
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
Expand Down Expand Up @@ -1237,15 +1222,14 @@ jobs:
SYNC_HEIGHT=""

DOCKER_LOGS=$( \
gcloud compute ssh \
github-service-account@${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
gcloud compute ssh ${{ inputs.test_id }}-${{ env.GITHUB_REF_SLUG_URL }}-${{ env.GITHUB_SHA_SHORT }} \
--ssh-key-expire-after=30s \
--zone ${{ env.ZONE }} \
--quiet \
--ssh-flag="-o ServerAliveInterval=5" \
--ssh-flag="-o ConnectionAttempts=20" \
--ssh-flag="-o ConnectTimeout=5" \
--command=" \
docker logs ${{ inputs.test_id }} --tail 200 \
sudo docker logs ${{ inputs.test_id }} --tail 200 \
")

SYNC_HEIGHT=$( \
Expand Down Expand Up @@ -1376,3 +1360,15 @@ jobs:
else
gcloud compute instances delete "${INSTANCE}" --zone "${{ env.ZONE }}" --delete-disks all --quiet
fi

# Deletes SSH keys generated during this workflow run, as GCP has a limit of SSH keys
# that can exist at the same time in the OS Login metadata. Not deleting this keys
# could cause the following error:
# `Login profile size exceeds 32 KiB. Delete profile values to make additional space`
- name: Delete temporal SSH keys
continue-on-error: true
run: |
for i in $(gcloud compute os-login ssh-keys list --format="table[no-heading](value.fingerprint)"); do
echo "$i";
gcloud compute os-login ssh-keys remove --key "$i" || true;
done