Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/steps/clusterinstall/template: Get region from Boskos lease #1527

Merged
merged 1 commit into from
Dec 14, 2020

Conversation

wking
Copy link
Member

@wking wking commented Dec 11, 2020

Like openshift/release@7aa198b3c7 (openshift/release#12584), but for AWS and GCP here too, now that we have evidence that the approach is working.

I'm keeping a switch for AWS to give folks a pattern for selecting zones, if AWS breaks a zone in a particular region. We should probably distribute that (and the shared subnets, for shared-subnet tests?) via leases as well, but baby steps.

Like openshift/release@7aa198b3c7
(ci-operator/step-registry/ipi/conf/azure: Get region from Boskos
lease, 2020-10-14, openshift/release#12584), but for AWS and GCP here
too, now that we have evidence that the approach is working.

I'm keeping a switch for AWS to give folks a pattern for selecting
zones, if AWS breaks a zone in a particular region.  We should
probably distribute that (and the shared subnets, for shared-subnet
tests?) via leases as well, but baby steps.
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 11, 2020
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 11, 2020
@wking
Copy link
Member Author

wking commented Dec 11, 2020

breaking-changes is suggesting:

     - args:
+        - --gcs-upload-secret=/secrets/gcs/service-account.json
         - --image-import-pull-secret=/etc/pull-secret/.dockerconfigjson
         - --report-password-file=/etc/report/password.txt
         - --report-username=ci
@@ -107,6 +112,9 @@ presubmits:
           requests:
             cpu: 10m
         volumeMounts:
+        - mountPath: /secrets/gcs
+          name: gcs-credentials
+          readOnly: true
         - mountPath: /etc/pull-secret
           name: pull-secret
           readOnly: true

and similar things I don't understand at all. Maybe it's a flake of some sort?

/retest

@wking
Copy link
Member Author

wking commented Dec 11, 2020

breaking-changes failed again the same way. Until we figure out what's going on there, hold to keep the retest bot from smashing us into the wall over and over:

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 11, 2020
@hongkailiu
Copy link
Member

/test breaking-changes

@alvaroaleman
Copy link
Contributor

/retest

@wking
Copy link
Member Author

wking commented Dec 14, 2020

Been a few days since the last attempt; let's see if breaking-changes is happier now:

/retest

@wking
Copy link
Member Author

wking commented Dec 14, 2020

Green now.

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 14, 2020
@openshift-merge-robot openshift-merge-robot merged commit 54af113 into openshift:master Dec 14, 2020
@wking wking deleted the per-region-leases branch December 14, 2020 22:17
wking added a commit to wking/openshift-release that referenced this pull request Dec 14, 2020
Like 7aa198b (ci-operator/step-registry/ipi/conf/azure: Get region
from Boskos lease, 2020-10-14, openshift#12584), but for AWS.  This sets us up
for sharding GCP by region, if we ever need that (e.g. GCP has an
outage in one region).

I'm leaving ci-operator/templates alone; hopefully those will be gone
soon.  I've already updated ci-tools with
openshift/ci-tools@00ebab17e1 (pkg/steps/clusterinstall/template: Get
region from Boskos lease, 2020-12-11, openshift/ci-tools#1527).

I still wish the end-to-end suite pulled the region out of the cluster
itself, but until it does, lean on the Infrastructure status [1] like
we've been doing for AWS since bf0a271 (AWS e2e provider should
identify zone and master and multizone, 2019-01-05, openshift#2507).

[1]: https://github.com/openshift/api/blob/164a2fb63b5f12918c439a5a0a768aa911bcad99/config/v1/types_infrastructure.go#L327-L328
wking added a commit to wking/ci-tools that referenced this pull request Dec 14, 2020
I still wish the end-to-end suite pulled the region out of the cluster
itself, but until it does, lean on the Infrastructure status [1] like
we've been doing for AWS since openshift/release@bf0a271f5c (AWS e2e
provider should identify zone and master and multizone, 2019-01-05,
openshift/release#2507).  The hard-coding I'm replacing here was fine
until the GCP region became dynamic in 00ebab1
(pkg/steps/clusterinstall/template: Get region from Boskos lease,
2020-12-11, openshift#1527).

[1]: https://github.com/openshift/api/blob/164a2fb63b5f12918c439a5a0a768aa911bcad99/config/v1/types_infrastructure.go#L327-L328
wking added a commit to wking/ci-tools that referenced this pull request Dec 15, 2020
I'd attempted to consume this as an environment variable in 00ebab1
(pkg/steps/clusterinstall/template: Get region from Boskos lease,
2020-12-11, openshift#1527), but it's failing in CI with [1]:

  2020/12/15 18:04:36 Running pod e2e-aws-upgrade
  Installing from initial release registry.svc.ci.openshift.org/ocp/release:4.5.23
  /bin/bash: line 59: LEASED_RESOURCE: unbound variable

With this commit, I'm making it a template parameter, so it will be
filled in by Go template expansion (and never be an environment
variable within the test pod).

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-rollback-4.5-to-4.6/1338903713728696320#1:build-log.txt%3A29
wking added a commit to wking/openshift-release that referenced this pull request Dec 15, 2020
Like 7aa198b (ci-operator/step-registry/ipi/conf/azure: Get region
from Boskos lease, 2020-10-14, openshift#12584), but for AWS.

I'm keeping a switch for AWS to give folks a pattern for selecting
zones, if AWS breaks a zone in a particular region.  We should
probably distribute that (and the shared subnets, for shared-subnet
tests?) via leases as well, but baby steps.

I'm leaving ci-operator/templates alone; hopefully those will be gone
soon.  I've already updated ci-tools with
openshift/ci-tools@00ebab17e1 (pkg/steps/clusterinstall/template: Get
region from Boskos lease, 2020-12-11, openshift/ci-tools#1527).

I'm also normalizing to uppercase shell variables, now that we are no
longer constrained by Go template expansion.  Hmm, at least that's why
I thought the variables used to be lowercase, see 43e08e7
(ci-operator/templates/openshift/installer/cluster-launch-installer-upi-e2e:
Push AWS-specific default base domain down into the template,
2019-09-23, openshift#5151).  But looking at the templates when de3de20
(step-registry: add configure and install IPI steps, 2020-01-14, openshift#6708),
I'm now not sure why these step commands were using lowercase
variable names.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants