ci-operator/step-registry/ipi/conf/azure: Get region from Boskos lease #12584
a1e60fb to 692452c
Is |
Based on https://steps.ci.openshift.org/help/leases#static
|
But I think the lease update and the use of these might have to be in separate PRs, because the rehearsals might be using the existing Boskos setup.
|
@wking: The following tests failed, say
Do I have to plumb it through into the template variables? Or does something look at template pods and then inject these variables there? |
Ah, right. I've filed #12589 with just number -> region-name bumps. |
…gion

With some guessing about what we support. No consumers yet, but consumer rehearsals run against the production Boskos server [1], so we need to land these before we can consume them.

[1]: openshift#12584 (comment)
f2be751 to 414d098
Rebased onto master with 692452cc14 -> 414d0981af, now that #12589 has landed and updated the production Boskos server. |
/hold
Release at will
cloud-credential-operator-master-e2e-azure:
So I need to do something to get the leased name out of that lease ID... |
To help avoid errors like "we randomly assigned more Azure centralus clusters than we had capacity for and they died on quota limits". From [1]:

> A test may access the name of the resource that was acquired using
> the ${LEASED_RESOURCE} environment variable.

[1]: https://steps.ci.openshift.org/help/leases#static
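As a minimal sketch of how a step script might consume the lease described in the quoted documentation: the helper name `leased_region` and the error text are hypothetical, and the `centralus` value is a stand-in for whatever Boskos would hand out. Only the `LEASED_RESOURCE` variable itself comes from the documentation quoted above.

```shell
#!/bin/bash
# Hypothetical helper: print the leased resource name, or fail loudly
# when no lease was injected into the environment.
leased_region() {
  if [ -z "${LEASED_RESOURCE:-}" ]; then
    echo >&2 "LEASED_RESOURCE is not set; was a lease configured for this test?"
    return 1
  fi
  printf '%s\n' "${LEASED_RESOURCE}"
}

# Stand-in value for illustration; in CI the variable would already be set.
LEASED_RESOURCE="centralus"
REGION="$(leased_region)"
echo "Azure region: ${REGION}"
```

Failing early on an empty variable keeps a misconfigured lease from turning into a confusing installer error later on.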
414d098 to 7aa198b
Same, so not a "Boskos config was slow to go live" thing:
|
/uncc |
/retest |
Oops, I need #14262 to go live before these rehearsals will pass... |
*) echo >&2 "invalid Azure region index"; exit 1;;
esac
echo "Azure region: ${AZURE_REGION}"
REGION="${LEASED_RESOURCE}"
The leased resource is something like <region>--<some number>, as in https://github.com/openshift/release/pull/14262/files#diff-5169f2a74d1497f38a44e9adc57f6993269a89c3ddf90ab01f5d1d114ef61e58R210.
I think we need to transform the lease into a region here.
No need, we split at the -- in ci-operator since openshift/ci-tools#1306.
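For illustration only, the split described above can be sketched in shell. ci-operator does this on its side (per openshift/ci-tools#1306), so step scripts should not need it; the helper name `lease_to_region` is hypothetical, and splitting at the first `--` is an assumption about the lease-name shape `<region>--<some number>`.

```shell
#!/bin/bash
# Hypothetical helper: drop everything from the first "--" onward, so a
# lease name like "<region>--<number>" is reduced to "<region>".
lease_to_region() {
  printf '%s\n' "${1%%--*}"
}

lease_to_region "centralus--3"   # prints: centralus
lease_to_region "westus"         # no "--" present, prints: westus
```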
/retest |
network operator e2e-azure-ovn:
Dunno what that... instance-type-check?... failure is about, but looks like the lease process is working :). |
installer e2e-azure-shared-vpc:
Another "lease->region seems ok, but then the installer got confused, and I'm not sure why". |
|
Azure provider e2e-upgrade mostly passed, just failed some of the e2e test-cases (orthogonal to this PR). Also got a

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_release/12584/rehearse-12584-pull-ci-openshift-cluster-api-provider-azure-master-e2e-upgrade/1337098998833483776/artifacts/e2e-upgrade/gather-extra/machinesets.json | jq -r '.items[].spec.template.spec.providerSpec.value.location'
westus

So I think this is solid enough to land.

/hold cancel

Anyone comfortable dropping a :lgtm:? @alvaroaleman, you'd approved this way back?
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: alvaroaleman, wking
Maybe this is the wrong name? We're seeing [1]:

  level=info msg=Credentials loaded from file "/var/run/secrets/ci.openshift.io/cluster-profile/osServicePrincipal.json"
  level=fatal msg=failed to fetch Master Machines: failed to load asset "Install Config": [platform.azure.region: Invalid value: "eastus1": region "eastus1" is not valid or not available for this account, compute[0].platform.azure.type: Invalid value: "Standard_D4s_v3": not found in region eastus1]

but only from eastus1. And before 7aa198b (ci-operator/step-registry/ipi/conf/azure: Region from Boskos lease, 2020-10-14, openshift#12584), which landed last night, we weren't using eastus1.

[1]: openshift#12584 (comment)
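One way to catch a mistyped lease name such as eastus1 before the installer runs is to check the region against an allow-list. This is a sketch under assumptions: the helper name `is_known_azure_region` is hypothetical, and the list is a small illustrative sample of Azure region names, not the set this CI account actually supports.

```shell
#!/bin/bash
# Hypothetical helper: succeed only for region names on an illustrative
# allow-list. The real list would come from the account's available regions.
is_known_azure_region() {
  case "$1" in
    centralus|eastus|eastus2|westus) return 0 ;;
    *) return 1 ;;
  esac
}

for candidate in eastus eastus1; do
  if is_known_azure_region "${candidate}"; then
    echo "${candidate}: ok"
  else
    echo "${candidate}: not in allow-list"
  fi
done
```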
Like openshift/release@7aa198b3c7 (ci-operator/step-registry/ipi/conf/azure: Get region from Boskos lease, 2020-10-14, openshift/release#12584), but for AWS and GCP here too, now that we have evidence that the approach is working. I'm keeping a switch for AWS to give folks a pattern for selecting zones, if AWS breaks a zone in a particular region. We should probably distribute that (and the shared subnets, for shared-subnet tests?) via leases as well, but baby steps.
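The "switch for AWS" pattern mentioned above might look roughly like the following. This is a sketch, not the code from the commit: the helper name `zones_for_region`, the zone names, and the choice of which zone to skip are all hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: map a leased region to the zones a job should use.
# The per-region case exists so a broken zone can be skipped in one place.
zones_for_region() {
  case "$1" in
    us-east-1)
      # Hypothetical override: pretend zone "a" is unhealthy in this region.
      echo "us-east-1b us-east-1c"
      ;;
    *)
      # Default assumption: use the "a" and "b" zones of the region.
      echo "${1}a ${1}b"
      ;;
  esac
}

REGION="us-west-2"   # stand-in for the value taken from the lease
echo "zones: $(zones_for_region "${REGION}")"
```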
Like 7aa198b (ci-operator/step-registry/ipi/conf/azure: Get region from Boskos lease, 2020-10-14, openshift#12584), but for GCP. This sets us up for sharding GCP by region, if we ever need that (e.g. GCP has an outage in one region).

I'm leaving ci-operator/templates alone; hopefully those will be gone soon. I've already updated ci-tools with openshift/ci-tools@00ebab17e1 (pkg/steps/clusterinstall/template: Get region from Boskos lease, 2020-12-11, openshift/ci-tools#1527).

I still wish the end-to-end suite pulled the region out of the cluster itself, but until it does, lean on the Infrastructure status [1] like we've been doing for AWS since bf0a271 (AWS e2e provider should identify zone and master and multizone, 2019-01-05, openshift#2507).

[1]: https://github.com/openshift/api/blob/164a2fb63b5f12918c439a5a0a768aa911bcad99/config/v1/types_infrastructure.go#L327-L328
Like 7aa198b (ci-operator/step-registry/ipi/conf/azure: Get region from Boskos lease, 2020-10-14, openshift#12584), but for AWS.

I'm keeping a switch for AWS to give folks a pattern for selecting zones, if AWS breaks a zone in a particular region. We should probably distribute that (and the shared subnets, for shared-subnet tests?) via leases as well, but baby steps.

I'm leaving ci-operator/templates alone; hopefully those will be gone soon. I've already updated ci-tools with openshift/ci-tools@00ebab17e1 (pkg/steps/clusterinstall/template: Get region from Boskos lease, 2020-12-11, openshift/ci-tools#1527).

I'm also normalizing to uppercase shell variables, now that we are no longer constrained by Go template expansion. Hmm, at least that's why I thought the variables used to be lowercase, see 43e08e7 (ci-operator/templates/openshift/installer/cluster-launch-installer-upi-e2e: Push AWS-specific default base domain down into the template, 2019-09-23, openshift#5151). But looking at the templates when de3de20 (step-registry: add configure and install IPI steps, 2020-01-14, openshift#6708) landed, I'm now not sure why these step commands were using lowercase variable names.
To help avoid errors like "we randomly assigned more Azure centralus clusters than we had capacity for and they died on quota limits". Following up on #11752.