Skip to content

Commit

Permalink
ci-operator/step-registry/ipi/conf/aws: Drop us-east-1 zone overrides
Browse files Browse the repository at this point in the history
This was originally part of avoiding broken zones, see e8921c3
(ci-operator/templates/openshift: Get e2e-aws out of us-east-1b,
2019-03-22, openshift#3204) and b717933
(ci-operator/templates/openshift/installer/cluster-launch-installer-*:
Random AWS regions for IPI, 2020-01-23, openshift#6833).  But the installer has
had broken-zone avoidence since way back in
openshift/installer@71aef620b6 (pkg/asset/machines/aws: Only return
available zones, 2019-02-07, openshift/installer#1210) I dunno how
reliably AWS sets 'state: impaired' and similar; it didn't seem to
protect us from e8921c3.  But we're getting ready to pivot to using
multiple AWS accounts, which creates two issues with hard-coding
region names in the step:

1. References by name are not stable between accounts.  From the AWS
   docs [1]:

     To ensure that resources are distributed across the Availability
     Zones for a Region, we independently map Availability Zones to
     names for each AWS account. For example, the Availability Zone
     us-east-1a for your AWS account might not be the same location as
     us-east-1a for another AWS account.

   So "aah, us-east-1a is broken, let's use b and c instead" might
   apply to one account but not the other.  And the installer does not
   currently accept zone IDs.

2. References by name may not exist in other accounts.  From the AWS
   docs [1]:

     As Availability Zones grow over time, our ability to expand them
     can become constrained. If this happens, we might restrict you
     from launching an instance in a constrained Availability Zone
     unless you already have an instance in that Availability
     Zone. Eventually, we might also remove the constrained
     Availability Zone from the list of Availability Zones for new
     accounts. Therefore, your account might have a different number
     of available Availability Zones in a Region than another account.

   And it turns out that for some reason they sometimes don't name
   sequentially, e.g. our new account lacks us-west-1a:

     $ AWS_PROFILE=ci aws --region us-west-1 ec2 describe-availability-zones | jq -r '.AvailabilityZones[] | .ZoneName + " " + .ZoneId + " " + .State' | sort
     us-west-1a usw1-az3 available
     us-west-1b usw1-az1 available
     $ AWS_PROFILE=ci-2 aws --region us-west-1 ec2 describe-availability-zones | jq -r '.AvailabilityZones[] | .ZoneName + " " + .ZoneId + " " + .State' | sort
     us-west-1b usw1-az3 available
     us-west-1c usw1-az1 available

   I have no idea why they decided to do that, but we have to work
   with the world as it is ;).

Removing the us-east-1 overrides helps reduce our exposure, although
we are still vulnerable to (2) with the a/b default line.  We'll do
something about that in follow-up work.

Leaving the "which zones?" decision up to the installer would cause it
to try to set up each available zone, and that causes more API
contention and resource consumption than we want.  Background on that
in 51c4a37 (ci-operator/templates/openshift: Explicitly set AWS
availability zones, 2019-03-28, openshift#3285) and d87fffb
(ci-operator/templates/openshift: Drop us-east-1c, 2019-04-26, openshift#3615),
as well as the rejected/rotted-out [2].

[1]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html
[2]: openshift/installer#1487
  • Loading branch information
wking committed Apr 8, 2021
1 parent 871bf9a commit 05c825a
Showing 1 changed file with 1 addition and 7 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -11,13 +11,7 @@ expiration_date=$(date -d '4 hours' --iso=minutes --utc)
function join_by { local IFS="$1"; shift; echo "$*"; }

REGION="${LEASED_RESOURCE}"
case "${REGION}" in
us-east-1)
ZONES=("us-east-1b" "us-east-1c")
;;
*)
ZONES=("${REGION}a" "${REGION}b")
esac
ZONES=("${REGION}a" "${REGION}b")

ZONES_COUNT=${ZONES_COUNT:-2}
ZONES=("${ZONES[@]:0:${ZONES_COUNT}}")
Expand Down

0 comments on commit 05c825a

Please sign in to comment.