ci-operator: Fresh AWS shared subnets for us-east-2, etc. #6949

Merged


@wking (Member) commented Jan 31, 2020

ci-operator/templates/openshift/installer: Replacement shared subnets for new regions

I'd created the previous subnets in 7e38260 (#6845), but forgot to add them to the reaper whitelist. So the reaper just removed my old subnets. Recreating them here:

$ export AWS_PROFILE=ci  # or whatever you call it locally
$ git fetch origin
$ date --iso=m --utc
2020-01-30T17:09+0000
$ git checkout origin/release-4.3
$ git --no-pager log --oneline -1
2055609f9 (HEAD, origin/release-4.3) Merge pull request #2928 from ashcrow/4.3-signed-rhcos-bump

Clear out the old stacks:

for REGION in us-east-2 us-west-1 us-west-2
do
  COUNT=3
  if test us-west-1 = "${REGION}"
  then
    COUNT=2
  fi
  for INDEX in 1 2 3 4
  do
    NAME="do-not-delete-shared-vpc-${INDEX}"
    aws --region "${REGION}" cloudformation delete-stack --stack-name "${NAME}"
    aws --region "${REGION}" cloudformation wait stack-delete-complete --stack-name "${NAME}"
  done
done

I had to lean in manually and delete some instances in us-west-2's do-not-delete-shared-vpc-4 to unstick it. Then create the new subnets:

for REGION in us-east-2 us-west-1 us-west-2
do
  COUNT=3
  if test us-west-1 = "${REGION}"
  then
    COUNT=2
  fi
  for INDEX in 1 2 3 4
  do
    NAME="do-not-delete-shared-vpc-${INDEX}"
    aws --region "${REGION}" cloudformation create-stack --stack-name "${NAME}" --template-body "$(cat upi/aws/cloudformation/01_vpc.yaml)" --parameters "ParameterKey=AvailabilityZoneCount,ParameterValue=${COUNT}" >/dev/null
    aws --region "${REGION}" cloudformation wait stack-create-complete --stack-name "${NAME}"
    SUBNETS="$(aws --region "${REGION}" cloudformation describe-stacks --stack-name "${NAME}" | jq -c '[.Stacks[].Outputs[] | select(.OutputKey | endswith("SubnetIds")).OutputValue | split(",")[]]' | sed "s/\"/'/g")"
    echo "${REGION}_$((INDEX - 1))) subnets=\"${SUBNETS}\";;"
  done
done
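
A minimal sketch of what the jq filter and sed step in the loop do, run against made-up stack output rather than a real `describe-stacks` call. The `SAMPLE` document, its subnet IDs, and the `subnet_list` helper name are illustrative assumptions, not part of the PR:

```shell
# Made-up describe-stacks output; real stacks return many more fields.
SAMPLE='{"Stacks":[{"Outputs":[
  {"OutputKey":"VpcId","OutputValue":"vpc-0example"},
  {"OutputKey":"PublicSubnetIds","OutputValue":"subnet-aaa,subnet-bbb"},
  {"OutputKey":"PrivateSubnetIds","OutputValue":"subnet-ccc,subnet-ddd"}]}]}'

# Hypothetical helper: same jq filter and sed substitution as the loop,
# but reading the JSON from stdin instead of calling the AWS CLI.
subnet_list() {
  jq -c '[.Stacks[].Outputs[] | select(.OutputKey | endswith("SubnetIds")).OutputValue | split(",")[]]' |
    sed "s/\"/'/g"
}

printf '%s\n' "${SAMPLE}" | subnet_list
# prints: ['subnet-aaa','subnet-bbb','subnet-ccc','subnet-ddd']
```

The filter keeps only outputs whose key ends in `SubnetIds`, splits each comma-separated value, and flattens everything into one array; sed then swaps double quotes for single quotes.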

7e38260 had a us-east-1 typo in the commit message, fixed here. I actually used us-east-2 in that commit as well, and just fumbled the copy into the old commit message. Creation spit out:

us-east-2_0) subnets="['subnet-0a568760cd74bf1d7','subnet-0320ee5b3bb78863e','subnet-015658a21d26e55b7','subnet-0c3ce64c4066f37c7','subnet-0d57b6b056e1ee8f6','subnet-0b118b86d1517483a']";;
...
us-west-2_3) subnets="['subnet-072d00dcf02ad90a6','subnet-0ad913e4bd6ff53fa','subnet-09f90e069238e4105','subnet-064ecb1b01098ff35','subnet-068d9cdd93c0c66e6','subnet-0b7d1a5a6ae1d9adf']";;
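
The trailing `)` and `;;` make each printed line a ready-made `case` arm, presumably so the output can be pasted into a `case` statement in the CI templates. A minimal sketch of how such arms might be consumed, assuming a hypothetical `pick_subnets` helper and placeholder subnet IDs (not the real ones printed above):

```shell
# Hypothetical helper: choose a subnet list for a region and a 0-based VPC index.
# The subnet IDs are placeholders for illustration only.
pick_subnets() {
  case "${1}_${2}" in
  us-east-2_0) subnets="['subnet-aaa','subnet-bbb']";;
  us-east-2_1) subnets="['subnet-ccc','subnet-ddd']";;
  us-west-1_0) subnets="['subnet-eee','subnet-fff']";;
  *) echo "no shared subnets for ${1}_${2}" >&2; return 1;;
  esac
  echo "${subnets}"
}

pick_subnets us-east-2 1
# prints: ['subnet-ccc','subnet-ddd']
```

The catch-all arm is an addition for the sketch, so that an unknown region or index fails loudly instead of leaving `subnets` unset.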

@openshift-ci-robot added the size/M (denotes a PR that changes 30-99 lines, ignoring generated files) and approved (indicates a PR has been approved by an approver from all required OWNERS files) labels Jan 31, 2020
ci-operator/templates/openshift/installer: Replacement shared subnets for new regions

I'd created the previous subnets in
7e38260 (ci-operator/templates/openshift/installer: Shared
subnets for new regions, 2020-01-23, openshift#6845), but forgot to add
them to the reaper whitelist [1].  So the reaper just removed my
old subnets.  Recreating them here:

  $ export AWS_PROFILE=ci  # or whatever you call it locally
  $ git fetch origin
  $ date --iso=m --utc
  2020-01-30T17:09+0000
  $ git checkout origin/release-4.3
  $ git --no-pager log --oneline -1
  2055609f9 (HEAD, origin/release-4.3) Merge pull request openshift#2928 from ashcrow/4.3-signed-rhcos-bump

Clear out the old stacks:

  for REGION in us-east-2 us-west-1 us-west-2
  do
    COUNT=3
    if test us-west-1 = "${REGION}"
    then
      COUNT=2
    fi
    for INDEX in 1 2 3 4
    do
      NAME="do-not-delete-shared-vpc-${INDEX}"
      aws --region "${REGION}" cloudformation delete-stack --stack-name "${NAME}"
      aws --region "${REGION}" cloudformation wait stack-delete-complete --stack-name "${NAME}"
    done
  done

I had to lean in manually and delete some instances in us-west-2's
do-not-delete-shared-vpc-4 to unstick it.  Then create the new
subnets:

  for REGION in us-east-2 us-west-1 us-west-2
  do
    COUNT=3
    if test us-west-1 = "${REGION}"
    then
      COUNT=2
    fi
    for INDEX in 1 2 3 4
    do
      NAME="do-not-delete-shared-vpc-${INDEX}"
      aws --region "${REGION}" cloudformation create-stack --stack-name "${NAME}" --template-body "$(cat upi/aws/cloudformation/01_vpc.yaml)" --parameters "ParameterKey=AvailabilityZoneCount,ParameterValue=${COUNT}" >/dev/null
      aws --region "${REGION}" cloudformation wait stack-create-complete --stack-name "${NAME}"
      SUBNETS="$(aws --region "${REGION}" cloudformation describe-stacks --stack-name "${NAME}" | jq -c '[.Stacks[].Outputs[] | select(.OutputKey | endswith("SubnetIds")).OutputValue | split(",")[]]' | sed "s/\"/'/g")"
      echo "${REGION}_$((INDEX - 1))) subnets=\"${SUBNETS}\";;"
    done
  done

7e38260 had a us-east-1 typo in the commit message, fixed
here.  I actually used us-east-2 in that commit as well, and just
fumbled the copy into the old commit message.  Creation spit out:

  us-east-2_0) subnets="['subnet-0a568760cd74bf1d7','subnet-0320ee5b3bb78863e','subnet-015658a21d26e55b7','subnet-0c3ce64c4066f37c7','subnet-0d57b6b056e1ee8f6','subnet-0b118b86d1517483a']";;
  ...
  us-west-2_3) subnets="['subnet-072d00dcf02ad90a6','subnet-0ad913e4bd6ff53fa','subnet-09f90e069238e4105','subnet-064ecb1b01098ff35','subnet-068d9cdd93c0c66e6','subnet-0b7d1a5a6ae1d9adf']";;

To generate the reaper whitelist [1], I used:

  for REGION in us-east-1 us-east-2 us-west-1 us-west-2
  do
    for INDEX in 1 2 3 4
    do
      NAME="do-not-delete-shared-vpc-${INDEX}"
      aws --region "${REGION}" resourcegroupstaggingapi get-resources --tag-filters "Key=aws:cloudformation:stack-name,Values=${NAME}" --query 'ResourceTagMappingList[].ResourceARN' | jq -r ".[] | split(\":\")[-1] | \"                '\" + . + \"',  # CI exclusion per DPP-4108, ${REGION} ${NAME}\""
    done
  done | sort

followed by some whitespace shuffling to get the comments aligned.
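
A minimal sketch of one way to skip the manual alignment step, assuming a hypothetical `whitelist_entry` helper that pads each entry to a fixed width with `printf`. The width of 45, the placeholder account ID, and the example ARNs are arbitrary choices for illustration, and the leading indentation from the loop above is left out:

```shell
# Hypothetical helper: format one keep-list entry so the comment starts at a fixed column.
# $1: resource ARN, $2: region, $3: stack name.
whitelist_entry() {
  resource="${1##*:}"  # everything after the last colon, like jq's split(":")[-1]
  printf "%-45s # CI exclusion per DPP-4108, %s %s\n" "'${resource}'," "$2" "$3"
}

# Placeholder ARNs; real ones come from resourcegroupstaggingapi get-resources.
whitelist_entry arn:aws:ec2:us-east-2:123456789012:subnet/subnet-0example us-east-2 do-not-delete-shared-vpc-1
whitelist_entry arn:aws:ec2:us-east-2:123456789012:vpc/vpc-0example us-east-2 do-not-delete-shared-vpc-1
```

Because every entry is padded to the same width, the `#` comments line up as long as no entry is longer than the chosen width.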

[1]: openshift/li#3634
     Private repository, sorry external folks.
@openshift-ci-robot (Contributor)

@wking: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/rehearse/openshift/cloud-credential-operator/master/e2e-gcp | 1b21187 | link | /test pj-rehearse |
| ci/rehearse/openshift/cluster-api-actuator-pkg/master/e2e-azure-operator | 1b21187 | link | /test pj-rehearse |
| ci/rehearse/openshift/cloud-credential-operator/master/e2e-azure | 1b21187 | link | /test pj-rehearse |
| ci/prow/pj-rehearse | 1b21187 | link | /test pj-rehearse |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) Jan 31, 2020
@openshift-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: stevekuznetsov, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot merged commit d9e847b into openshift:master Jan 31, 2020
@openshift-ci-robot (Contributor)

@wking: Updated the following 7 configmaps:

  • prow-job-cluster-launch-installer-src configmap in namespace ci at cluster ci/api-build01-ci-devcluster-openshift-com:6443 using the following files:
    • key cluster-launch-installer-src.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-src.yaml
  • step-registry configmap in namespace ci at cluster default using the following files:
    • key ipi-conf-commands.sh using file ci-operator/step-registry/ipi/conf/ipi-conf-commands.sh
  • prow-job-cluster-launch-installer-e2e configmap in namespace ci at cluster ci/api-build01-ci-devcluster-openshift-com:6443 using the following files:
    • key cluster-launch-installer-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-e2e.yaml
  • prow-job-cluster-launch-installer-e2e configmap in namespace ci at cluster default using the following files:
    • key cluster-launch-installer-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-e2e.yaml
  • prow-job-cluster-launch-installer-e2e configmap in namespace ci-stg at cluster default using the following files:
    • key cluster-launch-installer-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-e2e.yaml
  • prow-job-cluster-launch-installer-src configmap in namespace ci at cluster default using the following files:
    • key cluster-launch-installer-src.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-src.yaml
  • prow-job-cluster-launch-installer-src configmap in namespace ci-stg at cluster default using the following files:
    • key cluster-launch-installer-src.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-src.yaml


@wking deleted the restore-aws-shared-subnets branch February 16, 2020 16:53
wking added a commit to wking/openshift-release that referenced this pull request Sep 17, 2020
…_yaml: Add EC2 endpoint

The machine-API currently ignores the proxy configuration, although
a future machine-API might grow support for it [1].  That means CI jobs
in the blackhole VPC die on i/o timeouts trying to reach
https://ec2.${region}.amazonaws.com/ while provisioning compute
machines, and the install subsequently dies because we fail to
schedule monitoring, ingress, and other compute-hosted workloads [2].
This commit adds a VPC endpoint to allow EC2 access from inside the
cluster [3].  It's similar to the existing S3 VPC endpoint, but:

* It's an interface type, while S3 needs the older gateway type.  This
  avoids:

    Endpoint type (Gateway) does not match available service types
    ([Interface]). (Service: AmazonEC2; Status Code: 400; Error Code:
    InvalidParameter; Request ID: ...; Proxy: null)

  while creating the stack.

* There are no RouteTableIds, because the interface type does not
  support them.  This avoids:

    Route table IDs are only supported for Gateway type VPC
    Endpoint. (Service: AmazonEC2; Status Code: 400; Error Code:
    InvalidParameter; Request ID: ...; Proxy: null)

  while creating the stack.

* I've created a new security group allowing HTTPS connections to the
  endpoint, because SecurityGroupIds is required for interface
  endpoints [3].  I've also placed the network interfaces in the
  public subnets, because SubnetIds is required for interface
  endpoints [3].

* I've set PrivateDnsEnabled [3] so the machine-API operator doesn't
  have to do anything special to get DNS routing it towards the
  endpoint interfaces.
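
The commit makes this change in the CloudFormation template. As a rough illustration of the same properties, here is a sketch that prints (rather than runs) an approximately equivalent AWS CLI call. The `ec2_endpoint_command` helper and all IDs are hypothetical placeholders:

```shell
# Hypothetical helper: print an "aws ec2 create-vpc-endpoint" call for an EC2 interface endpoint.
# $1: region, $2: VPC ID, $3: space-separated subnet IDs, $4: security group ID.
# The security group is assumed to allow inbound HTTPS (port 443).
ec2_endpoint_command() {
  region="$1"
  vpc="$2"
  subnets="$3"
  group="$4"
  # Interface endpoints take subnets and security groups, but no route tables.
  # ${subnets} is intentionally unquoted so each subnet ID becomes its own argument.
  echo aws --region "${region}" ec2 create-vpc-endpoint \
    --vpc-id "${vpc}" \
    --service-name "com.amazonaws.${region}.ec2" \
    --vpc-endpoint-type Interface \
    --subnet-ids ${subnets} \
    --security-group-ids "${group}" \
    --private-dns-enabled
}

ec2_endpoint_command us-east-2 vpc-0example "subnet-aaa subnet-bbb" sg-0example
```

Removing the leading `echo` would make the helper create the endpoint instead of printing the command.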

Rolled out to the CI account following 9b39dd2 (Creating private
subnets without direct external internet access and updating proxy e2e
to use this instead, 2020-07-20, openshift#10355):

  for REGION in us-east-1 us-east-2 us-west-1 us-west-2
  do
    COUNT=3
    if test us-west-1 = "${REGION}"
    then
      COUNT=2
    fi
    for INDEX in 1
    do
      NAME="do-not-delete-shared-vpc-blackhole-${INDEX}"
      aws --region "${REGION}" cloudformation update-stack --stack-name "${NAME}" --template-body "$(cat ci-operator/step-registry/ipi/conf/aws/blackholenetwork/blackhole_vpc_yaml.md)" --parameters "ParameterKey=AvailabilityZoneCount,ParameterValue=${COUNT}" >/dev/null
      aws --region "${REGION}" cloudformation wait stack-update-complete --stack-name "${NAME}"
      SUBNETS="$(aws --region "${REGION}" cloudformation describe-stacks --stack-name "${NAME}" | jq -c '[.Stacks[].Outputs[] | select(.OutputKey | endswith("SubnetIds")).OutputValue | split(",")[]]' | sed "s/\"/'/g")"
      echo "${REGION}_$((INDEX - 1))) subnets=\"${SUBNETS}\";;"
    done
  done

We could also have deleted the previous stacks, used 'create-stack'
instead of 'update-stack', and used 'stack-create-complete' instead of
'stack-update-complete'.
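
The pairing between each CloudFormation action and its waiter can be sketched as a small lookup. The `stack_waiter` helper is a hypothetical name introduced for illustration:

```shell
# Hypothetical helper: map a CloudFormation action to the matching "wait" subcommand.
stack_waiter() {
  case "$1" in
  create-stack|update-stack|delete-stack) echo "stack-${1%-stack}-complete";;
  *) echo "unsupported action: $1" >&2; return 1;;
  esac
}

stack_waiter update-stack
# prints: stack-update-complete
```

A loop like the one above could then call `aws cloudformation wait "$(stack_waiter "${ACTION}")"` with `ACTION` set to whichever action was used, where `ACTION` is likewise an assumed variable name.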

Unsurprisingly, since we were not updating the subnets themselves, the
output has not changed:

  us-east-1_0) subnets="['subnet-0a7491aa76f9b88d7','subnet-0f0b2dcccdcbc7c1d','subnet-0680badf68cbf198c','subnet-02b25dd65f806e41b','subnet-010235a3bff34cf6f','subnet-085c78d8c562b5a51']";;
  us-east-2_0) subnets="['subnet-0ea117d9499ef624f','subnet-00adc83d4719d4176','subnet-0b9399990fa424d7f','subnet-060d997b25f5bb922','subnet-015f4e65b0ef1b0e1','subnet-02296b47817923bfb']";;
  us-west-1_0) subnets="['subnet-0d003f08a541855a2','subnet-04007c47f50891b1d','subnet-02cdb70a3a4beb754','subnet-0d813eca318034290']";;
  us-west-2_0) subnets="['subnet-05d8f8ae35e720611','subnet-0f3f254b13d40e352','subnet-0e23da17ea081d614','subnet-0f380906f83c55df7','subnet-0a2c5167d94c1a5f8','subnet-01375df3b11699b77']";;

so no need to update ipi-conf-aws-blackholenetwork-commands.sh.

I generated the reaper keep-list following 1b21187 (ci-operator:
Fresh AWS shared subnets for us-east-2, etc., 2020-01-30, openshift#6949):

  for REGION in us-east-1 us-east-2 us-west-1 us-west-2
  do
    for INDEX in 1
    do
      NAME="do-not-delete-shared-vpc-blackhole-${INDEX}"
      aws --region "${REGION}" resourcegroupstaggingapi get-resources --tag-filters "Key=aws:cloudformation:stack-name,Values=${NAME}" --query 'ResourceTagMappingList[].ResourceARN[]' | jq -r ".[] | . + \"  # CI exclusion per DPP-5789, ${REGION} ${NAME}\""
    done
  done | sort

and passed that along to the Developer Productivity Platform (DPP)
folks so they can update their reaper config.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1769223
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1875773
[3]: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ec2-vpcendpoint.html