
Tag master machines for adoption by machine controller #479

Merged: 1 commit into openshift:master on Oct 19, 2018

Conversation

@csrwng (Contributor) commented Oct 17, 2018

Add a clusterid tag to masters so they can be adopted by the machine controller in the future.
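For illustration, a minimal Terraform sketch of the kind of tagging this adds (variable names such as `cluster_id` and `extra_tags` are assumptions, not the PR's actual diff):

```hcl
# Hypothetical sketch: tag master instances so the AWS machine
# actuator can later match them to Machine objects by cluster id.
resource "aws_instance" "master" {
  count = var.instance_count

  ami           = var.ec2_ami
  instance_type = var.instance_type

  tags = merge(var.extra_tags, {
    # The tag the AWS machine actuator matches on when adopting instances.
    "clusterid" = var.cluster_id
    "Name"      = "${var.cluster_name}-master-${count.index}"
  })
}
```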

@csrwng (Contributor, Author) commented Oct 17, 2018

@abhinavdahiya ptal

@openshift-ci-robot openshift-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Oct 17, 2018
@abhinavdahiya (Contributor):

My idea of adoption was that we create Machine objects for each master in the installer, and the EC2 instances created by Terraform are identical to what the controller would have created in response to those Machine objects, thereby effecting adoption.

@wking (Member) commented Oct 17, 2018

Are there docs for the clusterid tag? It sounds a lot like our existing kubernetes.io/cluster/${var.cluster_name} (which itself is poorly documented, but see here). Also, this fix looks AWS-specific. Should whatever tag we settle on also go into the libvirt and OpenStack setups?

@csrwng (Contributor, Author) commented Oct 17, 2018

> My idea of adoption was that we create Machine objects for each master in installer

Yes, that's coming :) Just wanted to get this small change in first. The one blocking issue I have with creating the master Machine assets at install time is that I don't have a good way to assign AZs to each; my understanding is that Terraform resolves those once it is running. However, we will need to know the AZs anyway in order to support multi-AZ MachineSets for compute.

> Are there docs for the clusterid tag?

Not that I know of, but it's used by the AWS actuator to determine whether a machine already exists:
https://github.com/openshift/cluster-api-provider-aws/blob/master/pkg/cloud/aws/actuators/machine/utils.go#L114-L132
^^ @enxebre, do you know if there's an official doc for it?

> Also, this fix looks AWS-specific. Should whatever tag we settle on also go into the libvirt and OpenStack setups?

It's only needed for the AWS actuator to match an existing instance with a Machine object. For libvirt, the machine name is sufficient (https://github.com/openshift/cluster-api-provider-libvirt/blob/master/cloud/libvirt/actuators/machine/utils/domain.go#L617-L633). For OpenStack, the name, image, and flavor are used (https://github.com/openshift/cluster-api-provider-openstack/blob/master/cloud/openstack/machineactuator.go#L434-L452).

@abhinavdahiya (Contributor):

> Yes, that's coming :)

👍

@abhinavdahiya (Contributor):

e2e-aws failure:

```
Error: Error applying plan:

6 error(s) occurred:

* module.vpc.aws_eip.nat_eip[1]: 1 error(s) occurred:

* aws_eip.nat_eip.1: Error creating EIP: AddressLimitExceeded: The maximum number of addresses has been reached.
	status code: 400, request id: 68d2f8d1-521f-45b9-8a01-1f5814bd79cf
* module.vpc.aws_eip.nat_eip[5]: 1 error(s) occurred:

* aws_eip.nat_eip.5: Error creating EIP: AddressLimitExceeded: The maximum number of addresses has been reached.
	status code: 400, request id: d32497d3-b5fb-42cc-8b15-711417c27cb4
* module.vpc.aws_eip.nat_eip[0]: 1 error(s) occurred:

* aws_eip.nat_eip.0: Error creating EIP: AddressLimitExceeded: The maximum number of addresses has been reached.
	status code: 400, request id: 94ece091-7cd6-49a5-9377-fa5a4bf7c19e
* module.vpc.aws_eip.nat_eip[3]: 1 error(s) occurred:

* aws_eip.nat_eip.3: Error creating EIP: AddressLimitExceeded: The maximum number of addresses has been reached.
	status code: 400, request id: 88ed257a-abfc-4420-b40a-9b1863e73a92
* module.vpc.aws_eip.nat_eip[2]: 1 error(s) occurred:

* aws_eip.nat_eip.2: Error creating EIP: AddressLimitExceeded: The maximum number of addresses has been reached.
	status code: 400, request id: b31bc1ab-2680-4430-b3ad-5e3a58980468
* module.vpc.aws_eip.nat_eip[4]: 1 error(s) occurred:

* aws_eip.nat_eip.4: Error creating EIP: AddressLimitExceeded: The maximum number of addresses has been reached.
	status code: 400, request id: 1e76b769-fd09-4883-be84-d13e149f51a1

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure
```

Hopefully CI cleared up, fingers crossed.
/retest

@enxebre (Member) commented Oct 18, 2018

@csrwng @abhinavdahiya I don't think there are docs on that tag yet; I'll fix that.

> I don't have a good way to assign AZs to each. My understanding is that Terraform resolves those once it is running. However, we will need to know the AZs anyway in order to support multi-AZ MachineSets for compute.

My understanding is that Terraform assigns each master a subnet (https://github.com/openshift/installer/blob/master/data/data/aws/master/main.tf#L83), and each of those subnets happens to be in a different AZ (https://github.com/openshift/installer/blob/master/data/data/aws/vpc/vpc-public.tf#L36-L45), so we could just adopt machines by name.
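For illustration, a minimal Terraform sketch of that subnet-per-AZ pattern (resource and variable names are assumptions, not the installer's actual code):

```hcl
# Hypothetical sketch: one subnet per AZ, masters assigned round-robin.
data "aws_availability_zones" "available" {}

resource "aws_subnet" "master" {
  count = length(data.aws_availability_zones.available.names)

  vpc_id     = var.vpc_id
  cidr_block = cidrsubnet(var.vpc_cidr, 4, count.index)

  # One subnet per availability zone.
  availability_zone = data.aws_availability_zones.available.names[count.index]
}

resource "aws_instance" "master" {
  count = var.master_count

  ami           = var.ec2_ami
  instance_type = var.instance_type

  # element() wraps around, so masters are spread round-robin across
  # the subnets, and therefore across AZs; the AZ itself is only known
  # once Terraform resolves the subnet at apply time.
  subnet_id = element(aws_subnet.master.*.id, count.index)
}
```

Since the AZ is implied by the chosen subnet, this is consistent with the point above that Terraform only resolves the AZs while it is running.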

@csrwng (Contributor, Author) commented Oct 18, 2018

/retest

@abhinavdahiya (Contributor):

/test all

@abhinavdahiya (Contributor):

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 19, 2018
@openshift-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, csrwng

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 19, 2018
@openshift-merge-robot openshift-merge-robot merged commit 61bc533 into openshift:master Oct 19, 2018
@wking (Member) commented Dec 4, 2018

> It sounds a lot like our existing kubernetes.io/cluster/${var.cluster_name} (which itself is poorly documented, but see here).

Closing the loop on this, here is @csrwng's link from this comment pinned to a particular commit (vs. floating with master). And the difference is that kubernetes.io/cluster/{cluster-name} is per-AWS-provider via the cloud config, while the clusterid tag is per-machine via the sigs.k8s.io/cluster-api-cluster label. We started setting those labels in #468. And the clusterid tag is only used for instances, while the in-Kubernetes AWS provider only uses kubernetes.io/cluster/{cluster-name} for ELBs and route tables (kubernetes/kubernetes#41695).

The cluster-API AWS provider has also been setting kubernetes.io/cluster/ to the per-machine sigs.k8s.io/cluster-api-cluster since openshift/cluster-api-provider-aws@72b482323c (openshift/cluster-api-provider-aws#2), although I'm not clear on why it's setting both that and clusterid.

And we are setting the same sigs.k8s.io/cluster-api-cluster in each of our machine configs. That makes both approaches effectively per-cluster, and I don't see the need for two separate tags for the same purpose. I've filed openshift/cluster-api-provider-aws#119 to ask about dropping clusterid in favor of kubernetes.io/cluster/{id}. If we end up going with that, we'd be able to revert this PR.

wking added a commit to wking/openshift-installer that referenced this pull request Jan 11, 2019
Centralize extra-tag inclusion on aws/main.tf.  This reduces the
number of places we need to think about what tags should be ;).

Also keep kubernetes.io/cluster/{name} localized in the aws module.
See 8a37f72 (modules/aws/bootstrap: Pull AWS bootstrap setup into a
module, 2018-09-05, openshift#217) for why we need to keep it on the bootstrap
instance.  But the bootstrap resources will be removed after the
bootstrap-complete event comes through, and we don't want Kubernetes
controllers trying to pick them up.

This commit updates the internal Route 53 zone from KubernetesCluster
to kubernetes.io/cluster/{name}: owned, catching it up to
kubernetes/kubernetes@0b5ae539 (AWS: Support shared tag, 2017-02-18,
kubernetes/kubernetes#41695).  That tag originally landed on the zone
back in 75fb49a (platforms/aws: apply tags to internal route53 zone,
2017-05-02, coreos/tectonic-installer#465).

Only the master instances need the clusterid tag, as described in
6c7a5f0 (Tag master machines for adoption by machine controller,
2018-10-17, openshift#479).

A number of VPC resources have moved from "shared" to "owned".  The
shared values are from 45dfc2b (modules/aws,azure: use the new tag
format for k8s 1.6, 2017-05-04, coreos/tectonic-installer#469).  The
commit message doesn't have much to say for motivation, but Brad Ison
said [1]:

  I'm not really sure if anything in Kubernetes actually uses the
  owned vs. shared values at the moment, but in any case, it might
  make more sense to mark subnets as shared.  That was actually one of
  the main use cases for moving to this style of tagging -- being able
  to share subnets between clusters.

But we aren't sharing these resources; see 6f55e67 (terraform/aws:
remove option to use an existing vpc in aws, 2018-11-11, openshift#654).

[1]: coreos/tectonic-installer#469 (comment)
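For illustration, a minimal Terraform sketch of the centralized-tag approach the commit message describes (the module layout and variable names are assumptions, not the actual installer code):

```hcl
# Hypothetical sketch: compute the shared tag map once in aws/main.tf
# and pass it down, instead of repeating tag logic in every module.
locals {
  tags = merge(var.extra_tags, {
    "kubernetes.io/cluster/${var.cluster_name}" = "owned"
  })
}

module "vpc" {
  source = "./vpc"
  tags   = local.tags
}

module "masters" {
  source = "./master"

  # Only the master instances additionally carry the clusterid tag
  # that the machine controller adopts on.
  tags = merge(local.tags, { "clusterid" = var.cluster_id })
}
```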