Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide --cluster-name flag to the OpenStack external CCM #1619

Merged
merged 1 commit into from
Nov 15, 2021

Conversation

xmudrii
Copy link
Member

@xmudrii xmudrii commented Nov 12, 2021

What this PR does / why we need it:

The OpenStack external CCM takes the --cluster-name flag which is used as a prefix for cloud resources. Currently, we're not providing this flag, so the CCM uses the default value which is kubernetes. This can cause conflicts if there are multiple clusters in the same tenant, further causing some resources (e.g. Load Balancers) to be reused across clusters.

For example, let's look at Octavia Load Balancers. If you create a Service named nginx in the default namespace, the Octavia Load Balancer would be named as kube_service_kubernetes_default_nginx.

If you create the same service in a different cluster running in the same tenant, OpenStack will just update the Load Balancer created earlier, instead of creating a new LB or failing. The Load Balancer would now point to the new cluster until the Service in the first cluster is not reconciled again (reconciliations happen automatically periodically). Once the Service in the first/old cluster is reconciled, the Load Balancer would now point to the first/old cluster until the Service in the new cluster is not reconciled again, and so on.

This is can be quite hard to debug and detect. The solution is to provide a unique value using the --cluster-name flag. In this case, the cloud resources (e.g. Load Balancer) would be named properly.

However, you can't easily change the --cluster-name flag in a cluster that's already running. Doing that would cause OpenStack CCM to lose track of all Load Balancers, so all LBs would be recreated. That can be quite problematic and cause users' applications to go down. Therefore we need a backwards compatible approach.

The cluster name is determined in the following way:

  • if the cluster is not provisioned, or if the cluster is not an OpenStack cluster, return the KubeOne cluster name
  • if it's an existing OpenStack cluster:
    • if cluster is running in-tree cloud provider: return the KubeOne cluster name because the in-tree provider already has the --cluster-name flag set
    • if cluster is running external cloud provider: check if there is --cluster-name flag on the OpenStack CCM. If there is, read the value and return it, otherwise don't set the OpenStack cluster name, in which case it defaults to kubernetes
  • if cluster is migrated to external CCM, return the KubeOne cluster name

If the operator wants to set the --cluster-name for clusters that don't have it set, they can change the OpenStack CCM DaemonSet to set the flag manually. It's recommended to set the value to the cluster name provided in the KubeOneCluster manifest, but it's not required. When running kubeone apply, KubeOne will read the value provided to the DaemonSet and will use it when reconciling the CCM deployment. Note that doing this will cause OpenStack CCM to effectively recreate all LBs. The old LBs need to be deleted manually.

Fixes #1613

Does this PR introduce a user-facing change?:

Provide --cluster-name flag to the OpenStack external CCM (read PR description for more details)

/assign @kron4eg
cc: @qeqar

@xmudrii xmudrii requested a review from kron4eg November 12, 2021 22:16
@kubermatic-bot kubermatic-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Denotes that all commits in the pull request have the valid DCO signoff message. labels Nov 12, 2021
@kubermatic-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: xmudrii

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubermatic-bot kubermatic-bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 12, 2021
@xmudrii
Copy link
Member Author

xmudrii commented Nov 12, 2021

/cherry-pick release/v1.3

@kubermatic-bot
Copy link
Contributor

@xmudrii: once the present PR merges, I will cherry-pick it on top of release/v1.3 in a new PR and assign it to you.

In response to this:

/cherry-pick release/v1.3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

pkg/tasks/probes.go Outdated Show resolved Hide resolved
pkg/addons/applier.go Outdated Show resolved Hide resolved
@xmudrii
Copy link
Member Author

xmudrii commented Nov 15, 2021

/hold
this needs to be properly tested

@kubermatic-bot kubermatic-bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 15, 2021
@kubermatic-bot kubermatic-bot added the lgtm Indicates that a PR is ready to be merged. label Nov 15, 2021
@kubermatic-bot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 33f2780609a175dd4244954259db8b609001549b

@kubermatic-bot kubermatic-bot removed the lgtm Indicates that a PR is ready to be merged. label Nov 15, 2021
@kubermatic-bot kubermatic-bot added the lgtm Indicates that a PR is ready to be merged. label Nov 15, 2021
@kubermatic-bot
Copy link
Contributor

LGTM label has been added.

Git tree hash: b157518036f9112f4ab105a1ebc704abda7198f7

Signed-off-by: Marko Mudrinić <mudrinic.mare@gmail.com>
@kubermatic-bot kubermatic-bot removed the lgtm Indicates that a PR is ready to be merged. label Nov 15, 2021
@xmudrii
Copy link
Member Author

xmudrii commented Nov 15, 2021

/retest

@xmudrii xmudrii requested a review from kron4eg November 15, 2021 14:14
@kubermatic-bot kubermatic-bot added the lgtm Indicates that a PR is ready to be merged. label Nov 15, 2021
@kubermatic-bot
Copy link
Contributor

LGTM label has been added.

Git tree hash: ba3c60b91a295f5f2246b74ee434947abf3b0c3d

@xmudrii
Copy link
Member Author

xmudrii commented Nov 15, 2021

/hold cancel

@kubermatic-bot kubermatic-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 15, 2021
@kubermatic-bot kubermatic-bot merged commit 5bb3926 into master Nov 15, 2021
@kubermatic-bot kubermatic-bot added this to the KubeOne 1.4 milestone Nov 15, 2021
@kubermatic-bot kubermatic-bot deleted the os-ccm-cluster-name branch November 15, 2021 14:38
@kubermatic-bot
Copy link
Contributor

@xmudrii: #1619 failed to apply on top of branch "release/v1.3":

Applying: Provide --cluster-name flag to the OpenStack external CCM
Using index info to reconstruct a base tree...
M	addons/ccm-openstack/ccm-openstack.yaml
M	pkg/tasks/probes.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/tasks/probes.go
CONFLICT (content): Merge conflict in pkg/tasks/probes.go
Auto-merging addons/ccm-openstack/ccm-openstack.yaml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Provide --cluster-name flag to the OpenStack external CCM
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release/v1.3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@xmudrii xmudrii restored the os-ccm-cluster-name branch November 16, 2021 13:25
@xmudrii xmudrii deleted the os-ccm-cluster-name branch November 16, 2021 13:25
ahmedwaleedmalik pushed a commit that referenced this pull request Nov 22, 2021
Signed-off-by: Marko Mudrinić <mudrinic.mare@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Denotes that all commits in the pull request have the valid DCO signoff message. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cluster Name not Configured in Openstack OCCM
4 participants