Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ enable BYO IP for control plane #652

Closed
wants to merge 1 commit into from

Conversation

alexeldeib
Copy link
Contributor

@alexeldeib alexeldeib commented May 28, 2020

What this PR does / why we need it:
Enables users to bring their own IP for the control plane. This allows using a CAPI control plane with self-managed nodes by setting advertise address on KCP at create-time.

Defaults to previous behavior if unset. Updating this field is invalid.

Special notes for your reviewer:
Demo for Azure edge scenarios

Release note:

Enable users to pre-provision the control plane IP so it may be used e.g. in advertise address.

@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 28, 2020
@k8s-ci-robot k8s-ci-robot requested review from devigned and vincepri May 28, 2020 02:09
@k8s-ci-robot k8s-ci-robot added area/provider/azure Issues or PRs related to azure provider sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. labels May 28, 2020
@alexeldeib
Copy link
Contributor Author

/test all

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 28, 2020
@alexeldeib
Copy link
Contributor Author

/test all

@alexeldeib
Copy link
Contributor Author

/test pull-cluster-api-provider-azure-e2e

@devigned
Copy link
Contributor

I think BYOR (bring your own resource) is common across almost all resources. What do folks think about allowing the ProviderID to be set by a user, and the controller would use the ProviderID being set on an unowned resources as the indication the resource should be a BYO?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 4, 2020
@MirzaSikander
Copy link

Can we get this pull request to completion? Need this functionality to move forward with our deployment.

@alexeldeib
Copy link
Contributor Author

I think at the very least, this PR needs to add a concept of ownership so that CAPZ doesn't delete IPs brought by the user and does not fail to cleanup IPs it creates.

We've discussed this as a general issue a bit, @devigned @CecileRobertMichon any thoughts on how you think it should look for this particular use case?

@MirzaSikander feel free to take over and drive this PR if you need the functionality sooner as mentioned offline

@MirzaSikander
Copy link

MirzaSikander commented Jun 8, 2020

For us, deleting the IP address is actually preferred so its not an issue for us. We can always extend it later to optionally delete it.

@devigned
Copy link
Contributor

devigned commented Jun 9, 2020

Could we tag managed pubIPs with "sigs.k8s.io_cluster-api-provider-azure_cluster_test-cluster": "owned" (similar ownership indicator), and just check for that upon delete? That seems like this would deal with the ownership problem for this instance.

In the longer term, I don't know that we want to use only resource name as the way we look up resources. Using resource ID would be more precise. Moreover, I'm worried that if we pull this in quickly, then decide on a generic way of accepting "bring your own" resources, then this could be a painful migration in the next api version.

@CecileRobertMichon
Copy link
Contributor

/cc @vincepri @detiber is this a use case that has come up for other CAPI providers?

@MirzaSikander
Copy link

Just wanted to provide our context and use case for this feature.

We are team in Azure that is responsible for managing bare-metal infra in locations external to Azure DC aka the edge. We are in the process of layering k8s on top of our bare-metal infra to allow containerized apps to be deployed and managed. Our current topology involves the control plane deployed in Azure managing worker nodes on a private network in the edge. The master has a public IP address through which the workers are able to join the cluster. We wanted to transition over to CAPZ to avoid manually managing these clusters (Only the control plane as of now).

For things to work properly we need to specify the advertise-address on the API-server

kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        timeoutForControlPlane: 20m
        extraArgs:
          advertise-address: ${AZURE_PUBLIC_IP_ADDRESS}

Currently there is no way CAPI/KCP to dynamically do this. (If there is maybe we can try that approach)

With this pull request, we can sort of get around that by pre-provisioning a static IP address and populating the cluster configuration.

@devigned
Copy link
Contributor

Ok... This has been hanging around for a little while. It need to come to a conclusion.

I don't love having name required on IPAddress. What if we leave the IPAddress type as it was, and just check with ARM if the IPAddress exists if name is set? If it does exist, we use it, else create it and tag it with CAPZ managed tag. On delete, we would not delete the IPAddress if it is tagged with the CAPZ managed tag.

Thoughts?

@alexeldeib
Copy link
Contributor Author

Yep, I think that approach makes sense. This is my next task up, I got a little distracted by a flurry of work on disks again -_- Thankfully all finished with that

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 28, 2020
@alexeldeib alexeldeib marked this pull request as ready for review June 28, 2020 10:01
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 28, 2020
if err != nil {
return errors.Wrapf(err, "failed to delete public ip %s in resource group %s", publicIPSpec.Name, s.Scope.ResourceGroup())
}

klog.V(2).Infof("deleted public ip %s", publicIPSpec.Name)
return err
}

func (s *Service) isManaged(ctx context.Context, name string) (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @cpanato

This is exactly the change you need for #708

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will review and follow the approach! thanks!

@CecileRobertMichon
Copy link
Contributor

@alexeldeib this will need a pretty big rebase since #716 merged... sorry about that
Let me know if you want help, I'm happy to pair on applying this change to the new code organization.

@k8s-ci-robot k8s-ci-robot removed the release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action. label Jul 15, 2020
@alexeldeib alexeldeib changed the title ⚠ enable BYO IP for control plane ✨ enable BYO IP for control plane Jul 15, 2020
@alexeldeib
Copy link
Contributor Author

@CecileRobertMichon if it's fine with you i'll leave the validation as is. I updated the PR title and description since this implementation is no longer breaking. lmk if you have additional comments?

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign cecilerobertmichon
You can assign the PR to them by writing /assign @cecilerobertmichon in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@alexeldeib
Copy link
Contributor Author

/test pull-cluster-api-provider-azure-test

@alexeldeib
Copy link
Contributor Author

/test pull-cluster-api-provider-azure-test

@alexeldeib alexeldeib force-pushed the ace/ip branch 2 times, most recently from 9c1982a to 3213055 Compare July 15, 2020 17:52
err := s.Client.Delete(ctx, s.Scope.ResourceGroup(), ip.Name)
if err != nil && azure.ResourceNotFound(err) {
// already deleted
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch 😬

Copy link
Contributor

@CecileRobertMichon CecileRobertMichon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one last comment about the FQDN, otherwise lgtm

Signed-off-by: Alexander Eldeib <alexeldeib@gmail.com>
@alexeldeib
Copy link
Contributor Author

@MirzaSikander I know this ended up taking quite a bit longer than anticipated, would appreciate if you have a chance to take a quick look at this for your scenario.

@CecileRobertMichon force pushed but added DNS name and pass it through directly

@alexeldeib
Copy link
Contributor Author

/test pull-cluster-api-provider-azure-e2e

@CecileRobertMichon
Copy link
Contributor

/assign @MirzaSikander

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 22, 2020
@k8s-ci-robot
Copy link
Contributor

@alexeldeib: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@CecileRobertMichon
Copy link
Contributor

@k8s-ci-robot
Copy link
Contributor

@alexeldeib: The following test failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
pull-cluster-api-provider-azure-e2e fd0b16e link /test pull-cluster-api-provider-azure-e2e

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@CecileRobertMichon
Copy link
Contributor

/close

closing in favor of #974

@k8s-ci-robot
Copy link
Contributor

@CecileRobertMichon: Closed this PR.

In response to this:

/close

closing in favor of #974

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/provider/azure Issues or PRs related to azure provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants