Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for external machine controller (driver) #460

Merged
merged 11 commits into from
Jun 8, 2020

Conversation

prashanth26
Copy link
Contributor

@prashanth26 prashanth26 commented May 4, 2020

What this PR does / why we need it:
With this PR, new providers can be added out of tree (OOT) without sending in PRs to MCM repository. This OOT follows controller patterns where the provider is expected to implement their own machine controller. However, they can make use of the template machine controller provided as a library.

A sample machine controller for AWS can be found here - https://github.com/prashanth26/machine-controller-aws/commits/spliting-controllers2


So the basic idea is that - The existing path of MCM code flow remains as it is for now. So existing clusters running the MCM would work as is today --> AWS machineClaass --> machine ---> machineSet ---> machineDeployment all managed by the same code path at MCM repo.

Anyone trying to build new providers will make use of some new libraries which is already used by the sampleProvider repo and just implement the 5 basic methods following the documentation guide.

So basically a new provider will have to follow this guide - https://github.com/prashanth26/machine-controller-manager/blob/spliting-controllers2/docs/development/cp_support_new.md. In turn they will build his own MachineController and the MachineSet and MachineDeployment controller will be used from the existing MCM code.

And the new providers will make use of the generic MachineClass instead of ProviderSpecficiMachineClass.
MachineClass ---> machine --> Managed by their MC code
MachineSet ---> MachineDeployment ---> Managed by the core current MCM code (edited)

In the future we will migrate all provider code in MCM into OOT providers and get rid of in-tree code.

Which issue(s) this PR fixes:
Fixes #178

Special notes for your reviewer:

Release note:

Support for external / OOT (Out Of Tree) machine controller. A new provider can be maintained out of the core MCM repository.

@prashanth26 prashanth26 added the kind/epic Large multi-story topic label May 4, 2020
@prashanth26 prashanth26 requested review from ggaurav10 and a team as code owners May 4, 2020 17:58
@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels May 4, 2020
@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels May 10, 2020
Copy link

@amshuman-kr amshuman-kr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the PR! The overall structure looks good and also the tests. I have asked a few questions and some suggestions too.

Also, why not move the default implementation in the AWS controller into somewhere under pkg/util/provider except for the construction of the driver.Driver implementation struct?

KubeAPIBurst: 30,
LeaderElection: leaderelectionconfig.DefaultLeaderElectionConfiguration(),
ControllerStartInterval: metav1.Duration{Duration: 0 * time.Second},
ExternalMachineController: false,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there no way to do it dynamically by filtering machines, machinesets and machinesets if they refer to older typed machine classes? I.e. machines, machinesets and machinesets that refer to the generic machineclasses are filtered out because they will anyway be reconciled by the external machine controller.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we could do this. However, this would partly depend on how our migration logic would look like. For example if we continue working with machine objects the way they are today and ignore the machineClassType field to fetch the machine classes, we might need some logic like this.

But yes, I could revert this change for now and consider reconciling based on machineClassType ref in machine objects and reintroduce this change if necessary for migration. Reverting this commit.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this has less to do with migration logic and more to do with backward compatibility. For backward compatibility, you anyway need to check the machineclass kind. Just extend it to filter out the generic machineclass.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes Amshu, this check kind of already exists today - https://github.com/prashanth26/machine-controller-manager/blob/spliting-controllers2/pkg/controller/machine_util.go#L259-L260. Just reworded the log message a little.

So the current MCM will not process the genericMachineClass ref objects with the current code as well. I forgot to mention this earlier.

Copy link

@amshuman-kr amshuman-kr May 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A minor improvement, currently the filtering is happening during reconciliation. It might be better to do the filtering before enquing, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that should be better. Shall make that change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made the changes here - 61da081.

cmd/machine-controller-manager/app/options/options.go Outdated Show resolved Hide resolved
pkg/controller/deployment.go Show resolved Hide resolved
pkg/controller/machine.go Outdated Show resolved Hide resolved
pkg/controller/machineset.go Show resolved Hide resolved
pkg/util/provider/driver/driver.go Show resolved Hide resolved
pkg/util/provider/driver/driver.go Show resolved Hide resolved
pkg/util/provider/machinecontroller/machineclass.go Outdated Show resolved Hide resolved
pkg/util/provider/machinecontroller/machineclass.go Outdated Show resolved Hide resolved
pkg/util/provider/machinecontroller/machineclass.go Outdated Show resolved Hide resolved
@prashanth26
Copy link
Contributor Author

Also, why not move the default implementation in the AWS controller into somewhere under pkg/util/provider except for the construction of the driver.Driver implementation struct?

I was wondering wouldn't something like the ability of the provider to configure their own flags? - the https://github.com/prashanth26/machine-controller-aws/blob/spliting-controllers2/cmd/machine-controller/app/options/options.go#L82-L113. Wouldn't it be better for providers to be able to control this feature?

@amshuman-kr
Copy link

I was wondering wouldn't something like the ability of the provider to configure their own flags? - the https://github.com/prashanth26/machine-controller-aws/blob/spliting-controllers2/cmd/machine-controller/app/options/options.go#L82-L113. Wouldn't it be better for providers to be able to control this feature?

They could always do it like you have done right now, if they really what to support custom flags. WDYT?

@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels May 12, 2020
@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 12, 2020
@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 12, 2020
@prashanth26
Copy link
Contributor Author

prashanth26 commented May 12, 2020

I was wondering wouldn't something like the ability of the provider to configure their own flags? - the https://github.com/prashanth26/machine-controller-aws/blob/spliting-controllers2/cmd/machine-controller/app/options/options.go#L82-L113. Wouldn't it be better for providers to be able to control this feature?

They could always do it like you have done right now, if they really what to support custom flags. WDYT?

I guess that is true. I shall move the controller initialization also into a library package like you suggested - f074b63.

@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 12, 2020
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 12, 2020
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels May 12, 2020
@amshuman-kr
Copy link

@prashanth26 Thanks a lot for the changes. I just had another look. The PR looks almost ready!

Only real open point is the external controller flag.

@prashanth26
Copy link
Contributor Author

prashanth26 commented May 18, 2020

@prashanth26 Thanks a lot for the changes. I just had another look. The PR looks almost ready!

Okay, thank you.

Only real open point is the external controller flag.

Yes, currently I have reverted the changes and kept it to work as it was earlier. It ignores processing machine objects which have reference to any unknown machineClass kind here. This should work for now, however might need more thought once we plan to migrate objects like AWSMachineClass to MachineClass

Copy link
Contributor

@ggaurav10 ggaurav10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some minor suggestions.

pkg/util/provider/machinecodes/codes/code_string.go Outdated Show resolved Hide resolved
pkg/util/provider/machinecodes/codes/codes.go Outdated Show resolved Hide resolved
pkg/util/provider/machinecodes/status/status.go Outdated Show resolved Hide resolved
pkg/util/provider/machinecodes/status/status.go Outdated Show resolved Hide resolved
pkg/util/provider/machinecontroller/machineclass.go Outdated Show resolved Hide resolved
pkg/util/provider/machinecontroller/machineclass.go Outdated Show resolved Hide resolved
pkg/util/provider/machinecontroller/machineclass.go Outdated Show resolved Hide resolved
@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 26, 2020
@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 4, 2020
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 4, 2020
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 4, 2020
@prashanth26
Copy link
Contributor Author

Hi All,

I am done with all my changes for OOT. I hope i have addressed all your concerns. Please raise any major concerns you may have, minor concerns regarding documentation could probably be addressed with a follow-up PR. We plan to positively merge this PR by EOD and make a release.

Regards,
Prashanth

prashanth26 and others added 11 commits June 7, 2020 17:51
- Added new field to MachineStatus - lastKnownState
- Relocated client builders for core & machine APIs
- Added drain as package
- Driver interface for external machine controllers
- Updated CRDs to add generic MachineClass
- Removed machineClass specific checks on MS/MD
Co-authored-by: Amshuman K R <amshuman.rao.karaya@sap.com>
Co-Authored by: Gaurav Gupta <ggaurav10>
Porting changes similar to on commit - gardener@fa7fa31
Signed-off-by: Andrey Klimentyev <andrey.klimentyev@flant.com>

Porting changes to OOT implementation
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 7, 2020
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 7, 2020
@prashanth26
Copy link
Contributor Author

Hi All,

I have just squashed a few commits for better readability. I have also merged the master with the current branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/epic Large multi-story topic needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) status/accepted Issue was accepted as something we need to work on
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MCM Out-Of-Tree Extensibility
7 participants