Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support multiple endpoints for cloudstack preflight check #2404

Closed
wants to merge 2 commits into from

Conversation

wongni
Copy link
Contributor

@wongni wongni commented Jun 14, 2022

Issue #, if available:

Description of changes:

Testing (if applicable):

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@eks-distro-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign abhinavmpandey08 after the PR has been reviewed.
You can assign the PR to them by writing /assign @abhinavmpandey08 in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@eks-distro-bot
Copy link
Collaborator

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@eks-distro-bot eks-distro-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 14, 2022
@codecov
Copy link

codecov bot commented Jun 14, 2022

Codecov Report

Merging #2404 (bb9a1a2) into main (f95e2fa) will increase coverage by 0.75%.
The diff coverage is 73.77%.

❗ Current head bb9a1a2 differs from pull request most recent head 9cb173f. Consider uploading reports for the commit 9cb173f to get more accurate results

@@            Coverage Diff             @@
##             main    #2404      +/-   ##
==========================================
+ Coverage   56.54%   57.30%   +0.75%     
==========================================
  Files         305      310       +5     
  Lines       24800    25463     +663     
==========================================
+ Hits        14023    14591     +568     
- Misses       9479     9540      +61     
- Partials     1298     1332      +34     
Impacted Files Coverage Δ
internal/pkg/api/hardware.go 0.00% <0.00%> (ø)
internal/pkg/api/tinkerbell.go 0.00% <0.00%> (ø)
internal/test/envtest/environment.go 8.47% <0.00%> (-0.15%) ⬇️
pkg/api/v1alpha1/cluster_types.go 74.42% <ø> (ø)
...g/api/v1alpha1/tinkerbelldatacenterconfig_types.go 5.26% <ø> (ø)
pkg/api/v1alpha1/tinkerbellmachineconfig_types.go 3.07% <0.00%> (-0.32%) ⬇️
pkg/api/v1alpha1/tinkerbelltemplateconfig.go 25.45% <0.00%> (ø)
pkg/cluster/client_builder.go 100.00% <ø> (ø)
pkg/cluster/fetch.go 45.00% <ø> (ø)
pkg/clusterapi/fetch.go 100.00% <ø> (ø)
... and 58 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f95e2fa...9cb173f. Read the comment docs.

Copy link
Contributor

@maxdrib maxdrib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Thanks for putting this together. We'll be tracking the work in Add support for clusters spanning across multiple Cloudstack API Endpoints  #2406
  2. Another implementation to consider is instead of putting all the credentials/endpoints in a single instance of the cmk executable, if we should instead have multiple instances of the cmk executables, and each one would have its own config. I'm not sure which path is better, so I'll make a note of it in the design doc in Multi-endpoint design for cloudstack provider #2399 and await the team's input. Please have a look at that design when you get a chance and provide any feedback you may have

pkg/providers/cloudstack/decoder/decoder.go Outdated Show resolved Hide resolved
pkg/providers/cloudstack/decoder/decoder.go Outdated Show resolved Hide resolved
pkg/providers/cloudstack/decoder/decoder_test.go Outdated Show resolved Hide resolved
@wongni wongni force-pushed the support-multiple-acs-for-cmk branch from 948c64a to a97faea Compare June 15, 2022 19:36
@eks-distro-bot eks-distro-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 15, 2022
@wongni wongni requested a review from maxdrib June 15, 2022 21:32
@wongni wongni marked this pull request as ready for review June 15, 2022 21:34
Copy link
Contributor

@maxdrib maxdrib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you validate that this also works e2e with CAPC for a single endpoint?

@@ -48,7 +48,7 @@ type Dependencies struct {
DockerClient *executables.Docker
Kubectl *executables.Kubectl
Govc *executables.Govc
Cmk *executables.Cmk
Cmks map[string]cloudstack.ProviderCmkClient
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to arrange this as a map instead of a list? What additional value do we get from the labeling?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind, I think a map is the right move so that we can match these Profiles to an Availability Zone for preflight checking

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we using ProviderCmkClient instead of the executables interface?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProviderCmkClient and executables.Cmk are compliant, however, it seems that the maps of them are not compliant in Golang. When I changed Cmks' type to map[string]*executables.Cmk I got this compile error.

pkg/dependencies/factory.go:270:19: cannot use f.dependencies.Cmks (type map[string]*executables.Cmk) as type map[string]cloudstack.ProviderCmkClient in argument to cloudstack.NewProvider

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is an OOP issue: https://stackoverflow.com/questions/46433860/golang-using-a-defined-interface-in-a-map-value. Maybe it makes sense to contain the map entirely to be inside the Cmk struct as a map of executable objects

Copy link
Contributor

@maxdrib maxdrib Jun 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern with using ProviderCmkClient is that now we can introduce a circular dependency between dependencies -> cloudstack -> executables. The executables.Cmk interface served to decouple the two packages as I understand so I'm hesitant to break this pattern

Actually looks like we already depend on the cloudstack package here, so I guess it's ok? @vignesh-goutham do you know what the purpose of having the two separate interfaces is, between executables.Cmk and ProviderCmkClient?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like executables.Cmk is a struct, right?

Either way, in general, we want to define interfaces in the consumer, so in this case that would be in the cloudstack provide package.

What I would recommend in this case is store the executables as map[string]*executables.Cmk in the deps struct and then convert the map to map[string]cloudstack.ProviderCmkClient before injecting it in the provider.

pkg/executables/cmk_test.go Outdated Show resolved Hide resolved
pkg/providers/cloudstack/cloudstack.go Outdated Show resolved Hide resolved
@@ -375,22 +384,28 @@ func (p *cloudstackProvider) validateEnv(ctx context.Context) error {
return nil
}

func (p *cloudstackProvider) validateClusterSpec(ctx context.Context, clusterSpec *cluster.Spec) (err error) {
for _, validator := range p.validators {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it feels counterintuitive to run multiple validators all doing the same checks. From an abstraction perspective, the validator should just be invoked once. I think the implementation detail of multiple cmk executables should not leak outside of the validator itself.

pkg/providers/cloudstack/decoder/decoder.go Show resolved Hide resolved
pkg/providers/cloudstack/decoder/decoder.go Outdated Show resolved Hide resolved
@@ -48,7 +48,7 @@ type Dependencies struct {
DockerClient *executables.Docker
Kubectl *executables.Kubectl
Govc *executables.Govc
Cmk *executables.Cmk
Cmks map[string]cloudstack.ProviderCmkClient
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind, I think a map is the right move so that we can match these Profiles to an Availability Zone for preflight checking

pkg/providers/cloudstack/cloudstack.go Outdated Show resolved Hide resolved
pkg/executables/cmk.go Outdated Show resolved Hide resolved
@wongni
Copy link
Contributor Author

wongni commented Jun 16, 2022

Could you validate that this also works e2e with CAPC for a single endpoint?

I was able to create a management cluster for a single endpoint.

@wongni wongni force-pushed the support-multiple-acs-for-cmk branch from e672ca4 to f9c9fba Compare June 17, 2022 19:44
@eks-distro-bot eks-distro-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 17, 2022
@wongni wongni changed the title Support multiple endpoints for cloudstack config Support multiple endpoints for cloudstack preflight check Jun 17, 2022
@wongni wongni requested a review from maxdrib June 17, 2022 19:53
if az == nil || o == nil {
return false
}
if az.Zone.Equal(&o.Zone) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated this logic since and added unit tests fyi

Comment on lines 77 to 79
if !foundGlobalSection {
return nil, fmt.Errorf("[Global] section not found from %s", EksacloudStackCloudConfigB64SecretKey)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this check? It feels like Global shouldn't be required as we move towards deprecating the old schema, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

verify-ssl ; false
api-key ; test-key1
secret-key ; test-secret1
api-url ; http://127.16.0.1:8080/client/api
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we add a newline here at the end of these files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


for _, azName := range azNamesToCheck {
cmk := v.cmks[azName]
if cmk == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the traditional way to check if a key is present is https://stackoverflow.com/questions/2050391/how-to-check-if-a-map-contains-a-key-in-go instead of cmk == nil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return fmt.Errorf("both domain and account must be specified or none of them must be specified")
func (v *Validator) generateLocalAvailabilityZones(ctx context.Context, datacenterConfig *anywherev1.CloudStackDatacenterConfig) error {
if datacenterConfig == nil {
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this return error if datacenterConfig is nil?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump

if (datacenterConfig.Spec.Domain != "" && datacenterConfig.Spec.Account == "") ||
(datacenterConfig.Spec.Domain == "" && datacenterConfig.Spec.Account != "") {
return fmt.Errorf("both domain and account must be specified or none of them must be specified")
func (v *Validator) generateLocalAvailabilityZones(ctx context.Context, datacenterConfig *anywherev1.CloudStackDatacenterConfig) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we convert to the new schema using the SetDefaults method, we won't need this function. Then we'll know that going into the validator, we only have to check the new fields, not the old ones. Although, once you change to using the new schema, you basically have to use it throughout the entire provider and controller or else everything will be broken. Maybe this is a good way to split up the code changes into more reasonable changes. If we go with that approach, the PR's will be:

  1. Add new fields to the interface
  2. This PR, which does preflight checking for both the old CloudstackDatacenterConfig schema and the new schema, but uses old CloudstackDatacenterConfig attributes when generating the CAPI/CAPC templates
  3. Third PR, which will invoke the SetDefaults method any time we're deserializing a CloudstackDatacenterConfig and then exclusively use the new fields through the provider and controller, but generating CAPI/CAPC templates for v0.4.5
  4. Fourth PR, which will support CAPC v0.5 for template generation

Does that sound reasonable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me.

Comment on lines 171 to 173
if err := v.ValidateCloudStackDatacenterConfig(ctx, cloudStackClusterSpec.CloudStackDatacenter); err != nil {
return err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we want to do this? It'll make testing harder, and we're already calling this method from cloudstack.go. This way it's harder to tell which method is misbehaving if there's a dependency between them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing...

for _, zone := range zones {
if err = v.cmk.ValidateTemplatePresent(ctx, domainId, zone.Id, account, machineConfig.Spec.Template); err != nil {
for _, az := range v.availabilityZones {
cmk := v.cmks[az.Name]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to check that the cmk exists in the map again

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

cmk := v.cmks[az.Name]
if cmk.GetManagementApiEndpoint(ctx) != az.ManagementApiEndpoint {
return fmt.Errorf("cloudstack secret management url (%s) differs from cluster spec management url (%s)",
cmk.GetManagementApiEndpoint(ctx), az.ManagementApiEndpoint)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it feels weird to call a public function twice. Can we save this as a variable instead of calling it a second time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}
err := v.cmk.ValidateNetworkPresent(ctx, domainId, zone, zones, datacenterConfig.Spec.Account, len(zones) > 1)

zones, err := cmk.ValidateZonesPresent(ctx, []anywherev1.CloudStackZone{az.CloudStackAvailabilityZone.Zone})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like we should refactor this to only check if a single zone is present, not multiple. That'll require less error handling below as well

@wongni wongni force-pushed the support-multiple-acs-for-cmk branch from f9c9fba to b4c5cef Compare June 20, 2022 17:10
@wongni wongni requested a review from maxdrib June 20, 2022 17:36
@wongni wongni force-pushed the support-multiple-acs-for-cmk branch from 8ddedbe to dc61600 Compare June 21, 2022 13:13
@wongni wongni force-pushed the support-multiple-acs-for-cmk branch from dc61600 to de6c32f Compare June 21, 2022 18:13
Copy link
Contributor

@maxdrib maxdrib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there are some conflicts to resolve. Overall, minor comments and the logic looks good to me. Nice job

pkg/executables/cmk.go Show resolved Hide resolved
pkg/providers/cloudstack/decoder/decoder.go Show resolved Hide resolved
return fmt.Errorf("both domain and account must be specified or none of them must be specified")
func (v *Validator) generateLocalAvailabilityZones(ctx context.Context, datacenterConfig *anywherev1.CloudStackDatacenterConfig) error {
if datacenterConfig == nil {
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump

errAccount := v.cmk.ValidateAccountPresent(ctx, datacenterConfig.Spec.Account, domain.Id)
if errAccount != nil {
return fmt.Errorf("checking account: %v", errAccount)
if err := cmk.ValidateAccountPresent(ctx, datacenterConfig.Spec.Account, domain.Id); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump

@@ -18,6 +18,7 @@ require (
github.com/golang/mock v1.6.0
github.com/google/go-github/v35 v35.3.0
github.com/google/uuid v1.3.0
github.com/hashicorp/go-multierror v1.1.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, what do we need this for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check #2404 (comment)

@@ -85,7 +86,7 @@ func VsphereRmVms(ctx context.Context, clusterName string, opts ...executables.G
return govc.CleanupVms(ctx, clusterName, false)
}

func CleanUpCloudstackTestResources(ctx context.Context, clusterName string, dryRun bool) error {
func CleanUpCloudstackTestResources(ctx context.Context, clusterName string, dryRun bool) (retErr error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious, why name the returned error if there is only one?
I don't think we need that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I named the returned error to aggregate errors when cleaning up resources from multiple endpoints.

if err := cleanupCloudStackVms(ctx, cmk, clusterName, dryRun); err != nil {
retErr = multierror.Append(retErr, err)
}
cmk.Close(ctx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes more sense to have this with a defer next to the initialization
Curious why you changed it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a cmk per endpoint. We want to close cmk after cleaning up resources from one endpoint.

for _, config := range execConfig.Profiles {
cmk := executableBuilder.BuildCmkExecutable(tmpWriter, config)
if err := cleanupCloudStackVms(ctx, cmk, clusterName, dryRun); err != nil {
retErr = multierror.Append(retErr, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of aggregating errors if we are only doing it once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we have multiple cloudstack endpoints to clean up resources we want to continue even if it encounters an error in order to clean up as many resource as possible. Multierr is useful in this case.

@@ -48,7 +48,7 @@ type Dependencies struct {
DockerClient *executables.Docker
Kubectl *executables.Kubectl
Govc *executables.Govc
Cmk *executables.Cmk
Cmks map[string]cloudstack.ProviderCmkClient
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like executables.Cmk is a struct, right?

Either way, in general, we want to define interfaces in the consumer, so in this case that would be in the cloudstack provide package.

What I would recommend in this case is store the executables as map[string]*executables.Cmk in the deps struct and then convert the map to map[string]cloudstack.ProviderCmkClient before injecting it in the provider.

@@ -273,10 +260,10 @@ func (c *Cmk) ValidateDomainPresent(ctx context.Context, domain string) (v1alpha
return domainIdentifier, nil
}

func (c *Cmk) ValidateNetworkPresent(ctx context.Context, domainId string, zone v1alpha1.CloudStackZone, zones []v1alpha1.CloudStackResourceIdentifier, account string, multipleZone bool) error {
func (c *Cmk) ValidateNetworkPresent(ctx context.Context, domainId string, network v1alpha1.CloudStackResourceIdentifier, zoneId string, account string, multipleZone bool) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious, why do we receive the zone as v1alpha1.CloudStackZone in some methods and as a string in others?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CloudStackZone is passed into ValidateZonePresent to validate zone by either ID or name. However, other resources are validated in the single resolved zone using zone ID string.

command := newCmkCommand("list networks")
if len(zone.Network.Id) > 0 {
applyCmkArgs(&command, withCloudStackId(zone.Network.Id))
if len(network.Id) > 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the check now that only id is being passed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a network ID is defined we use it to list networks. In this case, we expected to have one network returned because the ID is unique. When a network ID is not defined we list all networks in the zone and filter with the network name passed in.

return &Cmk{
writer: writer,
executable: executable,
config: config,
}
}

func (c *Cmk) GetManagementApiEndpoint(ctx context.Context) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need a context here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point

@wongni wongni requested review from maxdrib and g-gaston June 24, 2022 18:39
@wongni wongni force-pushed the support-multiple-acs-for-cmk branch from bb9a1a2 to 9cb173f Compare June 27, 2022 16:22
@eks-distro-bot
Copy link
Collaborator

@wongni: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
eks-anywhere-presubmit 9cb173f link true /test eks-anywhere-presubmit
eks-anywhere-release-tooling-test-presubmit 9cb173f link true /test eks-anywhere-release-tooling-test-presubmit
eks-anywhere-attribution-files-presubmit 9cb173f link true /test eks-anywhere-attribution-files-presubmit
eks-anywhere-cluster-controller-tooling-presubmit 9cb173f link true /test eks-anywhere-cluster-controller-tooling-presubmit
eks-anywhere-docs-presubmit 9cb173f link true /test eks-anywhere-docs-presubmit
eks-anywhere-mocks-presubmits 9cb173f link true /test eks-anywhere-mocks-presubmits
eks-anywhere-e2e-presubmit 9cb173f link true /test eks-anywhere-e2e-presubmit
eks-anywhere-release-tooling-presubmit 9cb173f link true /test eks-anywhere-release-tooling-presubmit

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@wongni
Copy link
Contributor Author

wongni commented Jun 27, 2022

Created a new one based on the latest main branch: #2559

@wongni wongni closed this Jun 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants