Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Cloudstack multiple endpoint for preflight checks #2559

Merged
merged 25 commits into from
Jul 6, 2022

Conversation

wongni
Copy link
Contributor

@wongni wongni commented Jun 27, 2022

Issue #, if available: #2406

Description of changes:
Cloudstack provider will support multiple endpoints, therefore eksa preflight check also needs to support them.

Testing (if applicable):

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@eks-distro-bot
Copy link
Collaborator

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@wongni wongni requested a review from maxdrib June 27, 2022 16:39
@eks-distro-bot eks-distro-bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jun 27, 2022
@codecov
Copy link

codecov bot commented Jun 27, 2022

Codecov Report

Merging #2559 (222889c) into main (ffdff4b) will increase coverage by 0.36%.
The diff coverage is 82.90%.

@@            Coverage Diff             @@
##             main    #2559      +/-   ##
==========================================
+ Coverage   57.49%   57.85%   +0.36%     
==========================================
  Files         310      310              
  Lines       25562    25555       -7     
==========================================
+ Hits        14696    14786      +90     
+ Misses       9529     9453      -76     
+ Partials     1337     1316      -21     
Impacted Files Coverage Δ
pkg/executables/builder.go 0.00% <0.00%> (ø)
pkg/executables/cmk_cmd_builder.go 100.00% <ø> (+8.33%) ⬆️
pkg/providers/cloudstack/validator.go 68.91% <69.04%> (+6.79%) ⬆️
pkg/providers/cloudstack/cloudstack.go 66.63% <80.76%> (+3.43%) ⬆️
pkg/executables/cmk.go 83.42% <91.76%> (+10.18%) ⬆️
pkg/dependencies/factory.go 61.13% <100.00%> (+1.53%) ⬆️
pkg/providers/cloudstack/decoder/decoder.go 100.00% <100.00%> (ø)
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ffdff4b...222889c. Read the comment docs.

return &Validator{
cmk: cmk,
cmks: cmks,
availabilityZones: []localAvailabilityZone{},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
availabilityZones: []localAvailabilityZone{},
localAvailabilityZones: []localAvailabilityZone{},

I think we should keep the local prefix whenever we can to avoid confusion and clearly indicate we're not using the actual spec availabilityZones.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

if err := cleanupCloudStackVms(ctx, cmk, clusterName, dryRun); err != nil {
retErr = multierror.Append(retErr, err)
}
cmk.Close(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we defer this close? or does that not work in loops?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think to make defer to work I should pull a function that processes one cmk.

@@ -266,12 +266,17 @@ func (f *Factory) WithProvider(clusterConfigFile string, clusterConfig *v1alpha1
return fmt.Errorf("unable to get machine config from file %s: %v", clusterConfigFile, err)
}

cmkClientMap := cloudstack.CmkClientMap{}
for name, cmk := range f.dependencies.Cmks {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add a comment explaining this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -414,16 +419,21 @@ func (f *Factory) WithCmk() *Factory {
f.WithExecutableBuilder().WithWriter()

f.buildSteps = append(f.buildSteps, func(ctx context.Context) error {
if f.dependencies.Cmk != nil {
if f.dependencies.Cmks != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also check len(f.dependencies.Cmks) > 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if len(zone.Network.Id) > 0 {
applyCmkArgs(&command, withCloudStackId(zone.Network.Id))
if len(network.Id) > 0 {
applyCmkArgs(&command, withCloudStackId(network.Id))
}
if multipleZone {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we still need this parameter? I think we could get rid of it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like multipleZone is still here

}

for _, azName := range azNamesToCheck {
cmk, ok := v.cmks[azName]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should use the AZ's credentialRef not the AZ's name to look up the cmk mapping

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point!

return fmt.Errorf("zone network is not set or is empty")
cmk, ok := v.cmks[az.CredentialsRef]
if !ok {
return fmt.Errorf("cannot find CloudStack profile for availability zone %s", az.CredentialsRef)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the error message here is confusing.

Suggested change
return fmt.Errorf("cannot find CloudStack profile for availability zone %s", az.CredentialsRef)
return fmt.Errorf("cannot find CloudStack profile named %s for availability zone %s", az.CredentialsRef, az.Name)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay

Domain: datacenterConfig.Spec.Domain,
Account: datacenterConfig.Spec.Account,
ManagementApiEndpoint: datacenterConfig.Spec.ManagementApiEndpoint,
Zone: zone,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems to be missing a name, doesn't it? I think we need the name too for some of the error messages

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding Name as SetDefaults does

v.availabilityZones = append(v.availabilityZones, availabilityZone)
}
}
for _, az := range datacenterConfig.Spec.AvailabilityZones {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need a check for nil before iterating over the items in an array?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need, I think. AvailablityZones is []CloudStackAvailabilityZone and it will be initialized as an empty list.

if (datacenterConfig.Spec.Domain != "" && datacenterConfig.Spec.Account == "") ||
(datacenterConfig.Spec.Domain == "" && datacenterConfig.Spec.Account != "") {
return fmt.Errorf("both domain and account must be specified or none of them must be specified")
func (v *Validator) generateLocalAvailabilityZones(ctx context.Context, datacenterConfig *anywherev1.CloudStackDatacenterConfig) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like we're doing two things in this function - generating local AZ's, and also validating their accounts/domains. I'd suggest simplifying this to just generate the AZ's, and do the validation of them in ValidateCloudStackDatacenterConfig

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

@wongni wongni requested a review from maxdrib June 28, 2022 19:04
@@ -86,22 +86,33 @@ func (v *Validator) ValidateCloudStackDatacenterConfig(ctx context.Context, data
return err
}

for _, az := range v.availabilityZones {
for _, az := range v.localAvailabilityZones {
fmt.Printf("az: %+v\n", az.CloudStackAvailabilityZone)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fmt.Printf("az: %+v\n", az.CloudStackAvailabilityZone)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. Removed.

@@ -0,0 +1,1646 @@
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need this file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

@wongni wongni requested a review from maxdrib June 28, 2022 19:32
@wongni wongni marked this pull request as ready for review June 28, 2022 19:32
Copy link
Contributor

@maxdrib maxdrib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just need some more test coverage and we're in business, great work :)

for _, config := range execConfig.Profiles {
cmk := executableBuilder.BuildCmkExecutable(tmpWriter, config)
if err := cleanupCloudStackVms(ctx, cmk, clusterName, dryRun); err != nil {
retErr = multierror.Append(retErr, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this error look? Just wondering if we need to import a new library or solve it a different way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The multierror module is useful when we want to continue to clean up the resources on encountering an error so that we can clean up as many resources as possible. The appended errors are integrated into a single error. If you think introducing a new module just for testing isn't a good idea I can remove multierror module from this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my vote would be removing this third party module

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing it

@@ -50,7 +50,7 @@ type Dependencies struct {
DockerClient *executables.Docker
Kubectl *executables.Kubectl
Govc *executables.Govc
Cmk *executables.Cmk
Cmks map[string]*executables.Cmk
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to build a specific binary for each of the configs? Can't this just be solved by setting the appropriate envvar per command execution somehow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cmk is an object combining cmk binary and cloud-config to a certain ACS endpoint. The binary is the same across the list of Cmks but the configurations are different. This way we could keep the most of Cmk.go code as it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can cmk exec takes profile for each cmd? So that we do not need to preset the configuration and have a map of exact same exec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use single cmk instead of multiple cmk objects

}
return zoneIdentifiers, nil
return cmkZones[0].Id, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of returning the string here now? It seems like the id will be the same as what would be passed in?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The input argument in this function is the name of a zone. This function resolves the name into the ID while validating the zone's presence by name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like this would be better named ValidateAndGetZoneId especially because the other Validate methods only return an error, but yea I get it now.

@wongni wongni requested a review from vivek-koppuru June 29, 2022 15:13
for _, config := range execConfig.Profiles {
cmk := executableBuilder.BuildCmkExecutable(tmpWriter, config)
if err := cleanupCloudStackVms(ctx, cmk, clusterName, dryRun); err != nil {
retErr = multierror.Append(retErr, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my vote would be removing this third party module


func cleanupCloudStackVms(ctx context.Context, cmk *executables.Cmk, clusterName string, dryRun bool) error {
if err := cmk.ValidateCloudStackConnection(ctx); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to validate cloudstack connection? the cleanupVms will fail w same connection error anyway if there's connection issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing ValidateCloudStackConnection

@@ -50,7 +50,7 @@ type Dependencies struct {
DockerClient *executables.Docker
Kubectl *executables.Kubectl
Govc *executables.Govc
Cmk *executables.Cmk
Cmks map[string]*executables.Cmk
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can cmk exec takes profile for each cmd? So that we do not need to preset the configuration and have a map of exact same exec.

writer,
now,
skipIpCheck,
)
}

func NewProviderCustomNet(datacenterConfig *v1alpha1.CloudStackDatacenterConfig, machineConfigs map[string]*v1alpha1.CloudStackMachineConfig, clusterConfig *v1alpha1.Cluster, providerKubectlClient ProviderKubectlClient, providerCmkClient ProviderCmkClient, writer filewriter.FileWriter, now types.NowFunc, skipIpCheck bool) *cloudstackProvider {
func NewProviderCustomNet(datacenterConfig *v1alpha1.CloudStackDatacenterConfig, machineConfigs map[string]*v1alpha1.CloudStackMachineConfig, clusterConfig *v1alpha1.Cluster, providerKubectlClient ProviderKubectlClient, providerCmkClients CmkClientMap, writer filewriter.FileWriter, now types.NowFunc, skipIpCheck bool) *cloudstackProvider {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed. Removing NewProviderCustomNet


apiKey, err := section.GetKey("api-key")
if err != nil {
return nil, fmt.Errorf("failed to extract value of 'api-key' from %s: %v", section.Name(), err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
return nil, fmt.Errorf("failed to extract value of 'api-key' from %s: %v", section.Name(), err)
return nil, fmt.Errorf("extracting value of 'api-key' from %s: %v", section.Name(), err)

same for below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating

return nil, fmt.Errorf("failed to extract value of 'api-url' from %s: %v", EksacloudStackCloudConfigB64SecretKey, err)
}
verifySslValue := "true"
if verifySsl, err := section.GetKey("verify-ssl"); err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about err != nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If err == nil then the default value of "true" will be used.

if err != nil {
return nil, fmt.Errorf("failed to extract value of 'api-url' from %s: %v", EksacloudStackCloudConfigB64SecretKey, err)
}
verifySslValue := "true"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the default verifySsl is true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a constant then so we can change the default easily?

return &Validator{
cmk: cmk,
cmks: cmks,
localAvailabilityZones: []localAvailabilityZone{},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you should have localAvailabilityZones in the validator struct if you are generating it from one of the validation method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point

@wongni wongni requested review from jiayiwang7 and maxdrib July 1, 2022 17:38
@g-gaston g-gaston self-requested a review July 1, 2022 21:48
}

return cmk.CleanupVms(ctx, clusterName, dryRun)
if len(failedProfiles) > 0 {
return fmt.Errorf("cleaning up VMs: profiles=%+v, errors=%+v", failedProfiles, errors)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be doing a profile=error mapping instead if that is the intention here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. The new code will return an error with this kind of message.

cleaning up VMs: map[acs#1:Unknown exception acs#2:Unable to reach]

}
return zoneIdentifiers, nil
return cmkZones[0].Id, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like this would be better named ValidateAndGetZoneId especially because the other Validate methods only return an error, but yea I get it now.


for _, instance := range execConfig.Profiles {
if err := p.validateManagementApiEndpoint(instance.ManagementUrl); err != nil {
return fmt.Errorf("CloudStack instance %s's managementApiEndpoint %s is invalid",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to print the error message here too?

if err != nil {
return nil, fmt.Errorf("failed to extract value of 'api-url' from %s: %v", EksacloudStackCloudConfigB64SecretKey, err)
}
verifySslValue := "true"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a constant then so we can change the default easily?


logger.MarkPass("Connected to", "servers", refNamesToCheck)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you paste an example here for reference?

wongni added 3 commits July 5, 2022 10:43
- ValidateZonePresent -> ValidateZoneAndGetId
- ValidateDomainPresent -> ValidateDomainAndGetId
Change return value of ValidateDomainAndGetId
- from CloudStackResourceIdentifier to string because only domainId is used
}

err = p.validateMachineConfigsNameUniqueness(ctx, cluster, clusterSpec)
if err != nil {
if err := p.validateMachineConfigsNameUniqueness(ctx, cluster, clusterSpec); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be part of validateClusterSpec?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a code clean up change.
from

err = func1()
if err != nil {}

to

if err := func1(); err != nil {}

If you think validateMachineConfigsNameUniqueness is a part of validateClusterSpec then I should be handled in a separate PR because this pattern shows up in vsphere provider as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think vsphere provider code is a good reference. But yea separate pr sounds good to me.

@wongni wongni requested a review from vivek-koppuru July 5, 2022 23:52
@vivek-koppuru
Copy link
Member

/approve

@eks-distro-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vivek-koppuru

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@eks-distro-bot eks-distro-bot merged commit dd57146 into aws:main Jul 6, 2022
@wongni wongni deleted the support-multiple-acs-for-cmk-2 branch July 6, 2022 16:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved lgtm size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants