Multi-endpoint design for cloudstack provider #2399

Merged · 18 commits · Jun 17, 2022
177 changes: 177 additions & 0 deletions designs/cloudstack-multiple-endpoints.md
@@ -0,0 +1,177 @@
# Supporting Cloudstack clusters across endpoints

## Introduction

**Problem:**

The management API endpoint for Cloudstack is a potential single point of failure: if one endpoint goes down, control of everything behind it is lost. To protect against that, we want to spread our services across multiple Cloudstack endpoints and hosts.

For scalability, multiple Cloudstack endpoints will likely be required to give our customer sufficient storage and API-endpoint throughput. A single cluster creation makes an estimated one thousand API calls to ACS. There are many ways to support this scale, but adding more Cloudstack hosts and endpoints is a fairly foolproof way to do so. There is also the size and performance of the underlying database that each Cloudstack instance runs on to consider.

In CAPC, we are considering addressing the problem by extending our use of the concept of [Failure Domains](https://cluster-api.sigs.k8s.io/developer/providers/v1alpha2-to-v1alpha3.html?highlight=failure%20domain#optional-support-failure-domains) and distributing a cluster across the given ones. However, instead of a failure domain consisting of a zone on a single Cloudstack endpoint, we will redefine it as the unique combination of a Cloudstack zone, API endpoint, account, and domain. To support this functionality in EKS-A, we need a similar breakdown in which an EKS-A cluster can span multiple endpoints and failure domains.

### Tenets

* **Simple:** simple to use, simple to understand, simple to maintain
* **Declarative:** intent-based system, as close to a Kubernetes-native experience as possible

### Goals and Objectives

As a Kubernetes administrator I want to:

* Perform preflight checks when creating/upgrading clusters which span across multiple failure domains
* Create EKS Anywhere clusters which span across multiple failure domains
* Upgrade EKS Anywhere clusters which span across multiple failure domains
* Delete EKS Anywhere clusters which span across multiple failure domains

### Statement of Scope

**In scope**

* Add support for create/upgrade/delete of EKS-A clusters across multiple Cloudstack API endpoints
* Add test environment for CI/CD e2e tests which can be used as a second Cloudstack API endpoint

**Not in scope**

*

**Future scope**

* Multiple network support to handle IP address exhaustion within a zone

## Overview of Solution

We propose to take the least invasive solution: repurposing the CloudstackDataCenterConfig to point to multiple failure domains, each of which contains the information needed to interact with one Cloudstack failure domain. The assumption is that the necessary Cloudstack resources (e.g. image, computeOffering, ISOAttachment) will be available on *all* the Cloudstack API endpoints.

## Solution Details

### Interface changes
Currently, the CloudstackDataCenterConfig spec contains:
```go
// Domain contains a grouping of accounts. Domains usually contain multiple accounts that have some logical relationship to each other and a set of delegated administrators with some authority over the domain and its subdomains
//
Domain string `json:"domain"`
// Zones is a list of one or more zones that are managed by a single CloudStack management endpoint.
Zones []CloudStackZone `json:"zones"`
// Account typically represents a customer of the service provider or a department in a large organization. Multiple users can exist in an account, and all CloudStack resources belong to an account. Accounts have users and users have credentials to operate on resources within that account. If an account name is provided, a domain must also be provided.
Account string `json:"account,omitempty"`
// CloudStack Management API endpoint's IP. It is added to VM's noproxy list
ManagementApiEndpoint string `json:"managementApiEndpoint"`
```

We propose instead to remove all the existing attributes and simply include a list of FailureDomain objects, where each FailureDomain object looks like the snippet below.
> **Member:** Would it make sense to treat each FailureDomain as a datacenter? If so, is the reason we don't have multiple CloudstackDatacenterConfig objects purely that the engineering effort is too high?
>
> **Author:** This topic is discussed below in "Other approaches explored". There are many places in the EKS-A codebase where only a single datacenterconfig object is expected to be present, and decisions are based on the Kind of that one datacenterconfig object. Refactoring the entire codebase only for the benefit of a single provider feels wasteful of engineering resources and more likely to introduce bugs. With the current approach we already effectively have multiple failure domains under a CloudstackDatacenterConfig object, only they are limited to the "zone" dimension; we want to expand that to also cover the account, domain, and endpoint dimensions.
>
> **Contributor:** By way of comparison, CAPV supports CAPI Failure Domains via node labeling (see Node Zones/Regions Topology). Nonetheless, EKS-A expects those zones/regions to be managed through a single datacenter.
>
> **Member:** These types of changes are breaking changes. Especially if we don't have a feature flag for Cloudstack, what about introducing a new object for this use case?
>
> **Member:** What would this list look like here?
>
> **Author:** We do have a feature flag for Cloudstack currently; what sort of new object would you introduce? As for the list: currently the CloudstackDatacenterConfig object contains account, domain, managementApiEndpoint, and zones, where zones are a proxy for the old concept of failure domains. Instead, we propose that the CloudstackDatacenterConfig contain a single entry, which is a list of FailureDomain objects; each FailureDomain is independent and contains an account, domain, zone, and managementApiEndpoint. See https://github.com/aws/eks-anywhere/pull/2412/files for the new type definitions. We will also have a transformer to convert the old attributes to the new ones, under AvailabilityZone.
```go
// Domain contains a grouping of accounts. Domains usually contain multiple accounts that have some logical relationship to each other and a set of delegated administrators with some authority over the domain and its subdomains.
// This field is considered as a fully qualified domain name which is the same as the domain path without the "ROOT/" prefix. For example, if "foo" is specified then a domain with the "ROOT/foo" domain path is picked.
// The value "ROOT" is a special case that points to "the" ROOT domain of the CloudStack. That is, a domain with a path "ROOT/ROOT" is not allowed.
Domain string `json:"domain"`
// Zone is the single zone, managed by one CloudStack management endpoint, that this failure domain uses.
Zone CloudStackZone `json:"zone"`
// Account typically represents a customer of the service provider or a department in a large organization. Multiple users can exist in an account, and all CloudStack resources belong to an account. Accounts have users and users have credentials to operate on resources within that account. If an account name is provided, a domain must also be provided.
Account string `json:"account,omitempty"`
// CloudStack Management API endpoint's IP. It is added to the VM's noproxy list.
ManagementApiEndpoint string `json:"managementApiEndpoint"`
```

> **Member:** Shouldn't this be `Zones []CloudstackZone`? Each domain can have multiple zones, right?
>
> **Author:** That's a good question; I believe not. Each failure domain can only have a single zone. If a customer wants a cluster spread across multiple zones, those would be two separate failure domains.
>
> **Contributor:** Yes, exactly. The customer ask is to enhance the CAPC Failure Domain, which currently is equivalent to an ACS Zone, to be Endpoint+Zone+Domain+Account.

We would then parse these resources and pass them into CAPC by modifying the templates we have already implemented. We can use this new model to read in credentials, perform preflight checks, plumb data to CAPC, and support upgrades in the controller. The goal is to make these new resources backwards compatible in code.
> **Member:** Curious how to map this failureDomain list to CAPC? Can we maybe have some examples here?
>
> **Author:** Currently, we pass zone data to CAPC in https://github.com/aws/eks-anywhere/blob/main/pkg/providers/cloudstack/config/template-cp.yaml#L38. We will do the same thing for failure domains. A failure domain should look the same as a zone does now, but also include account/domain and a URL for the management host.

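For illustration, here is a minimal sketch of how the reworked spec could be laid out: a single `FailureDomains` list whose entries carry the fields shown above. The type names, the `Name` field (used later to tie a failure domain to its credentials section), and the exact layout are assumptions for this sketch; the authoritative definitions are the ones in aws/eks-anywhere#2412.

```go
// Illustrative sketch only; type and field names are assumptions, not the final API.
type CloudStackDatacenterConfigSpec struct {
	// FailureDomains is the list of failure domains the cluster is spread across.
	// Each entry is an independent combination of endpoint, zone, domain, and account.
	FailureDomains []CloudStackFailureDomain `json:"failureDomains"`
}

type CloudStackFailureDomain struct {
	// Name uniquely identifies the failure domain and matches a section in the credentials file.
	Name string `json:"name"`
	// Domain is the CloudStack domain, e.g. "ROOT" or "foo" for the "ROOT/foo" domain path.
	Domain string `json:"domain"`
	// Zone is the single CloudStack zone this failure domain places machines in.
	Zone CloudStackZone `json:"zone"`
	// Account is the CloudStack account; if set, Domain must also be set.
	Account string `json:"account,omitempty"`
	// ManagementApiEndpoint is the URL of this failure domain's CloudStack management API.
	ManagementApiEndpoint string `json:"managementApiEndpoint"`
}
```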

### Failure Domain

A failure domain is a CAPI concept that improves availability by distributing machines across failure domains, as discussed [here](https://cluster-api.sigs.k8s.io/developer/providers/v1alpha2-to-v1alpha3.html?highlight=domain#optional-support-failure-domains).
CAPC currently utilizes them to distribute machines across CloudStack Zones. However, we now want to go a step further to support our customer and consider the following unique combination to be a failure domain:

1. Cloudstack endpoint
2. Cloudstack domain
3. Cloudstack zone
4. Cloudstack account

You can find more information about these Cloudstack resources [here](http://docs.cloudstack.apache.org/en/latest/conceptsandterminology/concepts.html#cloudstack-terminology).

### `CloudstackDatacenterConfig` Validation

With the multi-endpoint system for the Cloudstack provider, customers reference a CloudstackMachineConfig and it is created across multiple failure domains. The implication is that all the Cloudstack resources such as image, ComputeOffering, ISOAttachment, etc. must be available in *all* the failure domains (i.e. on all the Cloudstack API endpoints), and these resources must be referenced by name, not unique ID. This means that for each CloudstackMachineConfig, we have to make sure that all the prerequisite Cloudstack resources are available in all the Cloudstack API endpoints.
> **Member:** I believe we also support reference by ID now. If we are removing support for ID, we should probably include that plan in this doc.
>
> **Author (Jun 15, 2022):** I think we could include a preflight check which fails on the following conditions:
> 1. more than one Cloudstack endpoint is configured
> 2. the Cloudstack datacenter config uses an ID to reference a resource (zone, network, domain, account)
> 3. Cloudstack machine configs use an ID to reference a resource (DiskOffering, ComputeOffering, template, affinityGroupIds)
>
> **Member:** That would work, for sure. However, is the benefit worth all the effort? I'm not necessarily concerned about writing that validation, but about the burden of maintaining only partial support for names in the codebase. That seems like extra unnecessary work. Just for my understanding, why wouldn't specifying the name work for multiple availability zones? Shouldn't it work as long as the resource exists with the same name in all zones?
>
> **Author:** On second thought, we can also keep the existing preflight check logic, which supports both name and ID. It will simply fail when it tries to look up a resource by ID on a Cloudstack endpoint where that resource ID does not exist. That should be sufficient, and that way we can maintain support for the ID feature while still adding support for multi-AZ. And yes, you're right: specifying the name is the only way users can specify resources across multiple availability zones. The issue is when users specify a resource ID, since that may vary between availability zones.
>
> **Member:** Oh, I had it in reverse. Leaving the validation as is works for me. It still prevents users from submitting an invalid configuration, and it doesn't add extra work now or for maintenance. Whenever we move to v1alpha2, we should probably consider removing support for ID.

In practice, the pseudocode would look like:

```
for failureDomain in failureDomains:
    for machineConfig in machineConfigs:
        validate resource presence with the failureDomain's instance of the CloudMonkey executable
```

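As a slightly more concrete (but still hypothetical) Go fragment of that loop: `newCmkClient`, `credentialsFor`, and `validateMachineConfigResources` below are stand-ins for the existing cloudmonkey-backed validator plumbing rather than real EKS-A APIs, and the usual `context`/`fmt` imports are assumed.

```go
// Hypothetical sketch: run the existing machine-config resource validations once
// per failure domain, each time against that failure domain's own endpoint.
func validateAcrossFailureDomains(ctx context.Context, failureDomains []CloudStackFailureDomain, machineConfigs []CloudStackMachineConfig) error {
	for _, fd := range failureDomains {
		// Each failure domain gets its own CloudMonkey client, configured with that
		// endpoint's URL and credentials.
		cmk := newCmkClient(fd.ManagementApiEndpoint, credentialsFor(fd.Name))
		for _, mc := range machineConfigs {
			// Every referenced resource (template, compute offering, disk offering,
			// affinity groups) must resolve in this failure domain.
			if err := cmk.validateMachineConfigResources(ctx, fd, mc); err != nil {
				return fmt.Errorf("validating machine config %s in failure domain %s: %v", mc.Name, fd.Name, err)
			}
		}
	}
	return nil
}
```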
### Cloudstack credentials


In a multi-endpoint Cloudstack cluster, each endpoint may have its own credentials. We propose that Cloudstack credentials be passed in via environment variable in the same way as they are currently, only as a list corresponding to failure domains.
> **Member:** What about letting the user specify a file path to the .ini and passing that as the env var?
>
> **Author:** I think we could do that as well if you prefer, although it might require some changes in the e2e test framework; I think there we're using the encoded value rather than any specific file. I'm not sure how/if we could download the file locally and then use it in the e2e tests, but it does seem easier for the customer. @vivek-koppuru any thoughts on this? I know we've discussed at length how to have customers pass data in.
>
> **Member:** Both options could work, right? What are we doing for Snow again?
>
> **Member:** For Snow we pass in the file path as env vars.
>
> **Author:** From some internal discussions, it feels like storing credentials in a transient environment variable is more secure than in a file written to disk. That is why we prefer the env var route, although it is slightly more effort for the customer.

Currently, these credentials are passed in via an environment variable containing a base64-encoded .ini file that looks like:

```ini
[Global]
api-key = redacted
secret-key = redacted
api-url = http://172.16.0.1:8080/client/api
```

We propose an extension of the above input mechanism so the user can provide credentials across multiple Cloudstack API endpoints, for example:

```ini
[FailureDomain1]
api-key = redacted
secret-key = redacted
api-url = http://172.16.0.1:8080/client/api

[FailureDomain2]
api-key = redacted
secret-key = redacted
api-url = http://172.16.0.2:8080/client/api

[FailureDomain3]
api-key = redacted
secret-key = redacted
api-url = http://172.16.0.3:8080/client/api

...
```

We are also exploring converting the ini file to a yaml input file which contains a list of credentials and their associated endpoints. Either way, this environment variable would
be passed along to CAPC and used by the CAPC controller just like it is currently.
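For illustration, the sketch below shows how a CLI could decode such a multi-section credentials file into per-failure-domain credentials. The environment variable name, the `endpointCredentials` struct, and the choice of `gopkg.in/ini.v1` as the parser are assumptions for this sketch, not a description of the current implementation.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"os"

	"gopkg.in/ini.v1"
)

// endpointCredentials holds one failure domain's CloudStack API credentials.
type endpointCredentials struct {
	APIKey    string
	SecretKey string
	APIURL    string
}

// loadCredentials decodes the base64-encoded ini file from an environment
// variable (name assumed here) and indexes the credentials by section name,
// so each [FailureDomainN] section can be matched to a failure domain.
func loadCredentials() (map[string]endpointCredentials, error) {
	raw, err := base64.StdEncoding.DecodeString(os.Getenv("EKSA_CLOUDSTACK_B64ENCODED_SECRET"))
	if err != nil {
		return nil, fmt.Errorf("decoding credentials env var: %v", err)
	}
	cfg, err := ini.Load(raw)
	if err != nil {
		return nil, fmt.Errorf("parsing credentials ini: %v", err)
	}
	creds := map[string]endpointCredentials{}
	for _, section := range cfg.Sections() {
		if section.Name() == ini.DefaultSection {
			continue // skip the implicit DEFAULT section
		}
		creds[section.Name()] = endpointCredentials{
			APIKey:    section.Key("api-key").String(),
			SecretKey: section.Key("secret-key").String(),
			APIURL:    section.Key("api-url").String(),
		}
	}
	return creds, nil
}

func main() {
	creds, err := loadCredentials()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for name, c := range creds {
		fmt.Printf("%s -> %s\n", name, c.APIURL)
	}
}
```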

### Backwards Compatibility

Our customer currently has clusters running with the old resource definition. In order to support backwards compatibility in the CloudstackDatacenterConfig resource, we can:
1. Make all the fields optional and check whether customers have the old fields set or the new ones
2. Introduce an eks-a version bump with conversion webhooks

Between these two approaches, I would take the first and then deprecate the legacy fields in a subsequent release to simplify the code paths.
> **Member:** I think we can still throw a warning if new clusters are created with the old fields.
>
> **Author:** That makes sense to me. In summary, we will continue to support the old fields until we can deprecate them.


However, given that the Cloudstack credentials are persisted in a write-once secret on the cluster, upgrading existing clusters may not be feasible unless CAPC supports overwriting that secret.
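For option 1, a conversion shim along the following lines could keep old specs working. It assumes the legacy fields remain on the spec (as optional) alongside the new list during the deprecation window, and the function and field names are illustrative rather than the actual transformer mentioned above.

```go
// Illustrative sketch of option 1: if only the legacy single-endpoint fields are
// set, fold them into the new failure-domain list so the rest of the codebase
// only ever deals with the new shape. Names here are assumptions, not the real API.
func setFailureDomainDefaults(spec *CloudStackDatacenterConfigSpec) {
	if len(spec.FailureDomains) > 0 || spec.ManagementApiEndpoint == "" {
		return // already on the new schema, or nothing legacy to convert
	}
	for _, zone := range spec.Zones {
		spec.FailureDomains = append(spec.FailureDomains, CloudStackFailureDomain{
			Name:                  "default-" + zone.Name,
			Domain:                spec.Domain,
			Zone:                  zone,
			Account:               spec.Account,
			ManagementApiEndpoint: spec.ManagementApiEndpoint,
		})
	}
}
```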

## User Experience


## Security

The main change regarding security is the additional credential management. Otherwise, we are doing exactly the same operations: preflight checks with cloudmonkey, creating yaml templates and applying them, and then reading/writing eks-a resources in the eks-a controller. The change is an extension of an existing mechanism, so it should not add any new risk surface area beyond what exists today.

## Testing

The new code will be covered by unit and e2e tests, and the e2e framework will be extended to support cluster creation across multiple Cloudstack API endpoints.

The following e2e test will be added:

simple flow cluster creation/deletion across multiple Cloudstack API endpoints:

* create a management+workload cluster spanning multiple Cloudstack API endpoints
* delete cluster

## Other approaches explored

Another direction we could take to support this feature is to refactor the entire EKS-A codebase so that, instead of all the failure domains living inside the CloudstackDatacenterConfig object, each CloudstackDatacenterConfig itself corresponds to a single failure domain. The top-level EKS-A Cluster object would then be refactored to hold a list of DatacenterRefs instead of a single one. However, this approach feels extremely invasive to the product and does not provide tangible value to the other providers.
> **Member:** It may actually provide value, as we have talked about wanting to support multiple environments in the future. However, I can see why it's a bigger step with less data for a specific provider.
>
> **Author:** Thanks for the comments. Let's discuss in more detail tomorrow.
>
> **Member:** I like that you listed the alternative solution here. However, this seems like it would be bleeding a provider-specific topology into the main cluster object. If we want to make a more holistic API change that includes other providers, we would need to consider what pattern makes sense for them and find the common denominator among all, and I don't think we have done that.