# MeshService multizone UX

* Status: accepted

Technical Story: https://github.com/kumahq/kuma/issues/9706

## Context and Problem Statement

With the introduction of `MeshService`, we need to decide how it works with multizone deployments.

There are two main use cases for multicluster connectivity:

**Access service from another cluster**
Service A is deployed only in zone X, and a service in zone Y wants to consume it.

**Replicate service over multiple clusters**
Service A is deployed across multiple clusters for high availability.
This strategy is used for "lift and shift" - migrating a workload from on-prem to Kubernetes.

The current state is that a service is represented by the `kuma.io/service` tag, and as long as the tag is the same between zones, we consider it the same service.
This means that a Kube service with the same name and namespace in two clusters is considered the same.
By default, if you don't have a `MeshLoadBalancingStrategy` in the cluster, we load balance between local and remote zones.

This has a security disadvantage: if you take over a zone, you can advertise any service.

## Considered Options

* Access service from another cluster use case
* MeshService synced to other zones
* Service export / import
* Export field
* MeshTrafficPermission
* Replicated services use case
* MeshService with dataplaneTags selector applied on global CP
* MeshMultiZoneService
  * MeshService with meshServiceSelector applied on global CP

## Decision Outcome

Chosen options:
* Access service from another cluster use case
* MeshService synced to other zones, because it's the only way
  * Export field, because it's more explicit and more manageable
* Replicated services use case
  * MeshMultiZoneService, with the possibility of moving to "MeshService with meshServiceSelector applied on global CP" based on the future policy matching MADR.

## Pros and Cons of the Options

### Access service from another cluster use case

#### MeshService synced to other zones

MeshService is an object created in the zone that matches proxies in that specific zone.
This object is then synced to the global control plane.

To solve the first use case we can then sync MeshService to other zones.
Let's say we have the following MeshService in zone `east`
```yaml
kind: MeshService
metadata:
  name: redis
  namespace: redis-system
  labels:
    kuma.io/display-name: redis
    k8s.kuma.io/namespace: redis-system
    kuma.io/zone: east
spec:
  selector:
    dataplaneTags:
      app: redis
      k8s.kuma.io/namespace: redis-system
  ports: ...
status: ...
```

This will be synced to global and look like this:
```yaml
kind: MeshService
metadata:
  name: redis-xyz12345 # we add a hash suffix of (mesh, zone, namespace)
  namespace: kuma-system # note that it's synced to the system namespace
  labels:
    kuma.io/display-name: redis # original name of the object on Kubernetes
    k8s.kuma.io/namespace: redis-system
    kuma.io/zone: east
    kuma.io/origin: zone
spec:
  selector:
    dataplaneTags:
      app: redis
      k8s.kuma.io/namespace: redis-system
      kuma.io/zone: east
  ports: ...
status: ...
```

This will then be synced to other zones in this form, which is the same syncing strategy we already use for ZoneIngress.

Because of the hashes and syncing only to system namespace, we avoid clashes of synced services and existing services in the cluster.
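The hash-suffix scheme can be sketched roughly as follows (illustrative only; Kuma's actual hash function and truncation length may differ):

```python
import hashlib

# Illustrative sketch: derive a stable suffix from (mesh, zone, namespace)
# so MeshServices synced from different zones/namespaces never clash with
# each other or with existing objects on the receiving cluster.
def synced_name(display_name: str, mesh: str, zone: str, namespace: str) -> str:
    digest = hashlib.sha256(f"{mesh}/{zone}/{namespace}".encode()).hexdigest()
    return f"{display_name}-{digest[:8]}"
```

Because the suffix is derived deterministically, re-syncing the same MeshService always produces the same name, while a redis from a different zone or namespace gets a different one.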

**Hostname and VIP management**

Advertising the service is only part of the solution. We also need to provide a way to consume this service.

We cannot reuse:
* existing hostnames like `redis.redis-system.svc.cluster.local`, because there might already be a redis in cluster `east`.
* the ClusterIP, because each cluster network is isolated, so the IPs might clash.

Therefore, the `status` object should not be synced from zone `east` to zone `west`.

To provide a hostname, we can use the `HostnameGenerator` described in MADR #XXX (todo provide a number when merged).
```yaml
type: HostnameGenerator
name: k8s-zone-hostnames
spec:
  template: '{{ label "kuma.io/display-name" }}.{{ label "k8s.kuma.io/namespace" }}.svc.mesh.{{ label "kuma.io/zone" }}'
```
so that the synced service in zone `west` gets the hostname `redis.redis-system.svc.mesh.east`.
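The template rendering can be mimicked with a small sketch (hypothetical; the real implementation uses a Go template engine, and `render_hostname` is an assumed name that only re-implements the `label` function):

```python
import re

# Hypothetical re-implementation of the `label` template function:
# substitute each {{ label "<key>" }} with the MeshService's label value.
def render_hostname(template: str, labels: dict) -> str:
    return re.sub(r'\{\{ label "([^"]+)" \}\}',
                  lambda m: labels[m.group(1)], template)

labels = {
    "kuma.io/display-name": "redis",
    "k8s.kuma.io/namespace": "redis-system",
    "kuma.io/zone": "east",
}
template = ('{{ label "kuma.io/display-name" }}.'
            '{{ label "k8s.kuma.io/namespace" }}.svc.mesh.'
            '{{ label "kuma.io/zone" }}')
# render_hostname(template, labels) -> "redis.redis-system.svc.mesh.east"
```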

To assign a Kuma VIP, we follow the process described in MADR #XXX (todo provide a number when merged).

When generating the XDS configuration for a client in zone `west`, we check whether the MeshService has a `kuma.io/zone` label that is not equal to the local zone.
If so, we need to point the Envoy cluster at the endpoint of the ZoneIngress of zone `east`.
This way we get rid of `ZoneIngress#availableServices`, which was one of the initial motivations for introducing MeshService (see MADR #YYY).
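The zone check can be sketched as follows (the function and parameter names are assumptions; the real XDS generation is considerably more involved):

```python
# Sketch of the routing rule above: a MeshService labeled with a remote
# kuma.io/zone is reached through that zone's ZoneIngress, while a local
# one is reached through its own endpoints.
def cluster_endpoints(service_labels: dict, local_zone: str,
                      remote_zone_ingress: list, local_endpoints: list) -> list:
    service_zone = service_labels.get("kuma.io/zone", local_zone)
    if service_zone != local_zone:
        return remote_zone_ingress
    return local_endpoints
```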

#### Service export / import

One thing to consider is whether we should sync all MeshServices to other clusters by default.
It's reasonable to assume that only a subset of services is consumed cross-zone.

Trimming it could help with:
* performance. Fewer objects to store and sync around.
  However, changes to a MeshService's spec are rather rare, so we don't consider this a significant performance issue.
* security. An unexported MeshService would not even be exposed through the ZoneIngress.

Explicit import seems to have very little value, because clients need to explicitly use a service from a specific zone.

##### Export field

We can introduce a `spec.export.mode` field on MeshService with the values `All` and `None`.
We want the mode nested under `export` so we can later extend the API if needed, for example `spec.export.mode: Subset` with `spec.export.list: zone-a,zone-b`.
On Kubernetes, this would be controlled via a `kuma.io/export-mode` annotation on the `Service` object.
On Universal, where MeshService is synthesized from `Dataplane` objects, this would be controlled via a `kuma.io/export-mode` tag on the `Dataplane` object.

If missing, the default behavior from the kuma-cp configuration is applied; the default CP configuration is `spec.export.mode: All`.
Then it's up to the mesh operator whether they want to sync everything by default or opt in exported services.
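On Kubernetes, opting a single service out of syncing could look like this (illustrative fragment; it only assumes the `kuma.io/export-mode` annotation named above):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: redis
  namespace: redis-system
  annotations:
    kuma.io/export-mode: None # do not sync the resulting MeshService to other zones
spec:
  selector:
    app: redis
  ports:
  - port: 6379
```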

##### MeshTrafficPermission

Similarly to what we do with reachable services (see [MADR 031](https://github.com/kumahq/kuma/blob/master/docs/madr/decisions/031-automatic-rechable-services.md)),
we could implicitly check whether a service can be consumed by a proxy in another zone and only then sync the service.
This, however, is difficult: we either need to require `kuma.io/zone` in the permission for it to work, or check which proxies are in other zones, which is expensive.

### Replicated services use case

As opposed to the current solution with `kuma.io/service`, redis in zone `east` is a different service from redis in zone `west`.
There are two ways we could go about aggregating these services.

#### MeshService with dataplaneTags selector applied on global CP

To achieve replicated service, we define a MeshService on global CP.
This MeshService is then synced to all zones.

```yaml
kind: MeshService
metadata:
  name: redis-xzh64d
  namespace: kuma-system
  labels:
    team: db-operators
    kuma.io/mesh: default
    kuma.io/origin: global
spec:
  selector:
    dataplane:
      tags: # it has no kuma.io/zone so it aggregates all redis from redis-system across all zones
        app: redis
        k8s.kuma.io/namespace: redis-system
status:
  ports:
  - port: 1234
    protocol: tcp
  addresses:
  - hostname: redis.redis-system.svc.mesh.global
    status: Available
    origin: global-generator
  vips:
  - ip: 241.0.0.1
    type: Kuma
  zones: # computed on global; the rest of the status fields are computed on the zone
  - name: east
  - name: west
```

The disadvantages:
* We need to compute the list of zones on the global CP, because we don't sync all Dataplanes across all zones.
  This can be expensive, because global holds the list of all data plane proxies.
* We wouldn't be able to apply a "multizone" service in one specific zone, because a single zone doesn't have the DPPs of all zones.

**Hostname and VIP management**

We can provide hostnames by selecting on the `kuma.io/origin` label.

```yaml
type: HostnameGenerator
name: global-generator
spec:
  meshServiceSelector:
    matchLabels:
      kuma.io/origin: global
  template: '{{ label "kuma.io/display-name" }}.{{ label "k8s.kuma.io/namespace" }}.svc.mesh.global'
```

#### MeshMultiZoneService

If we assume that a replicated service is an aggregation of multiple services, we can define a dedicated object for it.
`MeshMultiZoneService` selects services instead of proxies.

```yaml
kind: MeshMultiZoneService
metadata:
  name: redis-xzh64d
  namespace: kuma-system
  labels:
    team: db-operators
    kuma.io/mesh: default
spec:
  meshServiceSelector: # we select services by labels, not dataplane objects
    matchLabels:
      k8s.kuma.io/service-name: redis
      k8s.kuma.io/namespace: redis-system
status: # computed on the zone
  ports:
  - port: 1234
    protocol: tcp
  addresses:
  - hostname: redis.redis-system.svc.mesh.global
    status: Available
    origin: global-generator
  vips:
  - ip: 241.0.0.1
    type: Kuma
  zones:
  - name: east
  - name: west
```

Considered name alternatives:
* MeshGlobalService
* GlobalMeshService

The advantages:
* We don't need to compute anything on the global CP. In the example above, both redis MeshServices have already been synced to the zone.
* We don't need to traverse all data plane proxies.
* Users don't need to redefine ports.
* Policy matching is easier to implement (this will be covered in the policy matching MADR).
* We can expose a capability to apply this locally in one zone.

The disadvantages:
* It's a separate resource.
* We cannot paginate both kinds of services on one page in the GUI.

**Hostname and VIP management**

HostnameGenerator has to select `MeshMultiZoneService`, so we need a `targetRef` that can accept both `MeshService` and `MeshMultiZoneService`.

```yaml
type: HostnameGenerator
name: global-generator
spec:
  targetRef:
    kind: MeshMultiZoneService
  template: '{{ label "kuma.io/display-name" }}.{{ label "k8s.kuma.io/namespace" }}.svc.mesh.global'
```

#### MeshService with meshServiceSelector applied on global CP

Alternatively, we can combine both solutions: a `MeshService`, but with a `meshServiceSelector`.

```yaml
kind: MeshService
metadata:
  name: redis-xzh32a
  namespace: kuma-system
  labels:
    team: db-operators
    kuma.io/mesh: default
    kuma.io/origin: global
spec:
  meshServiceSelector: # we select services by labels, not dataplane objects
    matchLabels:
      k8s.kuma.io/service-name: redis
      k8s.kuma.io/namespace: redis-system
```

The advantages:
* Same as `MeshMultiZoneService`

The disadvantages:
* It might be confusing that a MeshService may select either data plane proxies or services.
* Policy matching might be more convoluted (this will be covered in the policy matching MADR).

**Hostname and VIP management**

Same as with "MeshService with dataplaneTags selector applied on global CP".

### Explicit namespace sameness

Whether we choose `MeshMultiZoneService` or `MeshService` for the replicated services use case, it will require an extra step compared to the existing implementation.
This extra step is defining an object that represents the global service.
This is a disadvantage if a user only wants to use global services: they would need to create every single one of them.
If we see demand for it, we can create a controller on the global CP that would generate global services that share the same set of defined tags (for example: `kuma.io/display-name` + `k8s.kuma.io/namespace`).
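Such a controller's grouping logic could be sketched like this (hypothetical; the function name and input shape are assumptions, not an existing Kuma API):

```python
from collections import defaultdict

# Hypothetical sketch: group synced MeshServices by the
# (kuma.io/display-name, k8s.kuma.io/namespace) label pair and report an
# aggregate ("global service") candidate for every group that spans more
# than one zone.
def global_service_candidates(mesh_services: list) -> dict:
    groups = defaultdict(set)
    for svc in mesh_services:
        labels = svc["labels"]
        key = (labels["kuma.io/display-name"], labels["k8s.kuma.io/namespace"])
        groups[key].add(labels["kuma.io/zone"])
    return {key: sorted(zones) for key, zones in groups.items() if len(zones) > 1}
```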

## Multicluster API

The [Multicluster API](https://github.com/kubernetes/enhancements/blob/master/keps/sig-multicluster/1645-multi-cluster-services-api/README.md) defines a way to manage services across multiple clusters. It works by introducing two objects: `ServiceImport` and `ServiceExport`.
To expose a service, a user needs to apply a `ServiceExport`:

```yaml
apiVersion: multicluster.k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: redis
  namespace: kuma-demo
```

and then, in every cluster that wants to consume it, a `ServiceImport` needs to be applied:
```yaml
apiVersion: multicluster.k8s.io/v1alpha1
kind: ServiceImport
metadata:
  name: redis
  namespace: kuma-demo
spec:
  ips:
  - 42.42.42.42
  type: "ClusterSetIP"
  ports:
  - protocol: TCP
    port: 6379
status:
  clusters:
  - cluster: east
```

The service can then be addressed as `redis.kuma-demo.svc.clusterset.local`.

* The export field controlled by an annotation, described in this MADR, is the equivalent of `ServiceExport`.
* A synced MeshService is similar to `ServiceImport`.
  The difference is that the object is synced to the mesh system namespace, and a synced service is bound to a specific zone.
  We get access to the services of a **specific** cluster without any extra importing action,
  for example `redis-kuma-demo.svc.cluster.east`.
* MeshMultiZoneService is the equivalent of `ServiceImport`.
  It is also "zone agnostic" - the client does not care in which zone the Service is deployed.
  Just like `ServiceImport`, it requires an extra manual step - creating an object.
  However, you don't need to create it in every cluster: you can create a MeshMultiZoneService on the global CP, and it's synced to every cluster.
  When applied on the global CP, it's synced to the system namespace.

MeshService + MeshMultiZoneService seems to be the more flexible approach.

This approach does not block us from adopting the Multicluster API in the future if it gains traction. We could either:
* Convert `ServiceImport` to MeshMultiZoneService and write a controller that listens on `ServiceExport` to annotate the `Service` with `kuma.io/export`.
* Natively switch to them as the primary objects. This might be tricky when we consider supporting Universal deployments.