Pods Egress DSCP QoS proposal
Signed-off-by: Ori Braunshtein <obraunsh@redhat.com>
oribon committed Feb 16, 2022
1 parent 5d153f6 commit e0a1867
Showing 1 changed file with 107 additions and 0 deletions.
107 changes: 107 additions & 0 deletions enhancements/network/egress-qos.md
@@ -0,0 +1,107 @@
---
title: OVN Pods Egress DSCP QoS
authors:
- "@oribon"
reviewers:
- "@trozet"
approvers:
- TBD
creation-date: 2022-02-16
last-updated: 2022-02-16
status: implementable
---

# OVN Pods Egress DSCP QoS

## Summary

Not all traffic has the same priority, and when there is contention for bandwidth, devices outside the cluster need a mechanism to prioritize it.
To enable this, we will use the Differentiated Services Code Point (DSCP), a 6-bit field in the IP header that classifies packets, effectively marking the priority of a given packet relative to other packets as "Critical", "High Priority", "Best Effort" and so on.
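
As a small illustration of the 6-bit field mentioned above: DSCP occupies the upper six bits of the 8-bit IPv4 ToS (IPv6 Traffic Class) byte, with the lower two bits used for ECN. The sketch below shows that packing; the helper names `dscp_to_tos` and `tos_to_dscp` are hypothetical and not part of this proposal.

```python
def dscp_to_tos(dscp: int, ecn: int = 0) -> int:
    """Pack a 6-bit DSCP value and a 2-bit ECN value into the 8-bit ToS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError(f"DSCP must fit in 6 bits (0-63), got {dscp}")
    if not 0 <= ecn <= 3:
        raise ValueError(f"ECN must fit in 2 bits (0-3), got {ecn}")
    # DSCP sits in the upper six bits, ECN in the lower two.
    return (dscp << 2) | ecn


def tos_to_dscp(tos: int) -> int:
    """Extract the DSCP value from an 8-bit ToS byte, discarding the ECN bits."""
    if not 0 <= tos <= 255:
        raise ValueError(f"ToS must fit in 8 bits (0-255), got {tos}")
    return tos >> 2
```

For instance, DSCP 46 (Expedited Forwarding) corresponds to a ToS byte of `0xB8` when the ECN bits are zero.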

By introducing a new CRD `PodQos`, users can specify a DSCP value for packets originating from pods in a set of namespaces and heading to a specified CIDR.
The CRs will be watched by OVN-Kubernetes (ovn-k), which in turn will configure OVN's [QoS Table](https://man7.org/linux/man-pages/man5/ovn-nb.5.html#QoS_TABLE).

## Motivation

Telco customers require support for DSCP marking capability for some of their 5G applications, giving some pods precedence over others.
The QoS markings will be consumed and acted upon by objects outside of the OpenShift cluster to optimize traffic flow throughout their networks.

### Goals

- Provide a mechanism for users to set DSCP on egress traffic coming from specific namespaces.

### Non-Goals

- Ingress QoS.

- Consolidating with current `kubernetes.io/egress-bandwidth` and `kubernetes.io/ingress-bandwidth` annotations.
Nonetheless, the work done here does not interfere with the current bandwidth QoS mechanism. # Q: Is this point relevant?

- Handling or acting upon the DSCP marking within OpenShift. The marking only needs to be added to the selected headers.

- Marking East/West traffic, exposing the DSCP value from the inner packet to the outer geneve packet. # Q: is this a goal? If so does it mean we allow dstCIDR to be empty / expect the user to specify an internal pod ip? https://www.mail-archive.com/ovs-dev@openvswitch.org/msg59787.html

## Proposal

To achieve egress DSCP marking on pods, we introduce a new CRD `PodQos`, which lets users specify a namespaceSelector, a DSCP value and a destination CIDR. Traffic coming from pods in the selected namespaces and heading to the destination CIDR will be marked with the given DSCP value.

```yaml
kind: PodQos
apiVersion: k8s.ovn.org/v1
metadata:
  name: ns1-46
  namespace: default # Q: Should this be cluster-scoped?
spec:
  # Q: Is it necessary to leave room for potential ingress configuration in the future or have this CRD focused on egress?
  # Q: is having a field "direction" with "egress" as the only possible value for now better?
  egress:
    namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: ns1
    dscp: 46
    dstCIDR: 0.0.0.0/0 # Q: Do we want this CIDR / slice of CIDRs/IPs / single IP / ... ?
```
OVN-K watches these resources, creating/updating/deleting QoS objects in the OVN northbound database (nbdb) accordingly.
For example, assuming there is a single pod `app` in namespace `ns1` on node `node1` and the above `PodQos` has been created, the equivalent of the following will be executed:
`ovn-nbctl qos-add <node1> from-lport <priority> "inport == \"ns1_app\" && ip4.dst == 0.0.0.0/0" dscp=46` # Q: Will we have a fixed priority?
In addition, OVN-K watches namespaces and pods to decide whether further updates are needed.
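
The mapping from a `PodQos` and a pod to the `qos-add` invocation above can be sketched as follows. This is only an illustration under stated assumptions: the function name `qos_add_args` is hypothetical, the priority value is a placeholder (whether it is fixed is still an open question), the `<namespace>_<pod>` logical port naming is taken from the example above, and only IPv4 destinations are handled.

```python
import ipaddress
import shlex


def qos_add_args(switch: str, namespace: str, pod: str,
                 dst_cidr: str, dscp: int, priority: int) -> list[str]:
    """Build the argv for an `ovn-nbctl qos-add` call marking a pod's egress traffic."""
    if not 0 <= dscp <= 63:
        raise ValueError(f"DSCP must be in 0-63, got {dscp}")
    # ip_network raises ValueError on malformed CIDRs or set host bits.
    net = ipaddress.ip_network(dst_cidr)
    if net.version != 4:
        raise ValueError("this sketch only covers IPv4 destinations")
    # Logical port name follows the "<namespace>_<pod>" form used in the example.
    lport = f"{namespace}_{pod}"
    match = f'inport == "{lport}" && ip4.dst == {net}'
    return ["ovn-nbctl", "qos-add", switch, "from-lport",
            str(priority), match, f"dscp={dscp}"]


# Placeholder priority of 100; the proposal has not fixed this value.
args = qos_add_args("node1", "ns1", "app", "0.0.0.0/0", 46, 100)
print(shlex.join(args))
```

Returning an argument list rather than a single string avoids hand-written shell quoting of the match expression; `shlex.join` is used only to display it.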

### User Stories

### API Extensions

### Implementation Details/Notes/Constraints

- Adding a new CRD `PodQos` under the `k8s.ovn.org/v1` version to `pkg/crd`.

- Adding a controller that watches `PodQoses`, `Pods` and `Namespaces`, producing the relevant QoS objects in nbdb.
It must also cover the scenarios where overlapping resources are requested.
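
One possible shape for the controller's reconcile step is sketched below, assuming a simple "compute desired rules, diff against existing ones" approach. Everything here is hypothetical: the `QosRule` type, the plain-dict inputs and the helper names are illustrative only, and the sketch merely detects overlapping requests without choosing how to resolve them, since the proposal leaves that open.

```python
from collections import defaultdict
from typing import NamedTuple


class QosRule(NamedTuple):
    switch: str    # node logical switch
    lport: str     # "<namespace>_<pod>" logical port
    dst_cidr: str
    dscp: int


def selects(match_labels: dict, ns_labels: dict) -> bool:
    """True if every matchLabels entry is present on the namespace (empty selects all)."""
    return all(ns_labels.get(k) == v for k, v in match_labels.items())


def desired_rules(pod_qoses: list, namespaces: dict, pods: list) -> set:
    """Expand each PodQos into one rule per pod in each selected namespace."""
    rules = set()
    for pq in pod_qoses:
        for ns, labels in namespaces.items():
            if not selects(pq["matchLabels"], labels):
                continue
            for pod in pods:
                if pod["namespace"] == ns:
                    rules.add(QosRule(pod["node"], f'{ns}_{pod["name"]}',
                                      pq["dstCIDR"], pq["dscp"]))
    return rules


def diff(desired: set, existing: set) -> tuple:
    """Return (rules to add, rules to delete) to move existing towards desired."""
    return desired - existing, existing - desired


def conflicts(rules: set) -> dict:
    """Group rules that target the same port and CIDR with different DSCP values."""
    by_target = defaultdict(set)
    for r in rules:
        by_target[(r.switch, r.lport, r.dst_cidr)].add(r.dscp)
    return {t: d for t, d in by_target.items() if len(d) > 1}
```

Recomputing the full desired set on any `PodQos`, pod or namespace event keeps the logic simple at the cost of extra work per event; an incremental approach would be an alternative design.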
### Risks and Mitigations

## Design Details

### Test Plan

### Graduation Criteria

#### Dev Preview -> Tech Preview

#### Tech Preview -> GA

#### Removing a deprecated feature

### Upgrade / Downgrade Strategy

### Version Skew Strategy

### Operational Aspects of API Extensions

#### Failure Modes

#### Support Procedures

## Implementation History

## Drawbacks

## Alternatives
