Add document for BGPPolicy #6524

Merged 1 commit on Jul 25, 2024
219 changes: 219 additions & 0 deletions docs/bgp-policy.md
@@ -0,0 +1,219 @@
# BGPPolicy

## Table of Contents

<!-- toc -->
- [What is BGPPolicy?](#what-is-bgppolicy)
- [Prerequisites](#prerequisites)
- [The BGPPolicy resource](#the-bgppolicy-resource)
  - [NodeSelector](#nodeselector)
  - [LocalASN](#localasn)
  - [ListenPort](#listenport)
  - [Advertisements](#advertisements)
  - [BGPPeers](#bgppeers)
- [BGP router ID](#bgp-router-id)
- [BGP Authentication](#bgp-authentication)
- [Example Usage](#example-usage)
  - [Combined Advertisements of Service, Pod, and Egress IPs](#combined-advertisements-of-service-pod-and-egress-ips)
  - [Advertise Egress IPs to external BGP peers with more than one hop](#advertise-egress-ips-to-external-bgp-peers-with-more-than-one-hop)
- [Limitations](#limitations)
<!-- /toc -->

## What is BGPPolicy?

`BGPPolicy` is a custom resource that allows users to run a BGP process on selected Kubernetes Nodes and advertise
Service IPs, Pod IPs, and Egress IPs to remote BGP peers, facilitating the integration of Kubernetes workloads with an
external BGP-enabled network.

## Prerequisites

BGPPolicy was introduced in Antrea v2.1 as an alpha feature. To use it, enable the `BGPPolicy` feature gate for
antrea-agent in the `antrea-config` ConfigMap, as follows:

```yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: antrea-config
  namespace: kube-system
data:
  antrea-agent.conf: |
    featureGates:
      BGPPolicy: true
```

## The BGPPolicy resource

A BGPPolicy is a Kubernetes custom resource, defined by a Custom Resource Definition (CRD).

The following manifest creates a BGPPolicy object. It will start a BGP process with ASN `64512`, listening on port `179`,
on Nodes labeled with `bgp=enabled`. The process will advertise LoadBalancerIPs and ExternalIPs to a BGP peer at IP
address `192.168.77.200`, which has ASN `65001` and listens on port `179`:

```yaml
apiVersion: crd.antrea.io/v1alpha1
kind: BGPPolicy
metadata:
  name: example-bgp-policy
spec:
  nodeSelector:
    matchLabels:
      bgp: enabled
  localASN: 64512
  listenPort: 179
  advertisements:
    service:
      ipTypes: [LoadBalancerIP, ExternalIP]
  bgpPeers:
    - address: 192.168.77.200
      asn: 65001
      port: 179
```

### NodeSelector

The `nodeSelector` field selects which Kubernetes Nodes the BGPPolicy applies to based on the Node labels. The field is
mandatory.

**Note**: If multiple BGPPolicy objects select the same Node, the one with the earliest creation time will be chosen
as the effective BGPPolicy.
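
As an illustration, the fragment below combines an exact label match with a set-based expression. It is a sketch that
assumes `nodeSelector` follows the standard Kubernetes label selector schema (`matchLabels` and `matchExpressions`); the
zone values are hypothetical.

```yaml
# Sketch only: assumes nodeSelector accepts the standard Kubernetes
# label selector fields. The zone names are hypothetical.
spec:
  nodeSelector:
    matchLabels:
      bgp: enabled
    matchExpressions:
      - key: topology.kubernetes.io/zone
        operator: In
        values: [zone-a, zone-b]
```

Under that assumption, a Node would have to carry the `bgp=enabled` label and be in one of the listed zones to be
selected.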

### LocalASN

The `localASN` field defines the Autonomous System Number (ASN) that the local BGP process uses. The private-use range
for 2-byte ASNs is `64512-65534` (RFC 6996); `65535` is reserved. The field is mandatory.

### ListenPort

The `listenPort` field specifies the port on which the BGP process listens. The default value is 179. The valid port
range is `1-65535`.

### Advertisements

The `advertisements` field configures which IPs are advertised to BGP peers.

- `pod`: Specifies how to advertise Pod IPs. The Node IPAM Pod CIDRs will be advertised by setting `pod: {}`. Note that
  IPs allocated by Antrea Flexible IPAM are not yet supported.
- `egress`: Specifies how to advertise Egress IPs. All Egress IPs will be advertised by setting `egress: {}`. A Node will
  only advertise Egress IPs which are local (i.e., assigned to the Node).
- `service`: Specifies how to advertise Service IPs. The `ipTypes` field lists the types of Service IPs to be advertised,
  which can include `ClusterIP`, `ExternalIP`, and `LoadBalancerIP`.
  - All Nodes can advertise all ClusterIPs, respecting `internalTrafficPolicy`. If `internalTrafficPolicy` is set to
    `Local`, a Node will only advertise ClusterIPs with at least one local Endpoint.
Comment on lines +101 to +102

Contributor:

This is a bit surprising, because it feels like internalTrafficPolicy is not something that should impact traffic coming from external routers. However, ClusterIPs were not meant to be externally routable anyway so it's already a bit of an unusual situation. cc @tnqn. If this is the behavior that was agreed upon, then this is fine by me.

Member:

Right, I thought about this and didn't figure out a better implementation. I think in practice users will only advertise ClusterIPs when they don't have any ExternalIPs and LoadBalancerIPs, and they intend to reduce costs and improve performance by setting the internal traffic policy to local, then it seems fine to only advertise from Nodes that have local Pods to achieve the same goals for external traffic.

Besides, I suppose making external-to-clusterIP traffic not enforce internalTrafficPolicy would impact Antrea Proxy's implementation a lot.

Contributor:

> I suppose making external-to-clusterIP traffic not enforce internalTrafficPolicy would impact Antrea Proxy's implementation a lot.

makes sense

  - All Nodes can advertise all ExternalIPs and LoadBalancerIPs, respecting `externalTrafficPolicy`. If
    `externalTrafficPolicy` is set to `Local`, a Node will only advertise IPs with at least one local Endpoint.
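
To make the traffic-policy behavior concrete, here is a sketch of a Service whose LoadBalancer IP would be advertised
only from Nodes that host at least one of its Endpoints, given a BGPPolicy that lists `LoadBalancerIP` in `ipTypes`.
The Service name, selector, and ports are hypothetical.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web              # hypothetical name
  namespace: default
spec:
  type: LoadBalancer
  # With Local, only Nodes that have a local Endpoint for this Service
  # advertise its LoadBalancer IP.
  externalTrafficPolicy: Local
  selector:
    app: web             # hypothetical label
  ports:
    - port: 80
      targetPort: 8080
```

With the default `externalTrafficPolicy: Cluster`, every Node selected by the BGPPolicy would advertise the IP instead.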

### BGPPeers

The `bgpPeers` field lists the BGP peers to which the advertisements are sent.

- `address`: The IP address of the BGP peer.
- `asn`: The Autonomous System Number of the BGP peer.
- `port`: The port number on which the BGP peer listens. The default value is 179.
- `multihopTTL`: The Time To Live (TTL) value used in BGP packets sent to the BGP peer, with a range of 1 to 255.
  The default value is 1.
- `gracefulRestartTimeSeconds`: Specifies how long the BGP peer waits for the BGP session to re-establish after a
  restart before deleting stale routes, with a range of 1 to 3600 seconds. The default value is 120 seconds.
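
For illustration, the following `bgpPeers` fragment relies on the documented defaults for one peer and overrides them
for another. The addresses, ASNs, and chosen values are examples only.

```yaml
# Fragment of a BGPPolicy spec.
bgpPeers:
  # Directly connected peer: port defaults to 179, multihopTTL to 1,
  # and gracefulRestartTimeSeconds to 120.
  - address: 192.168.77.200
    asn: 65001
  # Peer two hops away, with a longer graceful restart window.
  - address: 192.168.78.201
    asn: 65001
    port: 179
    multihopTTL: 2
    gracefulRestartTimeSeconds: 300
```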

## BGP router ID

The BGP router identifier (ID) is a 4-byte field that is usually represented as an IPv4 address.

For an IPv4-only or dual-stack Kubernetes cluster, the Node's IPv4 address (assigned to the transport interface) is used.

For IPv6-only clusters, if the `node.antrea.io/bgp-router-id` annotation is present on the Node and its value is a valid
IPv4 address string, that value is used. Otherwise, a 32-bit integer is generated by hashing the Node name and converted
to the string representation of an IPv4 address, and the `node.antrea.io/bgp-router-id` annotation is added or updated
as necessary to reflect the selected BGP router ID.
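
As a sketch, the annotation on a Node in an IPv6-only cluster might look like the fragment below. The Node name and the
router ID value are hypothetical; the value only needs to be a valid IPv4 address string.

```yaml
apiVersion: v1
kind: Node
metadata:
  name: worker-1                                 # hypothetical Node name
  annotations:
    node.antrea.io/bgp-router-id: "192.0.2.10"   # hypothetical router ID
```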

## BGP Authentication

BGP authentication ensures that BGP sessions are established and maintained only with legitimate peers. Users can provide
authentication passwords for different BGP peering sessions by storing them in a Kubernetes Secret. The Secret must
be defined in the same Namespace as Antrea (`kube-system` by default) and must be named `antrea-bgp-passwords`.

By default, this Secret is not created, and BGP authentication is considered unconfigured for all BGP peers. When the
Secret is created, as in the following example, each key must be the BGP peer's IP address and ASN joined by a hyphen
(e.g., `192.168.77.100-65000`, `2001:db8::1-65000`), and the value must be the password for that BGP peer. If a given BGP
peer does not have a corresponding key in the Secret data, then authentication is considered disabled for that peer.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: antrea-bgp-passwords
  namespace: kube-system
stringData:
  192.168.77.100-65000: "password"
  2001:db8::1-65000: "password"
type: Opaque
```
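
The keys in this Secret would line up with a BGPPolicy whose peers use the same addresses and ASNs. The fragment below
is a sketch of such a `bgpPeers` list; the peers simply mirror the example keys.

```yaml
# Fragment of a BGPPolicy spec. Each peer maps to the Secret key
# "<address>-<asn>".
bgpPeers:
  - address: 192.168.77.100    # Secret key: 192.168.77.100-65000
    asn: 65000
  - address: "2001:db8::1"     # Secret key: 2001:db8::1-65000
    asn: 65000
```

A peer listed in the BGPPolicy with no matching key would be treated as having authentication disabled.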

## Example Usage

### Combined Advertisements of Service, Pod, and Egress IPs

In this example, we will advertise Service IPs of types LoadBalancerIP and ExternalIP, along with Pod CIDRs and Egress
IPs, from the selected Nodes to multiple remote BGP peers.

```yaml
apiVersion: crd.antrea.io/v1alpha1
kind: BGPPolicy
metadata:
  name: advertise-all-ips
spec:
  nodeSelector:
    matchLabels:
      bgp: enabled
  localASN: 64512
  listenPort: 179
  advertisements:
    service:
      ipTypes: [LoadBalancerIP, ExternalIP]
    pod: {}
    egress: {}
  bgpPeers:
    - address: 192.168.77.200
      asn: 65001
      port: 179
    - address: 192.168.77.201
      asn: 65001
      port: 179
```

### Advertise Egress IPs to external BGP peers with more than one hop

In this example, we configure the BGPPolicy to advertise Egress IPs from selected Nodes to a remote BGP peer located
multiple hops away from the cluster. It's crucial to set the `multihopTTL` to a value equal to or greater than the
number of hops, allowing BGP packets to traverse multiple hops to reach the peer.

```yaml
apiVersion: crd.antrea.io/v1alpha1
kind: BGPPolicy
metadata:
  name: advertise-all-egress-ips
spec:
  nodeSelector:
    matchLabels:
      bgp: enabled
  localASN: 64512
  listenPort: 179
  advertisements:
    egress: {}
  bgpPeers:
    - address: 192.168.78.201
      asn: 65001
      port: 179
      multihopTTL: 2
```

## Limitations

- The routes received from remote BGP peers will not be installed. Therefore, you must ensure that the path from Nodes
  to the remote BGP network is properly configured and routable. This involves configuring your network infrastructure
  to handle the routing of traffic between your Kubernetes cluster and the remote BGP network.
- Only Linux Nodes are supported. The feature has not been validated on Windows Nodes, though theoretically it can work
  with Windows Nodes.
- Advanced BGP features such as BGP communities, route filtering, route reflection, confederations, and other BGP policy
  mechanisms defined in BGP RFCs are not supported.