Commit f56f525: Update data/control plane split design (#2729)

With the upcoming agent v3 release, the design for our control plane and data plane separation needs an update. Much of this is based on discoveries from a recent PoC, as well as predicted design and implementation details that may be necessary. Added a new proposal that supersedes the old design.

design/control-data-plane-separation/design.md (+2)

# Separation of control and data plane

**Archived; Superseded by [Proposal 1508](https://github.com/nginxinc/nginx-gateway-fabric/tree/main/docs/proposals/control-data-plane-split/README.md)**

This document proposes a design for separating the control and data planes.

Issue #292: https://github.com/nginxinc/nginx-gateway-fabric/issues/292
docs/proposals/control-data-plane-split/README.md (+265)
# Proposal-1508: Separation of control and data plane

This document proposes a design for separating the control and data planes.

- Issue: https://github.com/nginxinc/nginx-gateway-fabric/issues/1508
- Status: Implementable
## Background

NGF composes its control plane and data plane containers into a single Kubernetes Pod. The control plane uses OS signals and a shared file system to configure and reload nginx. This architecture is problematic because the same RBAC policies govern both the control and data planes, and the two share CVE exposure. A compromised control plane may impact the customer's traffic, and the Kubernetes API server may be affected if the data plane is compromised. In addition to these security concerns, this architecture does not allow the control plane and data plane to scale independently.

An added benefit of separating the planes is that the control plane can provision the data plane, unlocking the ability to support multiple Gateways with a single control plane.
## Goals

- Data plane and control plane containers run in separate Pods.
- The communication channel between the control and data planes can be encrypted.
- Data planes can register with the control plane.
- The data plane can scale independently of the control plane.
- The RBAC policy for the data plane follows the principle of least privilege. The data plane should not have access to the Kubernetes API server.
- The RBAC policy for the control plane follows the principle of least privilege.
- The control plane provisions the data plane when a Gateway resource is created.
## Design

We will be using [NGINX Agent v3](https://github.com/nginx/agent/tree/v3) as our agent for the data plane. Our control plane and agent will connect to each other over a secure gRPC channel. The control plane will send nginx configuration updates over the channel using the agent's API, and the agent will write the files and reconfigure nginx.

Whenever a user creates a Gateway resource, the control plane will provision an nginx deployment and service for that Gateway. The nginx/agent deployment will register itself with the control plane, and any Gateway API configuration that a user creates for that Gateway will be sent to that nginx deployment.
### Deployment Architecture

![Deployment architecture](deployment-architecture.png)

- _Control Plane Deployment_: The control plane is a Kubernetes Deployment with one container running the NGF controller. The control plane will perform the same functions as it does today, but instead of configuring nginx by writing files to a shared volume, it will send the configuration to the agent via gRPC.
- _Control Plane Service_: Exposes the control plane via a Kubernetes Service of type `ClusterIP`. The data plane will use the DNS name of the Service to connect to the control plane.
- _Data Plane DaemonSet/Deployment_: The data plane can be deployed as either a DaemonSet or a Deployment. The data plane contains a single container running both the agent and nginx processes. The agent will download the configuration from the control plane over a streaming RPC.
- _NGINX Service_: Exposes nginx via a Kubernetes Service of type `LoadBalancer`. This is the entry point for the customer's traffic. Note that this Service should not expose any of the agent's ports.
#### Further Requirements and Implementation Details

- Both deployments should have read-only filesystems.
- Both deployments should have the minimal permissions required to perform their functions.
- The nginx deployment should be configurable via the helm chart.
  - The downside of this is that the options will apply to all nginx instances.
  - We could introduce a CRD, but where would it attach? We already have NginxProxy, which controls dynamic data plane configuration, and this may eventually attach to the Gateway instead of just the GatewayClass. Would a Deployment configuration fit in there, and would it be dynamic? That would require us to completely redeploy nginx if a user changes those settings.
  - We could start with the helm chart option and rely on user feedback to see if we need to get more granular.
  - This could also involve creating a ConfigMap that the control plane consumes on startup and that contains all nginx Deployment/DaemonSet configuration, including NGINX Plus usage configuration.
- Resources created for the nginx deployment (Service, Secrets, ConfigMap, etc.) should have configurable labels and annotations via the GatewayInfrastructure field in the Gateway resource. See [the GEP](https://gateway-api.sigs.k8s.io/geps/gep-1762/#automated-deployments).
- The control plane creates the nginx deployment and service when a Gateway resource is created, in the same namespace as the Gateway resource. When the Gateway is deleted, the control plane deletes the nginx deployment and service.
- The control plane should label the nginx service and deployment with something related to the name of the Gateway so they can easily be linked. See [the GEP](https://gateway-api.sigs.k8s.io/geps/gep-1762/#automated-deployments).
- Liveness/Readiness probes:
  - The control plane probe currently waits until we configure nginx. Going forward, this probe should only report ready once the control plane is ready to configure, in other words once the controller-runtime manager has started and returns 200 from its health endpoint (see the sketch after this list).
  - The control plane should not restart data plane pods if they are unhealthy. This can either be left in the hands of the users, or, if a liveness probe is used, Kubernetes will restart the pod.
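The following is a minimal sketch, in Go, of the probe behavior described above, using controller-runtime's healthz helpers: the control plane reports healthy and ready once the manager is serving its health endpoint, independent of whether nginx has been configured yet. The check names are illustrative, not final.

```go
// Package probes sketches the readiness/liveness behavior described above.
package probes

import (
	"sigs.k8s.io/controller-runtime/pkg/healthz"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

// setupProbes registers health and readiness checks with the controller-runtime
// manager. Readiness no longer waits for nginx to be configured; the manager
// being able to serve these checks is sufficient.
func setupProbes(mgr manager.Manager) error {
	if err := mgr.AddHealthzCheck("healthz", healthz.Ping); err != nil {
		return err
	}
	return mgr.AddReadyzCheck("readyz", healthz.Ping)
}
```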
### Agent Configuration

Using [nginx-agent.conf](https://github.com/nginx/agent/blob/v3/nginx-agent.conf), we can configure the agent on startup. Note that this example conf file may not include all available options. At a minimum, the agent will need to be configured for the following:

- the command server is the NGF ClusterIP Service
- [tls settings](#encryption) for this connection
- prometheus metrics are exposed and available on the expected port (`9113`) and path (`/metrics`)
### Connecting and Registering an Agent

The control plane and agent will communicate over gRPC. The agent will establish a gRPC connection to the control plane on start-up. The agent will gracefully retry connecting to the control plane, so the start order of the containers is not an issue. The gRPC runtime will handle connection establishment and management. If an error occurs or the stream or connection is dropped, the connection must be reestablished.
#### Further Requirements and Implementation Details

- The control plane will need to run a gRPC server for the agent to connect to.
- When an agent connects to the control plane, the payload _should_ contain the hostname (pod name) and nginx instanceID of the registering agent. This can be used to keep track of all agents/pods that are connected.
- We need to be able to link an agent connection with a subscription. These are two different gRPC calls. `Subscribe` is where we actually send an nginx config to an agent. We need to ensure that we are sending the correct nginx config to the correct agent. Ideally we use metadata from the agent connection (maybe the hostname, maybe middleware to extract a token/uuid from the gRPC connection headers) and store that in the context. The context is then used in the `Subscribe` call, where we can extract the agent identifier and send the proper nginx config for that identifier.

  Process: the agent `Connects` to NGF. We get its identifier and pod name, add the identifier(s) to a context cache, track that connection, and create a subscription channel for it. The agent then `Subscribes`. The context passed in allows us to use the identifier to grab the proper subscription channel and listen on it. This channel will receive a `ConfigApplyRequest` when we have a new nginx config to write. A sketch of this bookkeeping follows at the end of this section.

- If a single nginx deployment is scaled, we should ensure that all instances for that deployment receive the same config (and aren't treated as "different").
- Each Gateway graph that the control plane builds internally should be directly tied to an nginx deployment.
- Whenever the control plane sees an nginx instance become Ready, we send its config to it (it doesn't matter if this is a new pod or a restarted pod).
- If no nginx instances exist, the control plane should not send any configuration.
- The control plane should check that a connection exists before sending the config.
- If the control plane is scaled, then we should mark non-leaders as Unready (return a non-200 readiness probe). This will prevent nginx agents from connecting to the non-leaders (Kubernetes removes Unready Pods from the Endpoint pool), and therefore only the leader will send config and write statuses.
- We will need to ensure that the leader Pod can handle many nginx connections.

<img src="graph-conns.png" alt="Scaled connections" width=500 style="display: block; margin: 0 auto">
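To make the `Connect`/`Subscribe` linkage concrete, here is a minimal Go sketch of the connection tracking and per-agent subscription channels described above. All type names (`ConnectionTracker`, `AgentConnection`, `ConfigApplyRequest`) are hypothetical stand-ins, not the agent's generated proto types.

```go
// Package conntrack sketches how the control plane could link an agent's
// Connect call with its later Subscribe call.
package conntrack

import "sync"

// ConfigApplyRequest stands in for the agent's ConfigApplyRequest message.
type ConfigApplyRequest struct {
	FileNames []string // filenames the agent should download via the file service
}

// AgentConnection holds the identity reported when an agent connects.
type AgentConnection struct {
	PodName    string
	InstanceID string
	// Updates is the per-agent subscription channel. Subscribe listens on it;
	// the config writer sends a ConfigApplyRequest when new nginx config exists.
	Updates chan ConfigApplyRequest
}

// ConnectionTracker maps agent identifiers to their connections.
type ConnectionTracker struct {
	mu    sync.RWMutex
	conns map[string]*AgentConnection
}

func NewConnectionTracker() *ConnectionTracker {
	return &ConnectionTracker{conns: make(map[string]*AgentConnection)}
}

// Track registers an agent when it connects and creates its subscription channel.
func (t *ConnectionTracker) Track(id, podName, instanceID string) *AgentConnection {
	t.mu.Lock()
	defer t.mu.Unlock()
	conn := &AgentConnection{
		PodName:    podName,
		InstanceID: instanceID,
		Updates:    make(chan ConfigApplyRequest, 1),
	}
	t.conns[id] = conn
	return conn
}

// Get looks up the connection for the identifier extracted from the
// Subscribe call's context (e.g. via gRPC metadata).
func (t *ConnectionTracker) Get(id string) (*AgentConnection, bool) {
	t.mu.RLock()
	defer t.mu.RUnlock()
	conn, ok := t.conns[id]
	return conn, ok
}

// Remove drops the connection when the agent disconnects.
func (t *ConnectionTracker) Remove(id string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if conn, ok := t.conns[id]; ok {
		close(conn.Updates)
		delete(t.conns, id)
	}
}
```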
### Agent API

The control plane will need to implement the proper gRPC APIs in order to connect to and send configuration to the agent.

Useful references:

- [Basic mock control plane](https://github.com/nginx/agent/tree/v3/test/mock/grpc)
- [Proto definitions](https://github.com/nginx/agent/blob/v3/docs/proto/protos.md)

The gRPC services to implement in the control plane are:

`CommandService`: handles the connection and subscription to the agent. This will make a `ConfigApplyRequest` when we need to write nginx config. The request contains a `FileOverview`, which is essentially a list of the filenames that are to be sent.

`FileService`: when the agent receives a `ConfigApplyRequest`, it sends a `GetFile` request to this service to download the file contents listed in the original `FileOverview` (see the sketch below).

Some API methods will not need to be implemented. For example, `UpdateFile` is used by the agent to send a file that was updated on the agent side to the control plane. In our case, this won't happen since our control plane has full control over the config, so methods relating to this functionality can be stubbed out.
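The following minimal Go sketch illustrates the relationship between the file overview and `GetFile`: the control plane stages the generated files, advertises only their names, and serves the contents on demand. The types here are simplified, hypothetical stand-ins for the agent's generated proto messages, not the actual agent API.

```go
// Package files sketches how the control plane could stage generated nginx
// config files, advertise them in a file overview, and serve their contents.
package files

import (
	"fmt"
	"sync"
)

// File is a generated nginx config file held by the control plane.
type File struct {
	Name     string // path the agent should write, e.g. /etc/nginx/conf.d/http.conf
	Contents []byte
}

// Store holds the latest generated files for one nginx deployment.
type Store struct {
	mu    sync.RWMutex
	files map[string]File
}

func NewStore() *Store {
	return &Store{files: make(map[string]File)}
}

// Update replaces the stored files and returns the list of filenames that
// would be sent to the agent as the file overview in a ConfigApplyRequest.
func (s *Store) Update(files []File) []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.files = make(map[string]File, len(files))
	overview := make([]string, 0, len(files))
	for _, f := range files {
		s.files[f.Name] = f
		overview = append(overview, f.Name)
	}
	return overview
}

// Get serves the contents of a single file, as the file service would when
// the agent requests a name from the overview.
func (s *Store) Get(name string) ([]byte, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	f, ok := s.files[name]
	if !ok {
		return nil, fmt.Errorf("file %q not found in overview", name)
	}
	return f.Contents, nil
}
```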
### Metrics

The agent can be configured to expose metrics on a `/metrics` endpoint. Our control plane will need to configure the agent to do this so that Prometheus can scrape each agent for nginx metrics.
### NGINX Plus

#### Upstream server state

In the current implementation using NGINX Plus, when only upstream servers change, NGF writes the upstream servers in the nginx config but does not reload. It then calls the NGINX Plus API to update the servers. This process allows us to update the upstream servers using the API without having to reload nginx, while still having the upstream servers exist in the nginx config for easy debugging and consistency. However, when using the agent, any config write will result in a reload. To preserve the ability to update upstreams with the API without needing a reload, we'll have to utilize a `state` file instead of writing the servers directly in the nginx config. This way the list of servers is still available on the filesystem for debugging, but it is written by nginx when the API call is made rather than by the control plane directly.

An example of what this looks like is defined [here](https://docs.nginx.com/nginx/admin-guide/load-balancer/dynamic-configuration-api/#configuring-persistence-of-dynamic-configuration).

We will send an `UpdateHTTPUpstreams` API request to the agent, and it will make the NGINX Plus API call. The state file gets created automatically by nginx.
#### Secret duplication and synchronization

There are multiple Secrets that an NGINX Plus user can and will be creating. These include:

- JWT Secret for running NGINX Plus
- Docker Registry Secret for pulling NGINX Plus
- Client cert/key Secret for the NIM connection
- CA cert Secret for the NIM connection

With the new architecture, a user should initially create those Secrets in the nginx-gateway namespace. The control plane will then need to duplicate these Secrets into any namespace where it deploys an nginx instance, because the Secrets are mounted to the nginx deployment. The control plane should also update the duplicated Secrets if the original Secrets are ever updated, meaning it will now have to watch for Secret updates.

This process must be documented so users are aware that their Secrets are being duplicated into other namespaces.
### Encryption

The agent and control plane communication channel will be encrypted. We will store the server certificate, key pair, and CA certificate in Kubernetes Secrets. The server Secret will live in the control plane namespace, and the agent Secret will live in the same namespace where the agent is deployed. The Secrets need to exist before the control plane and data planes are deployed.

- `nginx-gateway-cert`: This Secret will contain the TLS certificate and private key that the control plane will use to serve gRPC traffic.
- `nginx-agent-cert`: This Secret will contain the CA bundle that validates the control plane’s certificate.

The Secrets will be mounted to the control plane and agent containers, respectively. If desired, we can make the Secret names and mount paths configurable via flags. For production, we will direct the user to provide their own certificates. For development and testing purposes, we will provide a self-signed default certificate. In order to be secure by default, NGF should generate the default certificates and keypair during installation using a Kubernetes Job.

Cert-manager is probably the easiest way for a user to manage certs for this. [Reflector](https://github.com/emberstack/kubernetes-reflector) is a tool that can be used to sync Secrets across namespaces, so that all agents receive certificate updates for the initial Secret created by cert-manager. Alternatively, our control plane could do this itself, since we will likely have this logic anyway for copying NGINX Plus Secrets.
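As a rough illustration of wiring the mounted `nginx-gateway-cert` Secret into the control plane's gRPC server, a Go sketch follows. The mount paths are assumptions, not final values, and certificate rotation is handled separately (see the next section).

```go
// Package server sketches serving the control plane's gRPC API over TLS using
// the certificate and key mounted from the nginx-gateway-cert Secret.
package server

import (
	"crypto/tls"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

const (
	certPath = "/var/run/secrets/ngf/tls.crt" // hypothetical mount path
	keyPath  = "/var/run/secrets/ngf/tls.key" // hypothetical mount path
)

// newGRPCServer loads the mounted key pair and returns a gRPC server that
// only accepts TLS connections.
func newGRPCServer() (*grpc.Server, error) {
	cert, err := tls.LoadX509KeyPair(certPath, keyPath)
	if err != nil {
		return nil, err
	}

	tlsConfig := &tls.Config{
		Certificates: []tls.Certificate{cert},
		MinVersion:   tls.VersionTLS13,
	}

	// credentials.NewTLS wraps the tls.Config as gRPC transport credentials.
	return grpc.NewServer(grpc.Creds(credentials.NewTLS(tlsConfig))), nil
}
```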
#### Certificate Rotation

Kubernetes automatically updates mounted Secrets when the content changes, but the control plane and agent must make sure they are using the latest certificates. We can achieve this by providing a callback in the [`tls.Config`][tls-config] for the gRPC server and client.

[tls-config]: https://pkg.go.dev/crypto/tls#Config
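A minimal sketch of the server-side callback is shown below: it reloads the mounted key pair on every handshake, so new connections pick up a rotated certificate without a restart. A real implementation would likely cache the parsed certificate and reload only when the files change, and the client side could use the analogous `GetClientCertificate` callback.

```go
// Package server sketches a tls.Config whose certificate is re-read from the
// mounted Secret files on each TLS handshake.
package server

import "crypto/tls"

// rotatingTLSConfig returns a tls.Config that reloads the certificate and key
// from disk whenever a client initiates a handshake.
func rotatingTLSConfig(certPath, keyPath string) *tls.Config {
	return &tls.Config{
		MinVersion: tls.VersionTLS13,
		GetCertificate: func(_ *tls.ClientHelloInfo) (*tls.Certificate, error) {
			cert, err := tls.LoadX509KeyPair(certPath, keyPath)
			if err != nil {
				return nil, err
			}
			return &cert, nil
		},
	}
}
```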
### Authorization

The agent will use a Kubernetes ServiceAccount token to authenticate with the control plane. The control plane will authenticate the token by sending a request to the Kubernetes [TokenReview API][token-review].

![Agent Connect](./connect.png)

On start-up, the agent will create a gRPC client and connect to the control plane server using the server address, server token, and TLS options specified in the agent’s configuration file (see [Agent Configuration](#agent-configuration)). This connection is secured by TLS; see the [Encryption](#encryption) section for more information. The control plane will validate the token with Kubernetes by sending a TokenReview API request. If the token is valid, the bidirectional streaming connection between the agent and the control plane is established and left open for the lifetime of the agent.
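A minimal sketch of the control plane's validation step using client-go's TokenReview API is shown below. The audience value is a placeholder for illustration, not a decided name.

```go
// Package auth sketches validating the ServiceAccount token presented by an
// agent via the Kubernetes TokenReview API.
package auth

import (
	"context"
	"fmt"

	authv1 "k8s.io/api/authentication/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// validateAgentToken sends a TokenReview request and returns an error if the
// token is not authenticated.
func validateAgentToken(ctx context.Context, c kubernetes.Interface, token string) error {
	review := &authv1.TokenReview{
		Spec: authv1.TokenReviewSpec{
			Token: token,
			// For bound tokens, the audience must match the one the token was
			// issued for (placeholder value shown here).
			Audiences: []string{"ngf-control-plane"},
		},
	}

	result, err := c.AuthenticationV1().TokenReviews().Create(ctx, review, metav1.CreateOptions{})
	if err != nil {
		return fmt.Errorf("token review request failed: %w", err)
	}
	if !result.Status.Authenticated {
		return fmt.Errorf("agent token is not valid: %s", result.Status.Error)
	}
	return nil
}
```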
#### Long-lived tokens vs. bound tokens

Long-lived tokens are JWT tokens for a ServiceAccount that are valid for the lifetime of the ServiceAccount. They are stored in Secrets and can be mounted to a Pod as a file or an environment variable. We can use the TokenReview API to verify the token. While long-lived tokens can still be created and used in Kubernetes, bound tokens are now the default and preferred option.

Bound ServiceAccount tokens are OpenID Connect (OIDC) identity tokens that are obtained directly from the [TokenRequest API][token-request] and are mounted into Pods using a [projected volume][projected-volume]. Bound tokens are more secure than long-lived tokens because they are time-bound, audience-bound, and object-bound.

- Time-bound: Bound tokens expire after a configurable amount of time. The default is 1 hour. The kubelet will periodically refresh the token before it expires.
- Audience-bound: Bound tokens are only valid for a specific audience. The audience is a string that identifies the intended recipient of the token.
- Object-bound: Bound tokens are bound to the Pod.

The TokenReview API only considers a bound token to be valid if the token is not expired, the audience of the token matches the audience specified in the TokenReview API request, and the Pod that the token is bound to is still present and running.
Bound tokens expire and are written to the filesystem by the kubelet. While bound tokens are more secure than long-lived tokens, the agent needs to be modified to use them. The agent would need to be able to reload the token from the filesystem periodically. That would require the following changes in the agent code:

- Add a new configuration option to specify the path to the token file. Currently, the agent supports reading the token from an environment variable or the configuration file, but not from a file.
- Modify the gRPC client to fetch the token from a file before connecting to the control plane. Currently, the token is loaded on start-up and never refreshed. If the agent reconnects to the control plane, it will use the same token provided on start-up (see the sketch after this list).
The agent team has these tasks on their roadmap as of the time of writing this design. However, as a backup plan, we can use long-lived tokens.

To create the long-lived token, we will provide the following manifest:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: nginx-agent-token-secret
  annotations:
    kubernetes.io/service-account.name: nginx-agent
type: kubernetes.io/service-account-token
```

And expose the token as an environment variable in the agent container:

```yaml
env:
  - name: AGENT_SERVER_TOKEN
    valueFrom:
      secretKeyRef:
        name: nginx-agent-token-secret
        key: token
```

The agent will load the token from the `$AGENT_SERVER_TOKEN` environment variable and add it to the `Authorization` header of the gRPC request when connecting to the control plane.
For a good comparison of long-lived and bound tokens, see [this blog post][bound-token-gke].

[token-review]: https://kubernetes.io/docs/reference/kubernetes-api/authentication-resources/token-review-v1/
[bound-token-gke]: https://cloud.google.com/blog/products/containers-kubernetes/kubernetes-bound-service-account-tokens
[token-request]: https://kubernetes.io/docs/reference/kubernetes-api/authentication-resources/token-request-v1/
[projected-volume]: https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/#bound-service-account-token-volume
## Edge Cases

The following edge cases should be considered and tested during implementation:

- The data plane fails to establish a connection with the control plane.
- Existing connections between the data plane and control plane are terminated during a download event.

In these cases, we expect the agent to be resilient. It should not crash or produce invalid config, and it should retry when possible.
## Performance

Our NFR tests will help ensure that the performance of scaling and configuration has not degraded. We may also want to enhance these tests to include scaling nginx deployments.
## Open Questions

- nginx readiness/liveness probes: can the agent expose any type of health endpoint?