-[Wins used to enable privileged/host network DaemonSets](#wins-used-to-enable-privilegedhost-network-daemonsets)
41
+
-[Provisioning script creates a fake host network](#provisioning-script-creates-a-fake-host-network)
38
42
-[Kubeadm manages the kubelet start / stop as a service](#kubeadm-manages-the-kubelet-start--stop-as-a-service)
39
43
-[Kubeadm makes assumptions about systemd and Linux](#kubeadm-makes-assumptions-about-systemd-and-linux)
40
44
-[Windows vs Linux host paths](#windows-vs-linux-host-paths)
@@ -93,19 +97,15 @@ The motivation for this KEP is to provide a tool to allow users to take a Window
93
97
94
98
### Goals
95
99
96
-
* Create and maintain a Powershell script to install and run Kubernetes prerequisites on Windows
97
-
* Support kubeadm join for Windows
98
-
* Support kubeadm reset for Windows
99
-
* Create and maintain a Powershell script to deploy Windows CNIs
100
+
* Create and maintain a Powershell script to install kubelet, kubeadm and [wins](https://github.com/rancher/wins/)
101
+
* Support kubeadm join, reset and upgrade
102
+
* Provide DaemonSets to run kube-proxy and Flannel
100
103
101
104
### Non-Goals
102
105
103
106
* Installing the Container Runtime (e.g. Docker or containerd)
104
-
* Implement kubeadm init for Windows (at this time)
105
-
* Implement kubeadm join --control-plane for Windows (at this time)
106
-
* Supporting upgrades using kubeadm upgrade for Windows (to be revisited for Beta)
107
-
* Running kube-proxy as a DaemonSet on Windows (to be revisited for Beta)
108
-
* Running Flannel as a DaemonSet on Windows (to be revisited for Beta)
107
+
* Implement kubeadm init
108
+
* Implement kubeadm join --control-plane
109
109
110
110
## Proposal
111
111
@@ -115,23 +115,35 @@ The motivation for this KEP is to provide a tool to allow users to take a Window
115
115
116
116
A user will have a Windows machine that they want to join to an existing Kubernetes cluster.
117
117
118
-
1. The user will download a set of required binaries (kubelet, kube-proxy, kubeadm, kubectl, Flannel) using a script. The same script will also wrap kubeadm execution.
118
+
1. The user will download a set of required binaries (kubelet, kubeadm, and wins) using a script.
119
119
120
-
2. The script will register kubelet and kube-proxy as Windows services.
120
+
1. The script will register kubelet as a Windows service.
121
121
122
-
3. The script will upload a default ConfigMap to the cluster for the default Windows KubeletConfiguration if one does not already exist.
122
+
1. The user will run "kubeadm join ..." to join the node to the cluster. In this step kubeadm will run preflight checks and proceed with the regular join process.
123
123
124
-
4. The script will run "kubeadm join ..." to join the node to the cluster. In this step kubeadm will run preflight checks and proceed with the regular join process.
124
+
1. kubeadm will restart the kubelet service using flags that kubeadm fed to the Windows service. kubeadm will proceed to bootstrap the node. After this process finishes, the node should appear with a status of NotReady.
125
125
126
-
5. kubeadm will restart the kubelet service using flags that kubeadm fed to the Windows service. kubeadm will proceed to bootstrap the node. After this process is finished the node should show with the status of NotReady.
126
+
1. The user will then deploy the Flannel and kube-proxy DaemonSets. These will run and initialize container networking.
127
127
128
-
6. The same script will then configure FlannelD. The script will (re-)register FlannelD as a service and start it with the correct configuration, optionally using parameters from kubeadm.
128
+
1. The node status should become Ready.
129
129
130
-
7. kube-proxy will do the same steps as FlannelD shortly after.
130
+
### Implementation Details/Notes/Constraints
131
131
132
-
8. The node status should be Ready.
132
+
#### Wins used to enable privileged/host network DaemonSets
133
133
134
-
### Implementation Details/Notes/Constraints
134
+
Windows has no native support for creating privileged containers or attaching them to the host network.
135
+
[Wins](https://github.com/rancher/wins/) is a project from Rancher that works around this shortcoming by exposing an API
136
+
to run processes on the host. This API can be exposed to containers via a [named pipe](https://docs.microsoft.com/en-us/windows/win32/ipc/named-pipes)
137
+
to allow those containers to launch processes that escape the restrictions normally imposed by Windows containerization.
138
+
Additional details about this approach can be found [here](https://docs.google.com/document/d/1dXLs2XR8tqueSYWxAb0OGzKqKzx1pR6l2b1JXXT8kqA/edit?usp=sharing).
139
+
140
+
#### Provisioning script creates a fake host network
141
+
142
+
In order for the Flannel and kube-proxy DaemonSets to run before CNI has been initialized, they need to be running with `hostNetwork: true` on their Pod specs. This is the established pattern on Linux for bootstrapping CNI, and we are utilizing it here as well. This is in spite of the fact that our containers will not actually need networking at all since the actual Flannel/kube-proxy process will be running outside of the container through wins.
143
+
144
+
The provisioning script creates a Docker network that is named `host` but is actually of type `NAT`. This works because the kubelet only checks for this network by name, and Docker does not support networks of type `host` on Windows.
145
+
146
+
The kubelet on Windows previously would panic when told to run a hostNetwork pod, but changes made in [#84649](https://github.com/kubernetes/kubernetes/pull/84649) allow these pods to run. As a result, we can only support kubelets from 1.17 onwards.
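
A provisioning step could guard against older kubelets with a check along these lines. This is only a sketch: the function name `supportsHostNetworkPods` is hypothetical, and it assumes version strings of the form `v1.17.0`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// supportsHostNetworkPods reports whether a kubelet version string such as
// "v1.17.2" is at least 1.17, the first release that can run hostNetwork
// pods on Windows according to the text above. The name is hypothetical.
func supportsHostNetworkPods(version string) (bool, error) {
	parts := strings.SplitN(strings.TrimPrefix(version, "v"), ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected version format: %q", version)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("invalid major version in %q: %w", version, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return false, fmt.Errorf("invalid minor version in %q: %w", version, err)
	}
	return major > 1 || (major == 1 && minor >= 17), nil
}

func main() {
	for _, v := range []string{"v1.16.4", "v1.17.0", "v1.18.1"} {
		ok, err := supportsHostNetworkPods(v)
		fmt.Println(v, ok, err)
	}
}
```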
135
147
136
148
#### Kubeadm manages the kubelet start / stop as a service
137
149
@@ -171,76 +183,53 @@ Windows related adjustments to default paths might be required.
171
183
172
184
#### Windows vs Linux host paths
173
185
174
-
Kubeadm makes a number of non-portable assumptions about paths. E.g. “/etc/kubernetes” is a hardcoded path in kubeadm.
186
+
Kubeadm makes use of several Linux-specific paths. E.g. “/etc/kubernetes” is a hardcoded path in kubeadm. We intend to use these paths on Windows as well, though we would be open to making them follow Windows path standards later.
175
187
176
-
We will use "C:\kubernetes" to hold the paths that are normally created for Linux.
177
-
178
-
We need to evaluate the kubeadm codebase for such instances of non-portable paths - CRI sockets, Cert paths, etc. Such paths need to be defaulted properly in the kubeadm configuration API.
179
-
180
-
A consolidated list of paths where kubeadm installs files to a set of paths needs to be created and updated to comply with the Windows OS model. At least a single PR against kubeadm will be required to modify the Windows defaults.
181
-
182
-
Last, as new paths are created, restrictive Access Control Lists (ACL) for Windows should be applied. Golang does not convert Posix permissions to an appropriate Windows ACL, so an additional step is needed. See [mkdirWithACL from moby/moby](https://github.com/moby/moby/blob/e4cc3adf81cc0810a416e2b8ce8eb4971e17a3a3/pkg/system/filesys_windows.go#L103)) for an example. This step will be performed by kubeadm.
188
+
Last, for the paths used, restrictive Access Control Lists (ACL) for Windows should be applied. Golang does not convert Posix permissions to an appropriate Windows ACL, so an additional step is needed. See [mkdirWithACL from moby/moby](https://github.com/moby/moby/blob/e4cc3adf81cc0810a416e2b8ce8eb4971e17a3a3/pkg/system/filesys_windows.go#L103) for an example. This step will be performed by the provisioning script.
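
To illustrate what reusing the Linux paths means on a Windows node: a path that starts with a separator is resolved relative to the root of the current drive. The sketch below makes that mapping explicit. The helper name `windowsEquivalent` and the choice of `C:` are assumptions for illustration only; kubeadm does not contain such a function.

```go
package main

import (
	"fmt"
	"strings"
)

// windowsEquivalent shows where a Linux-style absolute path ends up when it is
// resolved against the given drive on Windows. Hypothetical helper.
func windowsEquivalent(drive, linuxPath string) string {
	return drive + strings.ReplaceAll(linuxPath, "/", `\`)
}

func main() {
	fmt.Println(windowsEquivalent("C:", "/etc/kubernetes")) // prints C:\etc\kubernetes
}
```

These resulting directories are the ones that would need restrictive ACLs applied by the provisioning script.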
183
189
184
190
#### Kube-proxy deployment
185
191
186
-
On Linux, kube-proxy is deployed as a DaemonSet in the kubeadm init phase. However, kube-proxy cannot run as a container in Windows since Windows does not support privileged containers. Kube-proxy should therefore be run as a Windows service so that it is restarted by windows control manager automatically and has lifecycle control.
187
-
188
-
We need to modify the Linux kube-proxy DaemonSet to not deploy on Windows nodes. A PR is already in flight for that [76327](https://github.com/kubernetes/kubernetes/pull/76327). *Merging this PR is mandatory for this proposal*.
189
-
190
-
Running kube-proxy as a Windows service from kubeadm is out of scope for this proposal. This is due to the fact that we don’t want the changes in kubeadm to be intrusive to the existing method of running kube-proxy as a DaemonSet on Linux. This can end up requiring an abstraction layer that is far from ideal.
191
-
192
-
The proposed Windows wrapper script that executes kubeadm will also manage the restart of the kube-proxy Windows service.
193
-
194
-
Long term and ideally, kube-proxy should be run as a DaemonSet on Windows.
192
+
On Linux, kube-proxy is deployed as a DaemonSet in the kubeadm init phase. We will supply a DaemonSet that will run on Windows and use the wins API to launch kube-proxy.
195
193
196
194
#### CNI plugin deployment
197
195
198
-
On Linux, CNI plugins are deployed via kubectl and run as a DaemonSet. However, on Windows, CNI plugins need to run on the node, and cannot run in containers (again because Windows does not currently support privileged containers). Azure-CNI, win-bridge (compatible with kubenet), and Flannel all need the binary and config stored on the node.
196
+
On Linux, CNI plugins are deployed via kubectl and run as a DaemonSet. We will
197
+
supply DaemonSets that will allow users to run Flannel configured in either
198
+
L2Bridge/Host-Gateway or VXLAN/Overlay mode. These ideally can be upstreamed into the Flannel
199
+
project.
199
200
200
-
This proposal plans for FlannelD as the default option. Currently, FlannelD has to be started before the kube-proxy Windows service is started. FlannelD creates an HNS network on the Windows host, and kube-proxy will crash if it cannot find the network. This should be fixed in the scope of this project so that kube-proxy will wait until the network comes up. Therefore, kube proxy can be started at any time.
201
+
If users wish to use a different plugin they can create a DaemonSet to be applied in place of the Flannel DaemonSet.
201
202
202
-
However, if FlannelD is deployed in VXLAN (Overlay) mode, then we need to rewrite the KubeProxyConfiguration with the correct Overlay specific values, and kube-proxy will need to read this config again. This is not true for Host-Gateway (L2Bridge) mode. The script will have a flag that allows users to choose between the two networking modes.
203
+
### Risks and Mitigations
203
204
204
-
If the users wish to use a different plugin they will technically opt-out of the supported setup for kubeadm based Windows worker nodes in the Alpha release.
205
+
**Risk**: The wins proxy introduces a new attack vector
205
206
206
-
Long term, any CNI plugin should be supported for kubeadm based Windows worker nodes.
207
+
The same functionality that allows us to now run privileged DaemonSets on Windows could be used maliciously to perform
208
+
unwanted behavior on Windows nodes. This brings to Windows problems that already exist on Linux and now require the same
209
+
mitigations, namely Pod Security Policies (PSP).
207
210
208
-
### Risks and Mitigations
211
+
*Mitigation*: Access to the wins named pipe can be restricted using a PSP that either disables
212
+
`hostPath` volume mounts or restricts the paths that can be mounted. A sample PSP will be provided.
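
The path-restriction idea behind this mitigation can be sketched as a simple allowlist check. This is a simplified illustration, not the PSP admission code: the function name `hostPathAllowed` is hypothetical, and the pipe path `\\.\pipe\rancher_wins` used in the example is an assumption that this document does not state.

```go
package main

import (
	"fmt"
	"strings"
)

// hostPathAllowed reports whether path equals, or is nested under, one of the
// allowed prefixes. Matching is done on whole path elements, so "/var/log"
// allows "/var/log/pods" but not "/var/logs". Hypothetical helper.
func hostPathAllowed(path string, allowedPrefixes []string) bool {
	for _, prefix := range allowedPrefixes {
		p := strings.TrimRight(prefix, "/")
		if path == p || strings.HasPrefix(path, p+"/") {
			return true
		}
	}
	return false
}

func main() {
	allowed := []string{"/var/log"}
	// Assumed location of the wins named pipe; not confirmed by this document.
	winsPipe := `\\.\pipe\rancher_wins`

	fmt.Println(hostPathAllowed("/var/log/pods", allowed))
	fmt.Println(hostPathAllowed(winsPipe, allowed))
}
```

With an allowlist that omits the pipe, ordinary workloads cannot mount it, while the kube-proxy and Flannel DaemonSets would run under a separate policy that permits it.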
209
213
210
-
**Risk**: Versioning of the wrapper script can become complicated
214
+
**Risk**: Versioning of the script can become complicated
211
215
212
216
Versioning of the script per-Kubernetes version can become a problem if a certain new version diverges in terms of download URLs, flags and configuration.
213
217
214
218
*Mitigation*: Use git branches to version the script in the repository where it is hosted.
215
219
216
-
**Risk**: The wrapper script is planned to act as both a downloader and runner of the downloader binaries, which might cause scope and maintenance issues.
217
-
218
-
*Mitigation*: Use separate scripts, the first one downloads the binaries and the wrapper/runner script then setups the environment. The user then executes the wrapper script.
219
-
220
-
The initial plan is to give the single script method a shot with different arguments that will execute the different stages (downloading, setting up the environment, deploying the CNI).
221
-
222
-
**Risk**: Flannel or kube-proxy require special configuration that kubeadm does not handle.
223
-
224
-
*Mitigation*: Allow the user to pass custom configuration files that the wrapper script can feed into the components in question.
225
-
226
-
**Risk**: Failing or missing preflight checks on Windows
227
-
228
-
*Mitigation*: The existing kubeadm codebase already has good abstraction in this regard. Still, a PR that makes some non-intrusive adjustments in _windows.go files might be required.
229
-
230
220
**Risk**: Permissions on Windows paths that kubeadm generates can pose a security hole.
231
221
232
222
kubeadm creates directories using MakeAll() and such directories are strictly Linux originated for the time being - such as /etc.
233
223
234
224
On Windows, the creation of such a path can result in sensitive files being exposed without the right permissions.
235
225
236
-
*Mitigation*: Provide a SecureMakeAll() func in kubeadm, that ensures secure enough permissions on both Windows & Linux, and replace usage of MakeAll()
226
+
*Mitigation*: Create ACLs from the provisioning script that give similar access controls to those on Linux.
237
227
238
228
## Design Details
239
229
240
230
### Test Plan
241
231
242
-
E2e testing for kubeadm on Windows is still being planned.
243
-
One available option is to run “kubeadm join” periodically on Azure nodes and federate the reports to test-infra/testgrid.
232
+
e2e testing for kubeadm on Windows will be performed using Cluster API on AWS.
244
233
The CI signal will be owned by SIG Windows.
245
234
246
235
### Graduation Criteria
@@ -253,26 +242,25 @@ This proposal targets *Alpha* support for kubeadm based Windows worker nodes in
253
242
Kube-proxy and CNI plugins are run as Kubernetes pods.
254
243
The feature is maintained by active contributors.
255
244
The feature is tested by the community and feedback is incorporated through changes.
256
-
Kubeadm join performs complete preflight checks on the host node
257
-
E2e tests might not be complete but provide good signal.
245
+
e2e tests will be published but may not be completely green.
258
246
Documentation is in a good state. Kubeadm documentation is edited to point to documentation provided by SIG Windows.
259
247
Kubeadm upgrade is implemented.
260
248
261
249
##### Beta -> GA Graduation
262
250
263
251
The feature is well tested and adopted by the community.
264
-
E2e test provide sufficient coverage.
252
+
e2e tests are stable and consistent with other SIG-Windows CI signals.
265
253
Documentation is complete.
266
254
267
255
### Upgrade / Downgrade Strategy
268
256
269
-
Upgrades and downgrades are out of scope for this proposal for 1.16 but will be revisited in future iterations.
257
+
The provisioning script will be updated as necessary to support newer versions of kubeadm and kubelet, but ideally will
258
+
be parameterized such that it's not heavily tied to specific versions of Kubernetes. Kubeadm doesn't support downgrades,
259
+
so downgrades are out of scope for this feature.
270
260
271
261
### Version Skew Strategy
272
262
273
-
The existing version skew strategy will apply to Windows worker nodes using kubeadm.
274
-
The download scripts will not allow or recommend skewing the version of kube-proxy or the kubelet from the version of kubeadm that is installed by the user.
275
-
If the users applies manual skew by diverging from the recommended setup, the node will be claimed as unsupported.
263
+
Kubeadm's version skew policy will apply to this feature as well.
276
264
277
265
## Implementation History
278
266
@@ -285,7 +273,8 @@ If the users applies manual skew by diverging from the recommended setup, the no
285
273
* May 31, 2019 [PR 78189](https://github.com/kubernetes/kubernetes/pull/78189) Use Service Control Manager as the Windows Initsystem
286
274
* June 3, 2019 [PR 78612](https://github.com/kubernetes/kubernetes/pull/78612) Remove dependency on Kube-Proxy to start after FlannelD
287
275
* July 20, 2019 KEP was updated to target Alpha for 1.16
288
-
276
+
* November 1, 2019 [PR 84649](https://github.com/kubernetes/kubernetes/pull/84649) Skip GetPodNetworkStatus when CNI not yet initialized
277
+
* January 15, 2020 KEP was updated to reflect new approach using Wins as a privileged proxy