@@ -1161,123 +1161,198 @@ implementing this enhancement to ensure the enhancements have also solid foundat
11611161#### Unit Tests
11621162
11631163Unit tests will cover the sanity of code changes that implements the feature,
1164- and the policy controls that are introduced as part of this feature.
1164+ and the policy controls that are introduced as part of this feature. This is
1165+ not exhaustive, but a few specifics are covered below:
1166+
1167+ ##### Allocation Manager
1168+ Tests: https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/pkg/kubelet/allocation/allocation_manager_test.go
1169+
1170+ The allocation manager is responsible for determining whether a resize can be allocated.
1171+ Unit tests cover this logic, including:
1172+ - Resizes with unsupported features such as static CPU/memory manager policies or swap are marked infeasible.
1173+ - Resizes for which the node does not currently have room are marked as deferred.
1174+ - Deferred resizes are retried according to the desired priority.
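As a rough illustration of the outcomes described above, the following Python sketch classifies a single resize request. It is not the kubelet's Go implementation: the names `ResizeRequest` and `classify_resize`, and the single-resource (CPU millicores) model, are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass
class ResizeRequest:
    pod: str
    desired_cpu_m: int  # requested CPU after the resize, in millicores
    # Hypothetical flag standing in for "unsupported feature" checks
    # (for example swap or a static resource manager policy).
    uses_unsupported_feature: bool = False


def classify_resize(req, node_allocatable_m, allocated_to_others_m):
    """Return "infeasible", "deferred", or "allocated" for a resize request."""
    # Unsupported features, or a request larger than the node itself,
    # can never succeed on this node.
    if req.uses_unsupported_feature or req.desired_cpu_m > node_allocatable_m:
        return "infeasible"
    # The request could fit on this node, but not next to what other
    # pods currently have allocated.
    if allocated_to_others_m + req.desired_cpu_m > node_allocatable_m:
        return "deferred"
    return "allocated"
```

A deferred result is the only one worth retrying later, since it depends on what other pods hold at that moment rather than on the request itself.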
1175+
1176+ ##### Kuberuntime Manager
1177+ Tests:
1178+ - https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/pkg/kubelet/kuberuntime/kuberuntime_manager_test.go#L3048
1179+ - https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/pkg/kubelet/kuberuntime/kuberuntime_manager_test.go#L2320
1180+ - https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/pkg/kubelet/kuberuntime/kuberuntime_manager_test.go#L3290
1181+ - https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/pkg/kubelet/kuberuntime/kuberuntime_manager_test.go#L3668
1182+
1183+ The kuberuntime manager is responsible for actuating a resize after it has been allocated.
1184+ Unit tests cover this logic, including:
1185+ - Validation of the resize, e.g. that memory limits cannot be resized below the current usage
1186+ - The logic for determining whether a pod resize is in progress (and that the corresponding pod condition gets added)
1187+ - Computation of what resize actions need to be performed
1188+ - The mock container manager has the expected cgroup values post-resize.
1189+
1190+ ##### CRI unit tests
11651191
11661192CRI unit tests are updated to reflect use of ContainerResources object in
11671193UpdateContainerResources and ContainerStatus APIs.
11681194
11691195#### Integration tests
11701196
1171- Comprehensive E2E tests provide good coverage for alpha. We may replicate and/or move
1172- some of the E2E tests functionality into integration tests before Beta using data from
1173- any issues we uncover that are not covered by planned and implemented tests.
1197+ Comprehensive E2E tests provide good coverage. The following integration tests are also
1198+ added for additional coverage:
1199+ - https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/test/integration/pods/pods_test.go#L852
1200+ - https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/test/integration/scheduler/queueing/queue.go#L287
11741201
11751202#### Pod Resize E2E Tests
11761203
1204+ ##### How the tests perform verification
1205+
11771206End-to-End tests resize a Pod via PATCH to Pod's Spec.Containers[ i] .Resources.
11781207The e2e tests use docker as container runtime.
11791208 - Resizing of Requests are verified by querying the values in Pod's
11801209 Status.ContainerStatuses[ i] .AllocatedResources field.
11811210 - Resizing of Limits are verified by querying the cgroup limits of the Pod's
11821211 containers.
1212+ - Pending resizes have the corresponding condition set in the Pod Status.
1213+ Completed resizes have their resize status cleared.
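For orientation, the PATCH described above could carry a body shaped like the one built below. This Python sketch only constructs and prints the JSON document; it makes no API call, and the container name `c1` and the resource values are hypothetical.

```python
import json


def build_resize_patch(container_name, cpu, memory):
    """Build a patch body that sets equal requests and limits for one container."""
    resources = {
        "requests": {"cpu": cpu, "memory": memory},
        "limits": {"cpu": cpu, "memory": memory},
    }
    # Containers are identified by name inside spec.containers.
    return {"spec": {"containers": [{"name": container_name, "resources": resources}]}}


patch = build_resize_patch("c1", "200m", "256Mi")
print(json.dumps(patch, indent=2))
```

Setting requests equal to limits keeps the example in line with the Guaranteed Pod cases; a Burstable case would pass different values for each.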
1214+
1215+ ##### Success test cases for Guaranteed Pods with one container
1216+
1217+ Tests: https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/test/e2e/common/node/pod_resize.go#L116-L127
11831218
1184- E2E test cases for Guaranteed class Pod with one container:
1219+ For these tests, all pods had a restartable initContainer attached.
1220+
1221+ Resize operations performed:
118512221 . Increase, decrease Requests & Limits for CPU only.
118612231 . Increase, decrease Requests & Limits for memory only.
1187- 1 . Increase, decrease Requests & Limits for CPU and memory.
1188- 1 . Increase CPU and decrease memory.
1189- 1 . Decrease CPU and increase memory.
1190- 1 . Add memory request & limit for CPU only container.
1191- 1 . Remove memory request & limit for CPU & memory container.
1192-
1193- E2E test cases for Burstable class single container Pod that specifies
1194- both CPU & memory:
1195- 1 . Increase, decrease Requests - CPU only.
1196- 1 . Increase, decrease Requests - memory only.
1197- 1 . Increase, decrease Requests - both CPU & memory.
1198- 1 . Increase, decrease Limits - CPU only.
1199- 1 . Increase, decrease Limits - memory only.
1200- 1 . Increase, decrease Limits - both CPU & memory.
1201- 1 . Increase, decrease Requests & Limits - CPU only.
1202- 1 . Increase, decrease Requests & Limits - memory only.
1203- 1 . Increase, decrease Requests & Limits - both CPU and memory.
1204- 1 . Increase CPU (Requests+Limits) & decrease memory(Requests+Limits).
1205- 1 . Decrease CPU (Requests+Limits) & increase memory(Requests+Limits).
1206- 1 . Increase CPU Requests while decreasing CPU Limits.
1207- 1 . Decrease CPU Requests while increasing CPU Limits.
1208- 1 . Increase memory Requests while decreasing memory Limits.
1209- 1 . Decrease memory Requests while increasing memory Limits.
1210- 1 . CPU: increase Requests, decrease Limits, Memory: increase Requests, decrease Limits.
1211- 1 . CPU: decrease Requests, increase Limits, Memory: decrease Requests, increase Limits.
1212- 1 . Set requests == limits, ensure QOS class remains Burstable
1213-
1214- E2E tests for Burstable class single container Pod that specifies CPU only:
1215- 1 . Increase, decrease CPU - Requests only.
1216- 1 . Increase, decrease CPU - Limits only.
1217- 1 . Increase, decrease CPU - both Requests & Limits.
1218-
1219- E2E tests for Burstable class single container Pod that specifies memory only:
1220- 1 . Increase, decrease memory - Requests only.
1221- 1 . Increase, decrease memory - Limits only.
1222- 1 . Increase, decrease memory - both Requests & Limits.
1223-
1224- E2E tests for BestEffort class single container Pod:
1225- 1 . Add CPU requests & limits, QOS class remains BestEffort
1226- 2 . Add Memory requests & limits, QOS class remains BestEffort
1227-
1228- E2E tests for Guaranteed class Pod with three containers (c1, c2, c3):
1229- 1 . Increase CPU & memory for all three containers.
1230- 1 . Decrease CPU & memory for all three containers.
1231- 1 . Increase CPU, decrease memory for all three containers.
1232- 1 . Decrease CPU, increase memory for all three containers.
1233- 1 . Increase CPU for c1, decrease c2, c3 unchanged - no net CPU change.
1234- 1 . Increase memory for c1, decrease c2, c3 unchanged - no net memory change.
1235- 1 . Increase CPU for c1, decrease c2 & c3 - net CPU decrease for Pod.
1236- 1 . Increase memory for c1, decrease c2 & c3 - net memory decrease for Pod.
1237- 1 . Increase CPU for c1 & c3, decrease c2 - net CPU increase for Pod.
1238- 1 . Increase memory for c1 & c3, decrease c2 - net memory increase for Pod.
1239-
1240- E2E tests for sidecar containers
1241- 1 . InitContainer, then sidecar - can increase & decrease CPU & memory of sidecar
1242- 2 . Sidecar then InitContainer - can increase & decrease CPU & memory of sidecar
1243- 3 . Resize sidecar along with container
1244-
1245- #### CRI E2E Tests
1246-
1247- 1 . E2E test is added to verify UpdateContainerResources API with containerd runtime.
1248- 1 . E2E test is added to verify ContainerStatus API using containerd runtime.
1249- 1 . E2E test is added to verify backward compatibility using containerd runtime.
1250-
1251- #### Resource Quota and Limit Ranges
1252-
1253- Setup a namespace with ResourceQuota and a single, valid Pod.
1254- 1 . Resize the Pod within resource quota - CPU only.
1255- 1 . Resize the Pod within resource quota - memory only.
1256- 1 . Resize the Pod within resource quota - both CPU and memory.
1257- 1 . Resize the Pod to exceed resource quota - CPU only.
1258- 1 . Resize the Pod to exceed resource quota - memory only.
1259- 1 . Resize the Pod to exceed resource quota - both CPU and memory.
1260-
1261- Setup a namespace with min and max LimitRange and create a single, valid Pod.
1262- 1 . Increase, decrease CPU within min/max bounds.
1263- 1 . Increase CPU to exceed max value.
1264- 1 . Decrease CPU to go below min value.
1265- 1 . Increase memory to exceed max value.
1266- 1 . Decrease memory to go below min value.
1267-
1268- #### Resize Policy Tests
1269-
1270- Setup a guaranteed class Pod with two containers (c1 & c2).
1271- 1 . No resize policy specified, defaults to NotRequired. Verify that CPU and
1272- memory are resized without restarting containers.
1273- 1 . NotRequired (cpu, memory) policy for c1, RestartContainer (cpu, memory) for c2.
1274- Verify that c1 is resized without restart, c2 is restarted on resize.
1275- 1 . NotRequired cpu, RestartContainer memory policy for c1. Resize c1 CPU only,
1276- verify container is resized without restart.
1277- 1 . NotRequired cpu, RestartContainer memory policy for c1. Resize c1 memory only,
1278- verify container is resized with restart.
1279- 1 . NotRequired cpu, RestartContainer memory policy for c1. Resize c1 CPU & memory,
1280- verify container is resized with restart.
1224+ 1 . Increase, decrease Requests & Limits for CPU and memory in the same direction.
1225+ 1 . Increase, decrease Requests & Limits for CPU and memory in opposite directions.
1226+
1227+ The following cases are tested against all the above resize operations:
1228+ 1 . No restart policy; no resize of init container.
1229+ 1 . No restart policy + resize of init container.
1230+ 1 . Memory restart policy; no resize of init container.
1231+ 1 . CPU restart policy; no resize of init container.
1232+ 1 . CPU + Memory restart policy; no resize of init container.
1233+ 1 . CPU + Memory restart policy + resize of init container.
1234+
1235+ ##### Success test cases for Guaranteed Pods with multiple containers
1236+
1237+ Tests: https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/test/e2e/common/node/pod_resize.go#L130
1238+
1239+ 1 . 3 containers - increase cpu & mem on c1, c2, decrease cpu & mem on c3 - net increase
1240+ 1 . 3 containers - increase cpu & mem on c1, decrease cpu & mem on c2, c3 - net decrease
1241+ 1 . 3 containers - increase: CPU (c1,c3), memory (c2, c3) ; decrease: CPU (c2)
1242+
1243+ ##### Success test cases for Burstable Pods with one container
1244+
1245+ Tests: https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/test/e2e/common/node/pod_resize.go#L208-L220
1246+
1247+ For these tests, there were no initContainers (since that is covered by the Guaranteed Pods cases).
1248+
1249+ Resize operations performed:
1250+ 1 . Increase, decrease CPU Requests
1251+ 1 . Increase, decrease CPU Limits
1252+ 1 . Increase, decrease memory Requests
1253+ 1 . Increase, decrease memory Limits
1254+ 1 . Increase, decrease CPU & memory Requests and Limits in the same direction
1255+ 1 . Increase, decrease CPU and memory in opposite directions
1256+ 1 . Increase, decrease Requests & Limits in opposite directions
1257+
1258+ The following cases are tested against all the above resize operations:
1259+ 1 . No restart policy
1260+ 1 . Memory restart policy
1261+ 1 . CPU restart policy
1262+ 1 . CPU + Memory restart policy
1263+
1264+ ##### Other success test cases for Burstable Pods
1265+
1266+ Tests: https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/test/e2e/common/node/pod_resize.go#L228
1267+
1268+ 1 . 6 containers - various operations performed (including adding limits and requests)
1269+ 1 . Resizing with equivalents (e.g. 2m -> 1m)
1270+
1271+ ##### Memory limit decrease
1272+
1273+ Test: https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/test/e2e/common/node/pod_resize.go#L548
1274+
1275+ This test verifies that memory limits can be decreased, but not below the current usage.
1276+
1277+ ##### Patch error tests
1278+
1279+ Tests: https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/test/e2e/common/node/pod_resize.go#L307
1280+
1281+ These tests verify that the following attempts to patch a pod for resize are rejected by the API server:
1282+ 1 . Best Effort pod - request memory
1283+ 1 . Best Effort pod - request CPU
1284+ 1 . Guaranteed pod - remove cpu & memory limits
1285+ 1 . Burstable pod - remove cpu & memory limits + increase requests
1286+ 1 . Burstable pod - remove memory requests
1287+ 1 . Burstable pod - remove cpu requests
1288+ 1 . Burstable pod - reorder containers
1289+ 1 . Guaranteed pod - rename containers
1290+ 1 . Burstable pod - set requests == limits
1291+ 1 . Burstable pod - resize ephemeral storage
1292+ 1 . Burstable pod - nonrestartable initContainer
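Several of the rejected patches above (adding requests to a Best Effort pod, removing limits from a Guaranteed pod, setting requests == limits on a Burstable pod) would change the pod's QoS class. Assuming those rejections follow from a rule that a resize must not change the QoS class, the check can be sketched in Python as below. The functions `qos_class` and `resize_keeps_qos` are hypothetical, and quantities are compared as plain strings, which a real implementation would not do.

```python
def qos_class(containers):
    """Compute a simplified QoS class from a list of container resource dicts.

    Each container looks like {"requests": {...}, "limits": {...}},
    keyed by "cpu" and "memory".
    """
    any_set = False
    guaranteed = True
    for c in containers:
        requests = c.get("requests", {})
        limits = c.get("limits", {})
        if requests or limits:
            any_set = True
        for res in ("cpu", "memory"):
            limit = limits.get(res)
            # An unset request defaults to the limit.
            request = requests.get(res, limit)
            # Simplification: quantities are compared as literal strings.
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


def resize_keeps_qos(old_containers, new_containers):
    """Under the assumed rule, a resize is acceptable only if QoS is unchanged."""
    return qos_class(old_containers) == qos_class(new_containers)
```

The remaining rejections in the list (reordering or renaming containers, resizing ephemeral storage, non-restartable init containers) are unrelated to QoS and are not modelled here.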
1293+
1294+ ##### Scheduler logic tests
1295+
1296+ Tests: https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/test/e2e/node/pod_resize.go#L494
1297+
1298+ These tests cover the scheduler logic with respect to in-place pod resize and the deferred / infeasible
1299+ conditions. The flow of this test is:
1300+ 1 . Create pod1 and pod2 on node such that pod1 has enough CPU to be scheduled, but pod2 does not.
1301+ 1 . Resize pod2 down so that it fits on the node and can be scheduled.
1302+ 1 . Verify that pod2 gets scheduled and comes up and running.
1303+ 1 . Create pod3 that requests more CPU than available, verify that it is pending.
1304+ 1 . Resize pod1 down so that pod3 gets room to be scheduled.
1305+ 1 . Verify that pod3 is scheduled and running.
1306+ 1 . Attempt to scale up pod1 to request more CPU than available, verify the resize is deferred.
1307+ 1 . Delete pod2 + pod3 to make room for pod1.
1308+ 1 . Verify that pod1 resize has completed.
1309+ 1 . Attempt to scale up pod1 to request more cpu than the node has, verify the resize is infeasible.
1310+
1311+ ##### Retry of deferred resizes
1312+
1313+ Tests: https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/test/e2e/node/pod_resize.go#L690
1314+
1315+ These tests cover the logic for retrying deferred resizes in the following cases:
1316+ 1 . Deferred resizes succeed after the scale down of another pod. (Deletion case is covered in the previous tests).
1317+ 1 . Deferred resizes are attempted according to the desired priority.
1318+ 1 . Place 4 pods on the node; delete the first one and verify the chain reaction of deferred resizes succeeding. The
1319+ resources are carefully chosen such that
1320+ - deletion of pod1 should make room for pod2's resize (but not pod3 or pod4).
1321+ - pod2's resize should make room for pod3's resize (but not pod4).
1322+ - pod3's resize should make room for pod4's resize.
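The chain reaction in the last case can be worked through with concrete numbers. The Python simulation below is a sketch under invented assumptions: the node capacity, the per-pod values, and the helper names `fits`, `retry_deferred`, and `build_scenario` are all hypothetical and are not taken from the e2e test.

```python
NODE = {"cpu": 1000, "memory": 1000}  # hypothetical capacity: millicores, MiB


def fits(allocated, pod, desired):
    """True if `desired` for `pod` fits next to every other pod's allocation."""
    for res, capacity in NODE.items():
        others = sum(a[res] for name, a in allocated.items() if name != pod)
        if others + desired[res] > capacity:
            return False
    return True


def retry_deferred(allocated, pending):
    """Retry pending resizes in list order until no further progress is made.

    Returns the pod names in the order their resizes were allocated.
    """
    completed = []
    progress = True
    while progress:
        progress = False
        for pod, desired in pending:
            if pod not in completed and fits(allocated, pod, desired):
                allocated[pod] = desired
                completed.append(pod)
                progress = True
    return completed


def build_scenario():
    """Four pods that fill the node exactly, plus three pending resizes."""
    allocated = {
        "pod1": {"cpu": 300, "memory": 100},
        "pod2": {"cpu": 200, "memory": 400},
        "pod3": {"cpu": 400, "memory": 200},
        "pod4": {"cpu": 100, "memory": 300},
    }
    pending = [
        ("pod2", {"cpu": 400, "memory": 200}),  # needs CPU, frees memory
        ("pod3", {"cpu": 100, "memory": 400}),  # needs memory, frees CPU
        ("pod4", {"cpu": 450, "memory": 300}),  # needs CPU
    ]
    return allocated, pending


allocated, pending = build_scenario()
print(retry_deferred(allocated, pending))  # node is full, nothing can proceed
del allocated["pod1"]                      # deleting pod1 starts the chain
print(retry_deferred(allocated, pending))
```

With these numbers, deleting pod1 frees enough CPU for pod2 only; pod2's resize frees the memory pod3 needs, and pod3's resize frees the CPU pod4 needs.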
1323+
1324+ ##### Resource Quota tests
1325+
1326+ Tests: https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/test/e2e/node/pod_resize.go#L47
1327+
1328+ 1 . Exceed max CPU
1329+ 1 . Exceed max memory
1330+ 1 . Exceed max CPU and memory
1331+ 1 . Valid increase of CPU
1332+ 1 . Valid increase of memory
1333+ 1 . Valid increase of CPU and memory
1334+
1335+ ##### Limit Ranger tests
1336+
1337+ Tests: https://github.com/kubernetes/kubernetes/blob/ad82c3d39f5e9f21e173ffeb8aa57953a0da4601/test/e2e/node/pod_resize.go#L218
1338+
1339+ 1 . Exceed max CPU
1340+ 1 . Exceed max memory
1341+ 1 . Exceed max CPU and memory
1342+ 1 . Valid increase of CPU
1343+ 1 . Valid increase of memory
1344+ 1 . Valid increase of CPU and memory
1345+ 1 . Go below min CPU
1346+ 1 . Go below min memory
1347+ 1 . Go below min CPU and memory
1348+ 1 . Valid decrease of CPU
1349+ 1 . Valid decrease of memory
1350+ 1 . Valid decrease of CPU and memory
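The min/max cases above reduce to a bounds check per resource. The Python sketch below shows that check under simplifying assumptions: amounts are plain integers (CPU in millicores, memory in MiB), and `limit_range_violations` is a hypothetical helper, not the admission plugin's logic.

```python
def limit_range_violations(resources, minimum, maximum):
    """Return a sorted list of bound violations for one container.

    All arguments map a resource name to an integer amount.
    """
    violations = []
    for res, value in resources.items():
        if res in minimum and value < minimum[res]:
            violations.append(f"{res} below min")
        if res in maximum and value > maximum[res]:
            violations.append(f"{res} above max")
    return sorted(violations)


bounds_min = {"cpu": 100, "memory": 64}
bounds_max = {"cpu": 500, "memory": 512}
print(limit_range_violations({"cpu": 600, "memory": 32}, bounds_min, bounds_max))
```

An empty result corresponds to the "valid increase" and "valid decrease" cases; any non-empty result corresponds to a resize that should be rejected.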
1351+
1352+ ##### Coverage of the READ and REPLACE endpoints
1353+
1354+ The previous tests are planned to use the PATCH endpoint, but we also need coverage of READ and REPLACE endpoints.
1355+ A basic test will be added that uses REPLACE to perform a resize, and the READ endpoint to verify the result.
12811356
12821357#### Backward Compatibility and Negative Tests
12831358
@@ -1292,7 +1367,6 @@ Setup a guaranteed class Pod with two containers (c1 & c2).
12921367 values of AllocatedResources and ResizePolicy fields being dropped.
129313681 . Verify that only CPU and memory resources are mutable by user.
12941369
1295- TODO: Identify more cases
12961370
12971371### Graduation Criteria
12981372
@@ -1670,12 +1744,14 @@ _This section must be completed when targeting beta graduation to a release._
16701744 - Add instrumentation section
16711745 - Priority of resize requests
16721746- 2025-09-22 - Correct KEP details to match actual implementation
1673- - revert PreferNoRestart resize policy back to NotRequired
1674- - add more details about the resize status
1675- - document kubelet-triggered eviction for critical pods
1676- - update outdated notes regarding static CPU
1677- - correct details about instrumentation
1747+ - revert PreferNoRestart resize policy back to NotRequired
1748+ - add more details about the resize status
1749+ - document kubelet-triggered eviction for critical pods
1750+ - update outdated notes regarding static CPU
1751+ - correct details about instrumentation
16781752- 2025-09-22 - Update in-place pod resize for GA
1753+ - Update test plan
1754+
16791755
16801756## Drawbacks
16811757