Commit defbc75

Address the comments
1 parent d17ec21 commit defbc75

1 file changed, +53 -56 lines changed
keps/sig-windows/4802-windows-node-shutdown/README.md

Lines changed: 53 additions & 56 deletions
@@ -293,7 +293,7 @@ n/a
 This section must be completed when targeting alpha to a release.
 -->

-###### How can this feature be enabled / disabled in a live cluster?
+* **How can this feature be enabled / disabled in a live cluster?**

 - [X] Feature gate (also fill in values in `kep.yaml`)
 - Feature gate name: `WindowsGracefulNodeShutdown`
@@ -302,58 +302,55 @@ This section must be completed when targeting alpha to a release.
 - Describe the mechanism:
 - Will enabling / disabling the feature require downtime of the control
 plane?
-No
+- No
 - Will enabling / disabling the feature require downtime or reprovisioning
 of a node?
-yes (will require restart of kubelet)
+- yes (will require restart of kubelet)

-###### Does enabling the feature change any default behavior?
+* **Does enabling the feature change any default behavior?**

-The main behavior change is that during a node shutdown, pods running on the
+* The main behavior change is that during a node shutdown, pods running on the
 node will be terminated gracefully.

-###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
+* **Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?**

-Yes, the feature can be disabled by either disabling the feature gate, or
+* Yes, the feature can be disabled by either disabling the feature gate, or
 setting `kubeletConfig.ShutdownGracePeriod` to 0 seconds.

-###### What happens if we reenable the feature if it was previously rolled back?
+* **What happens if we reenable the feature if it was previously rolled back?**

-Kubelet will attempt to perform graceful termination of pods during a
-node shutdown.
+* Kubelet will attempt to perform graceful termination of pods during a
+node shutdown.

-###### Are there any tests for feature enablement/disablement?
+* **Are there any tests for feature enablement/disablement?**

-The e2e framework does not currently support enabling or disabling feature
-gates.
-We have e2e tests to cover the feature when it is enabled and some predefined
-setting.
-Will add node level integration tests when the node level test framework is available for Windows node
+* The e2e framework does not currently support enabling or disabling feature
+gates. We have e2e tests to cover the feature when it is enabled and some predefined
+setting. Will add node level integration tests when the node level test framework is
+available for Windows node

 ### Rollout, Upgrade and Rollback Planning

 <!--
 This section must be completed when targeting beta to a release.
 -->

-###### How can a rollout or rollback fail? Can it impact already running workloads?
+* **How can a rollout or rollback fail? Can it impact already running workloads?**

-It wil not impact running workloads during rollout/rollback.
+* It wil not impact running workloads during rollout/rollback.

-###### What specific metrics should inform a rollback?
+* **What specific metrics should inform a rollback?**

-n/a
-
-The failure of the roll out will behave like disbling this feature, operators can check the kubelet log to get more specific info.
+* The failure of the roll out will behave like disbling this feature, operators can check the kubelet log to get more specific info.
 ex: `The windows node graceful shutdown has not been enabled, the reasons are xxx`

-###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
+* **Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?**

-This is basically how all features work so upgrade and downgrade apply as normal.
+* The feature is part of kubelet config so updating kubelet config should enable/disable the feature; upgrade/downgrade is N/A

-###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
+* **Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?**

-No
+* No

 ### Monitoring Requirements
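
Reviewer note: the enablement and rollback answers in this hunk refer to the `WindowsGracefulNodeShutdown` feature gate and to `kubeletConfig.ShutdownGracePeriod`. A minimal sketch of a kubelet configuration file that turns the feature on might look like the following. The durations are illustrative assumptions, not values taken from the KEP, and the field names are the standard `KubeletConfiguration` ones rather than anything this commit defines.

```yaml
# Hypothetical kubelet configuration for a Windows node; values are illustrative.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  # Feature gate named in the KEP.
  WindowsGracefulNodeShutdown: true
# Total time the node delays shutdown so pods can terminate gracefully.
# Per the rollback answer in the KEP, setting this to 0s disables the behavior.
shutdownGracePeriod: 30s
# Portion of shutdownGracePeriod reserved for critical pods (assumed value).
shutdownGracePeriodCriticalPods: 10s
```

As the KEP answer above states, changing either setting requires a kubelet restart on the node to take effect.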

@@ -364,11 +361,11 @@ For GA, this section is required: approvers should be able to confirm the
 previous answers based on experience in the field.
 -->

-###### How can an operator determine if the feature is in use by workloads?
+* **How can an operator determine if the feature is in use by workloads?**

-Check if the feature gate and kubelet config settings are enabled on a node.
+* Check if the feature gate and kubelet config settings are enabled on a node.

-###### How can someone using this feature know that it is working for their instance?
+* **How can someone using this feature know that it is working for their instance?**

 - [ ] Events
 - Event Reason:
@@ -378,11 +375,11 @@ Check if the feature gate and kubelet config settings are enabled on a node.
 - [X] Other (treat as last resort)
 - Details: Pod.Status.Message, Pod.Status.Reason

-###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
+* **What are the reasonable SLOs (Service Level Objectives) for the enhancement?**

-n/a
+* n/a

-###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
+* **What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?**

 <!--
 Pick one more of these and delete the rest.
@@ -395,19 +392,19 @@ Pick one more of these and delete the rest.
 - [X] Other (treat as last resort)
 - Details: The operator can get the service health information from the logs

-###### Are there any missing metrics that would be useful to have to improve observability of this feature?
+* **Are there any missing metrics that would be useful to have to improve observability of this feature?**

-n/a
+* n/a

 ### Dependencies

 <!--
 This section must be completed when targeting beta to a release.
 -->

-###### Does this feature depend on any specific services running in the cluster?
+* **Does this feature depend on any specific services running in the cluster?**

-No, this feature doesn't depend on any specific services running the cluster.
+* No, this feature doesn't depend on any specific services running the cluster.

 ### Scalability

@@ -421,33 +418,33 @@ For GA, this section is required: approvers should be able to confirm the
 previous answers based on experience in the field.
 -->

-###### Will enabling / using this feature result in any new API calls?
+* **Will enabling / using this feature result in any new API calls?**

-No
+* No

-###### Will enabling / using this feature result in introducing new API types?
+* **Will enabling / using this feature result in introducing new API types?**

-No
+* No

-###### Will enabling / using this feature result in any new calls to the cloud provider?
+* **Will enabling / using this feature result in any new calls to the cloud provider?**

-No
+* No

-###### Will enabling / using this feature result in increasing size or count of the existing API objects?
+* **Will enabling / using this feature result in increasing size or count of the existing API objects?**

-No
+* No

-###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
+* **Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?**

-No
+* No

-###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
+* **Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?**

-No
+* No

-###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
+* **Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?**

-No
+* No

 ### Troubleshooting

@@ -462,17 +459,17 @@ splitting it into a dedicated `Playbook` document (potentially with some monitor
 details). For now, we leave it here.
 -->

-###### How does this feature react if the API server and/or etcd is unavailable?
+* **How does this feature react if the API server and/or etcd is unavailable?**

-The feature does not depend on the API server / etcd.
+* The feature does not depend on the API server / etcd.

-###### What are other known failure modes?
+* **What are other known failure modes?**

-n/a
+* n/a

-###### What steps should be taken if SLOs are not being met to determine the problem?
+* **What steps should be taken if SLOs are not being met to determine the problem?**

-n/a
+* n/a

 ## Implementation History
