Build: (84e3a01) Merge pull request #266 from tiraboschi/add_HCOMisco…
…nfiguredDescheduler

Add HCOMisconfiguredDescheduler runbook
sradco committed Sep 19, 2024
1 parent dce3234 commit 0af934a
Showing 2 changed files with 121 additions and 47 deletions.
73 changes: 73 additions & 0 deletions runbooks/HCOMisconfiguredDescheduler.md
@@ -0,0 +1,73 @@
# HCOMisconfiguredDescheduler

## Meaning

A descheduler is a Kubernetes application that evicts pods so that the control
plane can reschedule the workloads in a better way.

The descheduler uses the Kubernetes eviction API to evict pods, and receives
feedback from `kube-api` on whether each eviction request was granted.
KubeVirt, on the other hand, handles eviction requests in a custom way in order
to keep the VM alive and trigger a live migration, and a live migration
takes time.
From the descheduler's point of view, `virt-launcher` pods therefore fail to be
evicted, while they are actually migrating to another node in the background.
As a result, the way KubeVirt handles eviction requests causes the descheduler
to make wrong decisions and take wrong actions that could destabilize the
cluster.
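
To observe this from the cluster side, the following sketch lists the
`virt-launcher` pods next to the live migration objects. It assumes `kubectl`
access to the cluster, that `virt-launcher` pods carry the
`kubevirt.io=virt-launcher` label, and that KubeVirt's
`VirtualMachineInstanceMigration` resources are installed; the function name
is hypothetical.

```bash
# Hypothetical helper, not part of KubeVirt or the descheduler.
# Lists virt-launcher pods and live migration objects so that an eviction
# reported as failed can be compared against an actual migration.
show_vm_migrations() {
  echo "== virt-launcher pods =="
  kubectl get pods --all-namespaces -l kubevirt.io=virt-launcher -o wide
  echo "== live migrations =="
  kubectl get virtualmachineinstancemigrations.kubevirt.io --all-namespaces
}
```

Running `show_vm_migrations` while the descheduler is active may show a
migration in progress for a pod whose eviction was not granted.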


To correctly handle the special case in which evicting a VM pod triggers a
live migration to another node, the `Kube Descheduler Operator` introduced
a `profileCustomizations` option named `devEnableEvictionsInBackground`,
which is currently considered an `alpha` feature
on the `Kube Descheduler Operator` side.

## Impact

Using the descheduler operator for KubeVirt VMs without the
`devEnableEvictionsInBackground` profile customization can lead to
unstable or oscillatory behavior, undermining cluster stability.

## Diagnosis

1. Check the CR for `Kube Descheduler Operator`:

```bash
$ kubectl get -n openshift-kube-descheduler-operator KubeDescheduler cluster -o yaml
```

Look for the following setting:

```yaml
spec:
  profileCustomizations:
    devEnableEvictionsInBackground: true
```

If this setting is not there, the `Kube Descheduler Operator` is not correctly
configured to work alongside KubeVirt.
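
The check above can also be scripted. The following is a minimal sketch,
assuming `kubectl` is logged in to the cluster and the CR is named `cluster`
in the `openshift-kube-descheduler-operator` namespace, as in the command
above; the function name is hypothetical.

```bash
# Hypothetical helper: prints "configured" when the flag is set to true on the
# KubeDescheduler CR, and "misconfigured" when it is missing or has any other
# value (including when the CR cannot be read).
check_descheduler_evictions() {
  value="$(kubectl get -n openshift-kube-descheduler-operator \
    KubeDescheduler cluster \
    -o jsonpath='{.spec.profileCustomizations.devEnableEvictionsInBackground}' \
    2>/dev/null)"
  if [ "$value" = "true" ]; then
    echo "configured"
  else
    echo "misconfigured"
  fi
}
```

Calling `check_descheduler_evictions` gives a one-word answer that can be used
in a script or a quick manual check.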

## Mitigation

Set the following on the CR for the `Kube Descheduler Operator`:

```yaml
spec:
  profileCustomizations:
    devEnableEvictionsInBackground: true
```

Alternatively, remove the `Kube Descheduler Operator`.
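
One possible way to apply this setting is a merge patch with `kubectl patch`.
This is a sketch that assumes the CR is named `cluster` in the
`openshift-kube-descheduler-operator` namespace, as in the diagnosis command;
the function name is hypothetical.

```bash
# Hypothetical helper: merge-patches the KubeDescheduler CR so that
# spec.profileCustomizations.devEnableEvictionsInBackground is true.
# A merge patch leaves other fields of the CR untouched.
enable_background_evictions() {
  kubectl patch -n openshift-kube-descheduler-operator \
    KubeDescheduler cluster \
    --type=merge \
    -p '{"spec":{"profileCustomizations":{"devEnableEvictionsInBackground":true}}}'
}
```

After calling `enable_background_evictions`, re-run the diagnosis command to
confirm that the setting is present.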

Please note that `EvictionsInBackground` is an alpha feature:
it is currently provided as an experimental feature and
is subject to change.
<!--DS: If you cannot resolve the issue, log in to the
link:https://access.redhat.com[Customer Portal] and open a support case,
attaching the artifacts gathered during the diagnosis procedure.-->
<!--USstart-->
If you cannot resolve the issue, see the following resources:
- [OKD Help](https://okd.io/docs/community/help)
- [#virtualization Slack channel](https://kubernetes.slack.com/channels/virtualization)
<!--USend-->
95 changes: 48 additions & 47 deletions runbooks_index.md
@@ -1,71 +1,72 @@
# KubeVirt Runbooks

- [SSPTemplateValidatorDown.md](runbooks/SSPTemplateValidatorDown.md)
- [VirtHandlerRESTErrorsHigh.md](runbooks/VirtHandlerRESTErrorsHigh.md)
- [CnaoDown.md](runbooks/CnaoDown.md)
- [HCOInstallationIncomplete.md](runbooks/HCOInstallationIncomplete.md)
- [CnaoNmstateMigration.md](runbooks/CnaoNmstateMigration.md)
- [VirtOperatorRESTErrorsBurst.md](runbooks/VirtOperatorRESTErrorsBurst.md)
- [SSPOperatorDown.md](runbooks/SSPOperatorDown.md)
- [LowVirtAPICount.md](runbooks/LowVirtAPICount.md)
- [OrphanedVirtualMachineInstances.md](runbooks/OrphanedVirtualMachineInstances.md)
- [VMStorageClassWarning.md](runbooks/VMStorageClassWarning.md)
- [VirtControllerDown.md](runbooks/VirtControllerDown.md)
- [KubeVirtCRModified.md](runbooks/KubeVirtCRModified.md)
- [SSPDown.md](runbooks/SSPDown.md)
- [UnsupportedHCOModification.md](runbooks/UnsupportedHCOModification.md)
- [CDIStorageProfilesIncomplete.md](runbooks/CDIStorageProfilesIncomplete.md)
- [HPPOperatorDown.md](runbooks/HPPOperatorDown.md)
- [VirtAPIDown.md](runbooks/VirtAPIDown.md)
- [LowReadyVirtControllersCount.md](runbooks/LowReadyVirtControllersCount.md)
- [CDIOperatorDown.md](runbooks/CDIOperatorDown.md)
- [VMCannotBeEvicted.md](runbooks/VMCannotBeEvicted.md)
- [VirtOperatorDown.md](runbooks/VirtOperatorDown.md)
- [VirtApiRESTErrorsBurst.md](runbooks/VirtApiRESTErrorsBurst.md)
- [VirtHandlerDaemonSetRolloutFailing.md](runbooks/VirtHandlerDaemonSetRolloutFailing.md)
- [CDINotReady.md](runbooks/CDINotReady.md)
- [LowKVMNodesCount.md](runbooks/LowKVMNodesCount.md)
- [HCOMisconfiguredDescheduler.md](runbooks/HCOMisconfiguredDescheduler.md)
- [SSPOperatorDown.md](runbooks/SSPOperatorDown.md)
- [SSPTemplateValidatorDown.md](runbooks/SSPTemplateValidatorDown.md)
- [KubeVirtVMIExcessiveMigrations.md](runbooks/KubeVirtVMIExcessiveMigrations.md)
- [KubeVirtDeprecatedAPIRequested.md](runbooks/KubeVirtDeprecatedAPIRequested.md)
- [VirtOperatorRESTErrorsHigh.md](runbooks/VirtOperatorRESTErrorsHigh.md)
- [CDIDataVolumeUnusualRestartCount.md](runbooks/CDIDataVolumeUnusualRestartCount.md)
- [HPPSharingPoolPathWithOS.md](runbooks/HPPSharingPoolPathWithOS.md)
- [VirtOperatorRESTErrorsBurst.md](runbooks/VirtOperatorRESTErrorsBurst.md)
- [VMCannotBeEvicted.md](runbooks/VMCannotBeEvicted.md)
- [CDIMultipleDefaultVirtStorageClasses.md](runbooks/CDIMultipleDefaultVirtStorageClasses.md)
- [HPPNotReady.md](runbooks/HPPNotReady.md)
- [LowReadyVirtOperatorsCount.md](runbooks/LowReadyVirtOperatorsCount.md)
- [VirtOperatorRESTErrorsHigh.md](runbooks/VirtOperatorRESTErrorsHigh.md)
- [NoReadyVirtOperator.md](runbooks/NoReadyVirtOperator.md)
- [KubevirtVmHighMemoryUsage.md](runbooks/KubevirtVmHighMemoryUsage.md)
- [SingleStackIPv6Unsupported.md](runbooks/SingleStackIPv6Unsupported.md)
- [NoLeadingVirtOperator.md](runbooks/NoLeadingVirtOperator.md)
- [VirtApiRESTErrorsHigh.md](runbooks/VirtApiRESTErrorsHigh.md)
- [VirtControllerRESTErrorsHigh.md](runbooks/VirtControllerRESTErrorsHigh.md)
- [UnsupportedHCOModification.md](runbooks/UnsupportedHCOModification.md)
- [CDINoDefaultStorageClass.md](runbooks/CDINoDefaultStorageClass.md)
- [LowReadyVirtControllersCount.md](runbooks/LowReadyVirtControllersCount.md)
- [LowVirtOperatorCount.md](runbooks/LowVirtOperatorCount.md)
- [HPPOperatorDown.md](runbooks/HPPOperatorDown.md)
- [VirtHandlerRESTErrorsHigh.md](runbooks/VirtHandlerRESTErrorsHigh.md)
- [KubeVirtCRModified.md](runbooks/KubeVirtCRModified.md)
- [CDIStorageProfilesIncomplete.md](runbooks/CDIStorageProfilesIncomplete.md)
- [KubeVirtNoAvailableNodesToRunVMs.md](runbooks/KubeVirtNoAvailableNodesToRunVMs.md)
- [SSPCommonTemplatesModificationReverted.md](runbooks/SSPCommonTemplatesModificationReverted.md)
- [CDIDataVolumeUnusualRestartCount.md](runbooks/CDIDataVolumeUnusualRestartCount.md)
- [NoReadyVirtController.md](runbooks/NoReadyVirtController.md)
- [OutdatedVirtualMachineInstanceWorkloads.md](runbooks/OutdatedVirtualMachineInstanceWorkloads.md)
- [KubeMacPoolDuplicateMacsFound.md](runbooks/KubeMacPoolDuplicateMacsFound.md)
- [SSPHighRateRejectedVms.md](runbooks/SSPHighRateRejectedVms.md)
- [OrphanedVirtualMachineInstances.md](runbooks/OrphanedVirtualMachineInstances.md)
- [VirtHandlerDaemonSetRolloutFailing.md](runbooks/VirtHandlerDaemonSetRolloutFailing.md)
- [SSPFailingToReconcile.md](runbooks/SSPFailingToReconcile.md)
- [CDIDefaultStorageClassDegraded.md](runbooks/CDIDefaultStorageClassDegraded.md)
- [KubevirtVmHighMemoryUsage.md](runbooks/KubevirtVmHighMemoryUsage.md)
- [NoLeadingVirtOperator.md](runbooks/NoLeadingVirtOperator.md)
- [VirtControllerRESTErrorsBurst.md](runbooks/VirtControllerRESTErrorsBurst.md)
- [VirtControllerDown.md](runbooks/VirtControllerDown.md)
- [CnaoDown.md](runbooks/CnaoDown.md)
- [LowVirtControllersCount.md](runbooks/LowVirtControllersCount.md)
- [CDIDataImportCronOutdated.md](runbooks/CDIDataImportCronOutdated.md)
- [HPPNotReady.md](runbooks/HPPNotReady.md)
- [CDIMultipleDefaultVirtStorageClasses.md](runbooks/CDIMultipleDefaultVirtStorageClasses.md)
- [LowKVMNodesCount.md](runbooks/LowKVMNodesCount.md)
- [KubemacpoolDown.md](runbooks/KubemacpoolDown.md)
- [HPPSharingPoolPathWithOS.md](runbooks/HPPSharingPoolPathWithOS.md)
- [VirtAPIDown.md](runbooks/VirtAPIDown.md)
- [VirtApiRESTErrorsBurst.md](runbooks/VirtApiRESTErrorsBurst.md)
- [CDINotReady.md](runbooks/CDINotReady.md)
- [VirtOperatorDown.md](runbooks/VirtOperatorDown.md)
- [VirtHandlerRESTErrorsBurst.md](runbooks/VirtHandlerRESTErrorsBurst.md)
- [KubeMacPoolDuplicateMacsFound.md](runbooks/KubeMacPoolDuplicateMacsFound.md)
- [KubemacpoolDown.md](runbooks/KubemacpoolDown.md)
- [NetworkAddonsConfigNotReady.md](runbooks/NetworkAddonsConfigNotReady.md)
- [KubeVirtDeprecatedAPIRequested.md](runbooks/KubeVirtDeprecatedAPIRequested.md)
- [NoReadyVirtController.md](runbooks/NoReadyVirtController.md)
- [VirtControllerRESTErrorsHigh.md](runbooks/VirtControllerRESTErrorsHigh.md)
- [LowVirtControllersCount.md](runbooks/LowVirtControllersCount.md)
- [VirtApiRESTErrorsHigh.md](runbooks/VirtApiRESTErrorsHigh.md)
- [CnaoNmstateMigration.md](runbooks/CnaoNmstateMigration.md)
- [SingleStackIPv6Unsupported.md](runbooks/SingleStackIPv6Unsupported.md)
- [SSPCommonTemplatesModificationReverted.md](runbooks/SSPCommonTemplatesModificationReverted.md)
- [SSPFailingToReconcile.md](runbooks/SSPFailingToReconcile.md)
- [CDIDefaultStorageClassDegraded.md](runbooks/CDIDefaultStorageClassDegraded.md)
- [OutdatedVirtualMachineInstanceWorkloads.md](runbooks/OutdatedVirtualMachineInstanceWorkloads.md)
- [LowVirtOperatorCount.md](runbooks/LowVirtOperatorCount.md)
- [CDINoDefaultStorageClass.md](runbooks/CDINoDefaultStorageClass.md)
- [LowVirtAPICount.md](runbooks/LowVirtAPICount.md)

## Deprecated Runbooks

- [KubeVirtVMStuckInErrorState.md](runbooks/KubeVirtVMStuckInErrorState.md)
- [KubeVirtVMStuckInStartingState.md](runbooks/KubeVirtVMStuckInStartingState.md)
- [VirtualMachineCRCErrors.md](runbooks/VirtualMachineCRCErrors.md)
- [KubeMacPoolDown.md](runbooks/KubeMacPoolDown.md)
- [KubeVirtComponentExceedsRequestedCPU.md](runbooks/KubeVirtComponentExceedsRequestedCPU.md)
- [KubevirtHyperconvergedClusterOperatorNMOInUseAlert.md](runbooks/KubevirtHyperconvergedClusterOperatorNMOInUseAlert.md)
- [KubeVirtComponentExceedsRequestedMemory.md](runbooks/KubeVirtComponentExceedsRequestedMemory.md)
- [KubeMacPoolDown.md](runbooks/KubeMacPoolDown.md)
- [KubeVirtVMStuckInMigratingState.md](runbooks/KubeVirtVMStuckInMigratingState.md)
- [KubeVirtComponentExceedsRequestedCPU.md](runbooks/KubeVirtComponentExceedsRequestedCPU.md)
- [VirtualMachineCRCErrors.md](runbooks/VirtualMachineCRCErrors.md)
- [KubeVirtComponentExceedsRequestedMemory.md](runbooks/KubeVirtComponentExceedsRequestedMemory.md)
- [KubeVirtVMStuckInStartingState.md](runbooks/KubeVirtVMStuckInStartingState.md)

## Renamed Runbooks
