Add HCOMisconfiguredDescheduler runbook #266
551ea83 to bb727cf
<!--USstart-->
If you cannot resolve the issue, see the following resources:

- [OKD Help](https://www.okd.io/help/)
page not found
Oh, it moved to https://okd.io/docs/community/help.
I'm fixing it only here for now, but we have to fix it (in another PR) in many other runbooks:
docs/deprecated_runbooks/KubeMacPoolDown.md:- OKD Help
docs/deprecated_runbooks/KubevirtHyperconvergedClusterOperatorNMOInUseAlert.md:- OKD Help
docs/deprecated_runbooks/VirtualMachineCRCErrors.md:- OKD Help
docs/runbooks/CDIDataImportCronOutdated.md:- OKD Help
docs/runbooks/CDIDataVolumeUnusualRestartCount.md:- OKD Help
docs/runbooks/CDIDefaultStorageClassDegraded.md:- OKD Help
docs/runbooks/CDIMultipleDefaultVirtStorageClasses.md:- OKD Help
docs/runbooks/CDINoDefaultStorageClass.md:- OKD Help
docs/runbooks/CDINotReady.md:- OKD Help
docs/runbooks/CDIOperatorDown.md:- OKD Help
docs/runbooks/CDIStorageProfilesIncomplete.md:- OKD Help
docs/runbooks/CnaoDown.md:- OKD Help
docs/runbooks/HPPNotReady.md:- OKD Help
docs/runbooks/HPPOperatorDown.md:- OKD Help
docs/runbooks/HPPSharingPoolPathWithOS.md:- OKD Help
docs/runbooks/KubeVirtDeprecatedAPIRequested.md:- OKD Help
docs/runbooks/KubeVirtNoAvailableNodesToRunVMs.md:- OKD Help
docs/runbooks/KubeVirtVMIExcessiveMigrations.md:- OKD Help
docs/runbooks/KubemacpoolDown.md:- OKD Help
docs/runbooks/LowReadyVirtControllersCount.md:- OKD Help
docs/runbooks/LowReadyVirtOperatorsCount.md:- OKD Help
docs/runbooks/LowVirtAPICount.md:- OKD Help
docs/runbooks/LowVirtControllersCount.md:- OKD Help
docs/runbooks/LowVirtOperatorCount.md:- OKD Help
docs/runbooks/NetworkAddonsConfigNotReady.md:- OKD Help
docs/runbooks/NoLeadingVirtOperator.md:- OKD Help
docs/runbooks/NoReadyVirtController.md:- OKD Help
docs/runbooks/NoReadyVirtOperator.md:- OKD Help
docs/runbooks/OrphanedVirtualMachineInstances.md:- OKD Help
docs/runbooks/OutdatedVirtualMachineInstanceWorkloads.md:- OKD Help
docs/runbooks/SSPDown.md:- OKD Help
docs/runbooks/SSPFailingToReconcile.md:- OKD Help
docs/runbooks/SSPHighRateRejectedVms.md:- OKD Help
docs/runbooks/SSPOperatorDown.md:- OKD Help
docs/runbooks/SSPTemplateValidatorDown.md:- OKD Help
docs/runbooks/VMStorageClassWarning.md:- OKD Help
docs/runbooks/VirtAPIDown.md:- OKD Help
docs/runbooks/VirtApiRESTErrorsBurst.md:- OKD Help
docs/runbooks/VirtApiRESTErrorsHigh.md:- OKD Help
docs/runbooks/VirtControllerDown.md:- OKD Help
docs/runbooks/VirtControllerRESTErrorsBurst.md:- OKD Help
docs/runbooks/VirtControllerRESTErrorsHigh.md:- OKD Help
docs/runbooks/VirtHandlerRESTErrorsBurst.md:- OKD Help
docs/runbooks/VirtHandlerRESTErrorsHigh.md:- OKD Help
docs/runbooks/VirtOperatorDown.md:- OKD Help
docs/runbooks/VirtOperatorRESTErrorsBurst.md:- OKD Help
docs/runbooks/VirtOperatorRESTErrorsHigh.md:- OKD Help
@tiraboschi is it possible to add a redirect?
Not really sure, we should try asking on https://github.com/okd-project/okd-web/
Signed-off-by: Simone Tiraboschi <stirabos@redhat.com>
bb727cf to bff087e
…nfiguredDescheduler Add HCOMisconfiguredDescheduler runbook
What this PR does / why we need it:
Add a runbook for `HCOMisconfiguredDescheduler`.
A descheduler is a Kubernetes application that causes the control plane to rearrange workloads in a better way.
It runs at a predefined interval and goes back to sleep after it has performed its job.
The descheduler uses the Kubernetes eviction API to evict pods, and receives feedback from `kube-api` on whether the eviction request was granted.
KubeVirt, on the other hand, handles eviction requests in a custom way in order to keep the VM alive and trigger a live migration, and unfortunately a live migration takes time.
So, from the descheduler's point of view, `virt-launcher` pods fail to be evicted, while they are actually migrating to another node in the background.
The descheduler notes the failure to evict the `virt-launcher` pod and keeps trying to evict other pods, which typically results in an attempt to evict substantially all `virt-launcher` pods from the node, triggering a migration storm.
In other words, the way KubeVirt handles eviction requests causes the descheduler to make wrong decisions and take wrong actions that could destabilize the cluster.
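The feedback loop described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the function `run_cycle`, its parameters, and the pod names are hypothetical, and the loop is not the descheduler's real algorithm. It only shows why treating "migration started" as a failed eviction makes the descheduler walk through every pod on the node.

```python
def run_cycle(pods, target, counts_background):
    """Simulate one descheduling cycle on a single node (toy model).

    pods: names of the virt-launcher pods on the node (hypothetical input).
    target: how many pods the descheduler wants to move off the node.
    counts_background: whether an eviction that proceeds in the background
        (a live migration) counts toward the target.
    Returns the pods for which a live migration was triggered.
    """
    migrating = []
    satisfied = 0
    for pod in pods:
        if satisfied >= target:
            break
        # KubeVirt does not delete the pod right away: it starts a live
        # migration, so the eviction request looks like a failure.
        migrating.append(pod)
        if counts_background:
            # The descheduler knows the eviction is in progress.
            satisfied += 1
    return migrating


pods = [f"virt-launcher-{i}" for i in range(5)]
storm = run_cycle(pods, target=1, counts_background=False)
calm = run_cycle(pods, target=1, counts_background=True)
print(len(storm), len(calm))
```

In this model, a descheduler that only wanted to move one pod ends up triggering a migration for all five when background evictions are not counted, and for exactly one when they are.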
Using the descheduler operator with the `LowNodeUtilization` strategy results in unstable/oscillatory behavior if the descheduler is used in this way to migrate VMs.
To correctly handle the special case of a `VM` pod whose eviction triggers a live migration to another node, the `Kube Descheduler Operator` introduced a `profileCustomizations` named `devEnableEvictionsInBackground`, which is currently considered an `alpha` feature on the `Kube Descheduler Operator` side.
To prevent unexpected behaviours, if the `Kube Descheduler Operator` is installed and configured alongside `HCO`, `HCO` will check its configuration, looking for the presence of the `devEnableEvictionsInBackground` `profileCustomizations`, and, where needed, suggest that the cluster admin fix the configuration of the `Kube Descheduler Operator` via an `alert` and its linked `runbook`.
In order to fix the configuration of the `Kube Descheduler Operator` so that it is also suitable for the KubeVirt use case, something like:
should be merged in its configuration.
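As a rough sketch of the check and the merge described above, the following assumes that the operator's configuration can be treated as a nested mapping with `devEnableEvictionsInBackground` stored as a boolean under `spec.profileCustomizations`. That layout, the value `True`, the sample `existing` configuration, and the helper names `deep_merge` and `is_misconfigured` are all assumptions made for illustration; they are not taken from the PR text or from HCO's code.

```python
import copy

# Hypothetical fragment to merge; field layout and value are assumptions.
FRAGMENT = {
    "spec": {"profileCustomizations": {"devEnableEvictionsInBackground": True}}
}


def deep_merge(base, patch):
    """Return a copy of `base` with `patch` merged in recursively."""
    merged = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def is_misconfigured(descheduler):
    """Flag a configuration that lacks the customization (illustrative check)."""
    spec = descheduler.get("spec") or {}
    customizations = spec.get("profileCustomizations") or {}
    return not customizations.get("devEnableEvictionsInBackground", False)


# Hypothetical existing configuration; keys and values are placeholders.
existing = {"spec": {"profiles": ["ExampleProfile"], "intervalSeconds": 3600}}

fixed = deep_merge(existing, FRAGMENT)
print(is_misconfigured(existing), is_misconfigured(fixed))
```

The merge is deliberately non-destructive: settings already present in the configuration are preserved, and only the missing customization is added.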
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes https://issues.redhat.com/browse/CNV-48734
Special notes for your reviewer:
It's a runbook for kubevirt/hyperconverged-cluster-operator#3100
Checklist
This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.
Release note: