KEP-5471 Extended Toleration Operators for Threshold-Based Placement #5473

helayoty · 2025-08-11T21:48:14Z

One-line PR description: Add numeric comparison operators (Lt, Gt) to Tolerations for SLA-based scheduling with threshold-based taint matching.

Issue link: Extended Toleration Operators for Threshold-Based Placement #5471

Other comments: cc @kubernetes/sig-scheduling-misc @kubernetes/sig-apps-misc

keps/sig-scheduling/5471-enable-sla-based-scheduling/README.md

macsko · 2025-08-22T11:15:40Z

/cc @dom4ha @sanposhiho

macsko · 2025-10-14T11:48:59Z

/lgtm

stlaz · 2025-10-14T13:59:49Z

keps/sig-scheduling/5471-enable-sla-based-scheduling/README.md

+- Upgrade
+  - Enable the feature gate in both API Server and Scheduler.
+- Downgrade
+  - Disable the feature gate in both API Server and Scheduler


What's the correct order of the components to enable the feature gate, then? First the kube-apiserver, then the scheduler? Is the downgrade ordering the same?

stlaz · 2025-10-14T14:06:26Z

keps/sig-scheduling/5471-enable-sla-based-scheduling/README.md

+
+Impact on existing pods with Gt/Lt operators when feature is disabled:
+
+1. **Already-running pods**: Continue running normally. The kubelet doesn't need to re-evaluate tolerations for running pods.


What if somebody wants to update one of the pod's mutable fields/annotations?

As stated in (4.), the user won't be able to update the pod at all, not even mutable fields like annotations or labels.

keps/sig-scheduling/5471-enable-sla-based-scheduling/README.md

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

soltysh

Minor nits, but PRR is mostly complete for alpha.

soltysh · 2025-10-15T11:31:02Z

keps/sig-scheduling/5471-enable-sla-based-scheduling/README.md

+  name: flexible-sla-workload
+spec:
+  tolerations:
+  # Accept nodes with SLA >= 900 (SLA = 900 OR SLA > 900)


Nit: but for consistency Gt is not SLA >= 900, it's SLA > 900, right?
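
To make the strict-versus-inclusive distinction concrete, here is a minimal Go sketch. It assumes the taint value is compared against the toleration value as a 64-bit integer. The `Operator` type and the `matchesNumeric` helper are hypothetical stand-ins for illustration, not the actual core/v1 code.

```go
package main

import (
	"fmt"
	"strconv"
)

// Operator is a stand-in type for illustration, not the real core/v1 type.
type Operator string

const (
	OpLt Operator = "Lt"
	OpGt Operator = "Gt"
)

// matchesNumeric is a hypothetical helper: it reports whether a taint value
// satisfies a numeric toleration. Both comparisons are strict, so a taint
// value equal to the toleration value does not match.
func matchesNumeric(op Operator, tolerationValue, taintValue string) bool {
	want, err := strconv.ParseInt(tolerationValue, 10, 64)
	if err != nil {
		return false
	}
	got, err := strconv.ParseInt(taintValue, 10, 64)
	if err != nil {
		return false
	}
	switch op {
	case OpGt:
		return got > want
	case OpLt:
		return got < want
	default:
		return false
	}
}

func main() {
	for _, taint := range []string{"899", "900", "901"} {
		fmt.Printf("sla=%s tolerated by Gt 900: %v\n", taint, matchesNumeric(OpGt, "900", taint))
	}
}
```

Under these assumptions a node with `sla=900` is not tolerated by `Gt 900`, so the YAML comment would read "SLA > 900".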

soltysh · 2025-10-15T11:34:58Z

keps/sig-scheduling/5471-enable-sla-based-scheduling/README.md

+
+Extend **core/v1 Toleration** to support **numeric comparison operators** when matching **Node Taints**:
+
+- New operators: `Lt`, `Gt` (in addition to existing `Equal`/`Exists`).


Maybe worth adding that we already use Lt and Gt for node selectors, so our users are familiar with these.

soltysh · 2025-10-15T11:43:19Z

keps/sig-scheduling/5471-enable-sla-based-scheduling/README.md

+- Parse integers only when new operators are used.
+- Existing `Equal`/`Exists` operators execute identical code paths with no additional overhead.
+- Consider caching parsed values in scheduler data structures if performance issues arise
+- Feature gate allows disabling if performance problems occur


There's one additional important mitigation: everybody using numeric values today can ONLY use the currently available operators. Using the numeric operators therefore requires, at minimum, changing the operator, at which point validation should kick in and catch the problem. So I hope this should not be a problem. The open question is what kind of validation currently exists around the operators: if only Exists and Equal are allowed you should be good; if the validation is not that strict, the risk is real.

The current validation is strict and it explicitly rejects any operator that isn't Equal or Exists. So I believe this mitigation is good, wdyt?

That's great, just add that information to this doc in that case. Strict is always good and helps when we're expanding functionality, like here 😄
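
The strict-validation argument above can be sketched in Go as follows. This is a simplified illustration, not the real API server validation: `validateOperator` is a hypothetical function, the `numericOpsEnabled` parameter stands in for the feature gate, and the integer check on the value is an assumption about what the new validation would enforce.

```go
package main

import (
	"fmt"
	"strconv"
)

// validateOperator is a hypothetical validator. With the gate off it accepts
// only Equal and Exists, mirroring the strict behavior described above.
// With the gate on it also accepts Lt and Gt, but only with an integer value.
func validateOperator(op, value string, numericOpsEnabled bool) error {
	switch op {
	case "Equal", "Exists":
		return nil
	case "Lt", "Gt":
		if !numericOpsEnabled {
			return fmt.Errorf("unsupported operator %q", op)
		}
		if _, err := strconv.ParseInt(value, 10, 64); err != nil {
			return fmt.Errorf("operator %q requires an integer value, got %q", op, value)
		}
		return nil
	default:
		return fmt.Errorf("unsupported operator %q", op)
	}
}

func main() {
	fmt.Println(validateOperator("Equal", "950", false)) // existing operator, always accepted
	fmt.Println(validateOperator("Gt", "900", false))    // rejected while the gate is off
	fmt.Println(validateOperator("Gt", "high", true))    // rejected: value is not an integer
	fmt.Println(validateOperator("Gt", "900", true))     // accepted
}
```

Because existing objects can only carry `Equal` or `Exists`, switching to a numeric operator always passes through this check, which is the point of the mitigation.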

soltysh · 2025-10-15T11:46:04Z

keps/sig-scheduling/5471-enable-sla-based-scheduling/README.md

+
+- Clear documentation and examples showing proper numeric taint configuration
+- Enhanced error messages in scheduling events that clearly indicate parsing failures
+- Users can use the metric to set up alerts and monitoring.


How will a pod with a numeric operator be dealt with in this situation? In other words, if the node has node.kubernetes.io/sla=high and the pod has Gt 900, what happens? Are you going to fail the pod? Are you planning to fall back to the previous behavior?

The pod isn't rejected entirely; its toleration just won't match that particular taint. I've updated the Notes/Constraints/Caveats section to clarify this case and also updated the Taint Misconfiguration Detection risk case.

soltysh · 2025-10-15T11:50:56Z

keps/sig-scheduling/5471-enable-sla-based-scheduling/README.md

+- The toleration filter returns `false` (doesn't match)
+- Pod is considered to have untolerated taints
+- Filter returns `UnschedulableAndUnresolvable` status
+- Pod remains in Pending state.


I believe this answers my previous question about the new operators and how they are treated.
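
As an illustration of the behavior listed in the quoted hunk, the following Go sketch models a node filter in which a non-numeric taint value simply fails to match. All types and functions here are hypothetical simplifications: taint effects are omitted, only the `Gt` case is handled, and the status is a plain string rather than the scheduler framework's status type.

```go
package main

import (
	"fmt"
	"strconv"
)

type Taint struct{ Key, Value string }

type Toleration struct{ Key, Operator, Value string }

// Status codes modeled as plain strings for this sketch.
const (
	Success                      = "Success"
	UnschedulableAndUnresolvable = "UnschedulableAndUnresolvable"
)

// tolerates is a hypothetical matcher covering only the Gt case for brevity.
func tolerates(tol Toleration, taint Taint) bool {
	if tol.Key != taint.Key || tol.Operator != "Gt" {
		return false
	}
	want, err := strconv.ParseInt(tol.Value, 10, 64)
	if err != nil {
		return false
	}
	got, err := strconv.ParseInt(taint.Value, 10, 64)
	if err != nil {
		// A non-numeric taint value such as "high" is treated as "no match".
		return false
	}
	return got > want
}

// filterNode returns Success only if every taint is tolerated by at least
// one toleration; otherwise the node is reported as unusable for this pod.
func filterNode(taints []Taint, tols []Toleration) string {
	for _, taint := range taints {
		tolerated := false
		for _, tol := range tols {
			if tolerates(tol, taint) {
				tolerated = true
				break
			}
		}
		if !tolerated {
			return UnschedulableAndUnresolvable
		}
	}
	return Success
}

func main() {
	tols := []Toleration{{Key: "node.kubernetes.io/sla", Operator: "Gt", Value: "900"}}
	misconfigured := []Taint{{Key: "node.kubernetes.io/sla", Value: "high"}}
	healthy := []Taint{{Key: "node.kubernetes.io/sla", Value: "950"}}
	fmt.Println(filterNode(misconfigured, tols))
	fmt.Println(filterNode(healthy, tols))
}
```

In this model the pod is never rejected outright. It stays Pending only if no node passes the filter.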

soltysh · 2025-10-15T11:52:41Z

keps/sig-scheduling/5471-enable-sla-based-scheduling/README.md

+4. **General Scheduler Tests:** (`scheduler_test.go`):
+   - Dynamic taint addition/removal
+   - Pod rescheduling after taint changes
+   - Integration with NodeAffinity


Further down in the doc you mention feature gate on/off tests; can you mention them here as well?

soltysh · 2025-10-15T11:54:31Z

keps/sig-scheduling/5471-enable-sla-based-scheduling/README.md

+   - Force deletion may be required: `kubectl delete pod <name> --force --grace-period=0`
+3. **Workload controllers** (Deployments, StatefulSets, etc.):
+   - If the pod template uses Gt/Lt operators, the controller cannot create new pods
+   - Rolling updates will fail


I believe this risk wasn't mentioned earlier. If any of the controllers is trying to use the disabled operators, the controller will hot-loop, trying to create a pod that will always fail validation.

Added this risk to the Risks and Mitigations section.

soltysh · 2025-10-15T11:58:25Z

keps/sig-scheduling/5471-enable-sla-based-scheduling/README.md

+     - Users might set wrong field or both fields accidentally
+     - Complex validation logic for field combinations
+     - Memory/storage overhead for additional field
+     - API complexity and documentation burden


I'll play devil's advocate: have you considered using the current mechanism, so that it works with the existing operators only? In other words, a node can publish node.kubernetes.io/sla=950, and pods will just use sla Equal 950. What are the pros and cons of such an approach?

I like this. Added this alternative with all pros/cons. PTAL
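
One way to picture the trade-off of the Equal-only alternative is the Go sketch below. It relies only on the fact that `Equal` matches a taint value exactly. The helper names and the SLA values are hypothetical, and the sketch is not taken from the KEP's own pros/cons list.

```go
package main

import (
	"fmt"
	"strconv"
)

type Toleration struct{ Key, Operator, Value string }

// equalTolerations builds one Equal toleration per accepted value, which is
// what a pod would need in order to accept several SLA levels without Gt/Lt.
func equalTolerations(key string, accepted []int) []Toleration {
	out := make([]Toleration, 0, len(accepted))
	for _, v := range accepted {
		out = append(out, Toleration{Key: key, Operator: "Equal", Value: strconv.Itoa(v)})
	}
	return out
}

// toleratedByEqual reports whether any Equal toleration matches the taint
// value exactly (string equality, no numeric interpretation).
func toleratedByEqual(tols []Toleration, key, taintValue string) bool {
	for _, t := range tols {
		if t.Operator == "Equal" && t.Key == key && t.Value == taintValue {
			return true
		}
	}
	return false
}

func main() {
	const key = "node.kubernetes.io/sla"
	tols := equalTolerations(key, []int{950, 990, 999})
	fmt.Println("tolerations needed:", len(tols))
	fmt.Println("sla=950 tolerated:", toleratedByEqual(tols, key, "950"))
	fmt.Println("sla=975 tolerated:", toleratedByEqual(tols, key, "975"))
}
```

With exact matching, a node that publishes a value the pod did not enumerate (975 here) is not tolerated even though it is above 900, whereas a single `Gt 900` toleration would cover it.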

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

soltysh

/approve
the PRR section

soltysh · 2025-10-15T18:55:39Z

/label tide/merge-method-squash

sanposhiho · 2025-10-16T02:34:48Z

/lgtm
/approve

k8s-ci-robot · 2025-10-16T02:34:59Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: helayoty, sanposhiho, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~keps/prod-readiness/OWNERS~~ [soltysh]
~~keps/sig-scheduling/OWNERS~~ [sanposhiho]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/apps Categorizes an issue or PR as relevant to SIG Apps. labels Aug 11, 2025

github-project-automation bot added this to SIG Apps and SIG Scheduling Aug 11, 2025

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 11, 2025

github-project-automation bot moved this to Needs Triage in SIG Apps Aug 11, 2025

k8s-ci-robot added the kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory label Aug 11, 2025

k8s-ci-robot requested a review from dom4ha August 11, 2025 21:48

github-project-automation bot moved this to Needs Triage in SIG Scheduling Aug 11, 2025

k8s-ci-robot requested a review from macsko August 11, 2025 21:48

k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Aug 11, 2025

This was referenced Aug 9, 2025

Extended Toleration Operators for Threshold-Based Placement #5471 (Open)

Allow nodes to declare failure probability/SLA kubernetes/kubernetes#118669 (Closed)

jackfrancis reviewed Aug 11, 2025

jackfrancis reviewed Aug 11, 2025

SergeyKanzhelev reviewed Aug 11, 2025

SergeyKanzhelev reviewed Aug 11, 2025

everpeace reviewed Aug 12, 2025

macsko reviewed Aug 14, 2025

helayoty force-pushed the helayoty/enable-sla-based-schedule branch from 2a36559 to c9e75ba Compare August 15, 2025 23:18

helayoty requested review from everpeace, jackfrancis and nojnhuh August 15, 2025 23:18

helayoty moved this from Needs Triage to In Progress in SIG Scheduling Aug 15, 2025

helayoty requested review from SergeyKanzhelev and macsko August 16, 2025 00:13

everpeace reviewed Aug 18, 2025

k8s-ci-robot requested a review from sanposhiho August 22, 2025 11:15

helayoty requested a review from everpeace August 22, 2025 15:39

helayoty requested a review from stlaz October 13, 2025 17:03

k8s-ci-robot assigned macsko Oct 14, 2025

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 14, 2025

stlaz reviewed Oct 14, 2025

address upgrade/roolback feedback

753e147

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 14, 2025

helayoty requested a review from stlaz October 14, 2025 16:42

soltysh reviewed Oct 15, 2025

Address PRR comments

dac7791

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 15, 2025

helayoty requested a review from soltysh October 15, 2025 14:32

helayoty added 2 commits October 15, 2025 15:14

Remove taints from discoverable options

20465c4

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

Add fearture gate to integration tests

c35017a

Signed-off-by: Heba Elayoty <heelayot@microsoft.com>

soltysh approved these changes Oct 15, 2025

k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Oct 15, 2025

k8s-ci-robot assigned sanposhiho Oct 16, 2025

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 16, 2025

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 16, 2025

k8s-ci-robot merged commit 2c685f9 into kubernetes:master Oct 16, 2025
4 checks passed

github-project-automation bot moved this from Needs Review to Done in SIG Apps Oct 16, 2025

github-project-automation bot moved this from In Progress to Done in SIG Scheduling Oct 16, 2025

k8s-ci-robot added this to the v1.35 milestone Oct 16, 2025

This was referenced Oct 20, 2025

KEP-5471: Extend tolerations operators kubernetes/kubernetes#134665 (Merged)

KEP-5471: Remove zero leading risk and update checklist #5663 (Closed)

helayoty deleted the helayoty/enable-sla-based-schedule branch November 18, 2025 04:36


