update DNS programming latency SLI #7756

Merged (3 commits) on Mar 14, 2024

aojea (Member) commented Mar 13, 2024

Update the SLI to reflect the DNS latency expectations for headless services, which have a high impact on AI/ML workloads that make heavy use of headless services and DNS. See kubeflow/mpi-operator#611 (comment)

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 13, 2024
@k8s-ci-robot k8s-ci-robot added the sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. label Mar 13, 2024
aojea (Member, Author) commented Mar 13, 2024

/assign @wojtek-t @thockin @jkaniuk

@@ -37,6 +39,10 @@ The reason for doing it this way is feasibility for efficiently computing that:
in 99% of programmers (e.g. iptables). That requires tracking metrics on
per-change base (which we can't do efficiently).

- The SLI is expected to remain constant independently of the number of records, per
thockin (Member), Mar 13, 2024

What's the implication of this? Scheduler has a finite throughput. Nodes have finite bandwidth.

If I start a 5 pod headless-service, it's reasonable to expect that DNS for the 5th pod is very very soon after DNS for the 1st:
t0: scale RS to 5
t1: RS controller creates pod 1
t2: scheduler schedules pod 1
t3: kubelet downloads image
t4: kubelet runs pod
t5: runtime assigns an IP
t6: kubelet reports the IP
t7: endpointslice controller observes IP and updates EPSlices
t8: DNS observes EPSlices and updates DNS

t1 through t8 happen 5 times, roughly concurrently, and the total is likely bounded by image download time.

Change that to 5000 and now you are bounded by the scheduler's throughput. Is DNS not allowed to publish the 1st IP until the last pod is started? How does it know which one is last?

Edit:

Or did you mean something like "The time between pod-started-and-IP-assigned and available-in-DNS should not be significantly different for the 1st vs. last pod"? That must be what you meant...
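
To make the distinction in the comment above concrete, here is a minimal sketch of a toy model, not a measurement of any real cluster. The function names, the scheduler rate, the startup time and the DNS delay are all hypothetical values chosen for illustration. It shows that the end-to-end time to publish the last pod grows with the number of pods, while the per-pod interval between "IP reported" (t6) and "DNS updated" (t8) can stay the same for the first and last pod.

```python
from dataclasses import dataclass


@dataclass
class PodTimeline:
    ip_reported: float    # t6: kubelet reports the pod IP (seconds after scale-up)
    dns_published: float  # t8: DNS serves the IP in the service's records


def simulate(num_pods, scheduler_rate=100.0, startup=10.0, dns_delay=2.0):
    """Toy model with assumed numbers: the scheduler places `scheduler_rate`
    pods per second, each pod needs `startup` seconds until its IP is
    reported, and DNS publishes the IP `dns_delay` seconds after that."""
    timelines = []
    for i in range(num_pods):
        scheduled = i / scheduler_rate          # bounded by scheduler throughput
        ip_reported = scheduled + startup
        timelines.append(PodTimeline(ip_reported, ip_reported + dns_delay))
    return timelines


def dns_latency(timeline):
    """Per-pod interval from IP reported (t6) to DNS published (t8)."""
    return timeline.dns_published - timeline.ip_reported


for n in (5, 5000):
    pods = simulate(n)
    print(
        f"pods={n} "
        f"last_published_at={pods[-1].dns_published:.2f}s "
        f"first_latency={dns_latency(pods[0]):.2f}s "
        f"last_latency={dns_latency(pods[-1]):.2f}s"
    )
```

In this model the time at which the last pod appears in DNS depends on the scheduler rate, but the per-pod latency does not, which is the reading of the SLI proposed in the edit above.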

aojea (Member, Author)

> "The time between pod-started-and-IP-assigned and available-in-DNS should not be significantly different for the 1st vs. last pod"? That must be what you meant...

Yes, this. If you can give me the right sentence in English to make this clearer, please add it as a suggestion.

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 13, 2024
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 14, 2024
Co-authored-by: Tim Hockin <thockin@google.com>
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 14, 2024
wojtek-t (Member) left a comment

LGTM modulo the nit that is failing presubmit

@@ -37,6 +39,12 @@ The reason for doing it this way is feasibility for efficiently computing that:
in 99% of programmers (e.g. iptables). That requires tracking metrics on
per-change base (which we can't do efficiently).

- The SLI for DNS publishing should remain constant independent of the number of records.
For example, in a headless service with thousands of pods the time between the pod being
assigned an IP and the time DNS makes that IP availabe in the service's A/AAAA record(s)
Member


nit: available

[but it's failing presubmit]

thockin (Member) left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 14, 2024
thockin (Member) commented Mar 14, 2024

/approve

wojtek-t (Member)

/lgtm
/approve

k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea, thockin, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 14, 2024
@k8s-ci-robot k8s-ci-robot merged commit e9f78a3 into kubernetes:master Mar 14, 2024
3 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

5 participants