update DNS programming latency SLI #7756
Conversation
@@ -37,6 +39,10 @@ The reason for doing it this way is feasibility for efficiently computing that:
in 99% of programmers (e.g. iptables). That requires tracking metrics on
per-change base (which we can't do efficiently).

- The SLI is expected to remain constant independently of the number of records, per
What's the implication of this? The scheduler has finite throughput. Nodes have finite bandwidth.
If I start a 5-pod headless service, it's reasonable to expect that DNS for the 5th pod lands very soon after DNS for the 1st:
t0: scale RS to 5
t1: RS controller creates pod 1
t2: scheduler schedules pod 1
t3: kubelet downloads image
t4: kubelet runs pod
t5: runtime assigns an IP
t6: kubelet reports the IP
t7: endpointslice controller observes IP and updates EPSlices
t8: DNS observes EPSlices and updates DNS
t1 through t8 happen 5 times, roughly concurrently, and the total is likely bounded by image download time.
Change that to 5000 and now you are bounded by the scheduler's throughput. Is DNS not allowed to publish the 1st IP until the last pod is started? How does it know which one is last?
Edit:
Or did you mean something like "The time between pod-started-and-IP-assigned and available-in-DNS should not be significantly different for the 1st vs. last pod"? That must be what you meant...
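(For concreteness, a minimal sketch of how that per-pod publish latency could be probed from a client; the service name `workers.default.svc.cluster.local`, the pod IP, and the polling interval are all hypothetical and not part of the proposal.)

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// pollUntilPublished polls a headless service's DNS name until wantIP shows
// up among its A/AAAA records, and returns how long publication took.
func pollUntilPublished(host, wantIP string, interval time.Duration) time.Duration {
	start := time.Now()
	for {
		ips, err := net.LookupHost(host) // resolves all A/AAAA records for host
		if err == nil {
			for _, ip := range ips {
				if ip == wantIP {
					return time.Since(start)
				}
			}
		}
		time.Sleep(interval)
	}
}

func main() {
	// Measure "pod-has-IP" to "IP visible in DNS" for one pod; under the
	// clarified SLI wording, running this for the 1st and the 5000th pod of
	// the same headless service should yield comparable numbers.
	lat := pollUntilPublished("workers.default.svc.cluster.local", "10.0.0.42", 250*time.Millisecond)
	fmt.Println("DNS publish latency:", lat)
}
```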
"The time between pod-started-and-IP-assigned and availble-in-DNS should not be significantly different for the 1st vs. last pod" ? That must be what you meant...
this, if you give me the right sentence in english so this is more clear please add it as a suggestion
Co-authored-by: Tim Hockin <thockin@google.com>
LGTM modulo the nit that is failing presubmit
@@ -37,6 +39,12 @@ The reason for doing it this way is feasibility for efficiently computing that:
in 99% of programmers (e.g. iptables). That requires tracking metrics on
per-change base (which we can't do efficiently).

- The SLI for DNS publishing should remain constant independent of the number of records.
For example, in a headless service with thousands of pods the time between the pod being
assigned an IP and the time DNS makes that IP availabe in the service's A/AAAA record(s)
nit: available
[but it's failing presubmit]
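(As an aside for readers new to the term: a headless service is one whose ClusterIP is "None", so DNS answers with one A/AAAA record per ready endpoint rather than a single virtual IP. A minimal sketch using the client-go API types, with purely illustrative names:)

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// Headless: ClusterIP "None" means no virtual IP is allocated, so the
	// DNS record set for workers.default.svc grows with the number of pods
	// matching the selector.
	svc := corev1.Service{
		ObjectMeta: metav1.ObjectMeta{Name: "workers", Namespace: "default"},
		Spec: corev1.ServiceSpec{
			ClusterIP: corev1.ClusterIPNone,
			Selector:  map[string]string{"app": "workers"},
			Ports:     []corev1.ServicePort{{Name: "http", Port: 80}},
		},
	}
	fmt.Printf("headless=%v selector=%v\n", svc.Spec.ClusterIP == corev1.ClusterIPNone, svc.Spec.Selector)
}
```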
/lgtm
/approve
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: aojea, thockin, wojtek-t. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Update the SLI to reflect the DNS latency expectations for headless services, which have a high impact on AI/ML workloads that make heavy use of headless services and DNS. See kubeflow/mpi-operator#611 (comment).