"[Feature] KIND support for external network emulation: latency, bandwidth constraint, packet drops" #1780
Comments
Discussed with @mauilion and @BenTheElder earlier on Slack:
Also, since we want to be able to find flakes manually without relying on CI, this would enable more hypothesis-driven testing that can be run locally over time. More people would likely be able to find problematic k8s tests locally, without relying on sporadic cloud latencies that produce isolated, hard-to-reproduce data points. |
I really love this idea, Jay; let's discuss it further. I think it is better to apply the traffic control "outside" of the cluster. The implementation seems "simple": there is a lot of literature on how to find the external veth interface of a container, e.g. https://github.com/cslev/find_veth_docker. Then, once the cluster finishes installing (to avoid causing issues with the bootstrap), we list the external interfaces of the nodes belonging to the cluster and apply the corresponding tc commands.
This will work for any provider: docker, podman, ... For the API, I suggest keeping it in its own block, since traffic control also allows controlling other interesting parameters like bandwidth and traffic drops. Something like:
networking:
  netem:
    delay: 100ms
    bandwidth:
      rate: 1M
      burst: 25k
    loss: 10% |
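As a rough illustration of how a config block like the one proposed above might map onto tc, here is a minimal Python sketch. It only builds argument lists and executes nothing. The function name `netem_commands`, the default `tbf_latency`, and the choice to chain a `tbf` qdisc under `netem` for the bandwidth limit are assumptions made for this example, not something the thread specifies; the rate and burst strings are passed through unchanged, so they must already use units that tc accepts (e.g. `1mbit`, `25kb`).

```python
from typing import List, Optional


def netem_commands(
    iface: str,
    delay: Optional[str] = None,
    loss: Optional[str] = None,
    rate: Optional[str] = None,
    burst: Optional[str] = None,
    tbf_latency: str = "400ms",  # assumed default; tbf needs a latency or a limit
) -> List[List[str]]:
    """Build tc argument lists for one host-side interface (nothing is run)."""
    commands: List[List[str]] = []

    # Delay and loss are handled by a netem qdisc at the root of the interface.
    has_netem = bool(delay or loss)
    if has_netem:
        netem = ["tc", "qdisc", "add", "dev", iface, "root", "handle", "1:", "netem"]
        if delay:
            netem += ["delay", delay]
        if loss:
            netem += ["loss", loss]
        commands.append(netem)

    # Bandwidth is handled by a token bucket filter, attached below netem
    # when netem is present, otherwise directly at the root.
    if rate:
        if not burst:
            raise ValueError("tbf needs a burst size when a rate is given")
        attach = ["parent", "1:1"] if has_netem else ["root"]
        commands.append(
            ["tc", "qdisc", "add", "dev", iface]
            + attach
            + ["handle", "10:", "tbf", "rate", rate, "burst", burst,
               "latency", tbf_latency]
        )
    return commands


if __name__ == "__main__":
    # "veth1234" is a hypothetical host-side interface name.
    for cmd in netem_commands("veth1234", delay="100ms", loss="10%",
                              rate="1mbit", burst="25kb"):
        print(" ".join(cmd))
```

Each returned list could be handed to `subprocess.run` as root on the host once the cluster has finished bootstrapping, one call per node interface.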
/retitle "[Feature] KIND support for external network emulation: latency, bandwidth constraint, packet drops" |
/assign @BenTheElder |
Yayyyyyy |
I've been playing with this, and it can be done simply with a bash script; please take a look: https://gist.github.com/aojea/603e88ba709aac874bd7611752261772 I think it is better to get feedback first and then, if there is demand, include it in KIND. |
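The gist is not reproduced here. As a hedged sketch of the veth lookup step mentioned earlier in the thread, the following Python shows one common technique: read `iflink` for `eth0` inside the node container, then find the host interface whose `ifindex` matches. The helper names `container_iflink` and `find_host_veth` are made up for this example, and it assumes the container runtime CLI supports `exec` (as docker and podman do).

```python
import subprocess
from pathlib import Path
from typing import Optional


def container_iflink(node: str, runtime: str = "docker") -> int:
    """Return the peer interface index of eth0 inside a node container."""
    result = subprocess.run(
        [runtime, "exec", node, "cat", "/sys/class/net/eth0/iflink"],
        check=True, capture_output=True, text=True,
    )
    return int(result.stdout.strip())


def find_host_veth(peer_ifindex: int,
                   sysfs_net: Path = Path("/sys/class/net")) -> Optional[str]:
    """Find the host interface whose ifindex equals the container's iflink."""
    for dev in sorted(sysfs_net.iterdir()):
        try:
            if int((dev / "ifindex").read_text().strip()) == peer_ifindex:
                return dev.name
        except (OSError, ValueError):
            # Skip entries that are not interfaces or cannot be read.
            continue
    return None
```

For a node container named, say, `kind-control-plane`, `find_host_veth(container_iflink("kind-control-plane"))` would give the host-side interface to pass to tc.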
Thanks @aojea! How would you suggest publishing and standardizing this script if we don't do it in kind? I guess we could put it in a personal GitHub repo as a kind recipe of some sort (I have many of my own: https://github.com/jayunit100/k8sprototypes/tree/master/kind ); maybe we could join forces or something. I think that if it goes into kind as an option, we'll get a lot of great opportunities for newcomers to help simulate CI on their laptops, so it could be a big boost to the test-infra folks long term if we build some community around it. |
Hello, I am planning to integrate kind with a network emulator, Mininet in particular. Thanks to @aojea for implementing traffic control, but I may need more than this: I want to set up a network topology like that of a datacenter (e.g. a fat-tree topology) and run various network routing protocols. I have a basic understanding of how kind is implemented, and I have two plans for the implementation. I am wondering if you could help me decide between the two. The basic goal is to replace kind's user-bridged networking with Mininet (actually I use Containernet, which I've modified to be able to connect running containers by placing a veth into the running container's namespace). Now I am working on the kind side. The two plans are:
Any feedback would be very appreciated! |
You have more details in this link about how to set up complex scenarios with KIND: https://gist.github.com/aojea/00bca6390f5f67c0a30db6acacf3ea91 I suggest you start with 1): start small and iterate; you can always move from 1) to 2) later ... I wouldn't rule out that, as you become more familiar with the environment and its problems, you'll find new options ;) |
Thank you @aojea!! For 1), could you please provide some hints on which files I need to modify? I found things like this discussion; does this look complete? I also found a note here saying that the certificate is signed for the old IP address... Do I need to redo the cert generation? ... |
You have a detailed description of the modifications needed here |
@aojea I followed the steps to modify the IP. I am struggling with one step: "When creating the cluster we must add the loopback IP address of the control plane to the certificate SAN (the apiserver binds to "all-interfaces" by default)". Could you please advise how to modify Kind's configuration to modify the certificate SAN (I can't quite find which func in Kind is related to the certificate SAN)? Thank you! |
it has to be patched in kubeadm, in the kind config, replace
I really don't know whether it is possible to modify it after the installation, or how to do it
Thank you @aojea. The following somehow did not work for me, so I finally chose "plan 3": just delete eth0 in the kind container and put a veth into the namespace, assigning it the same IP address. It seems to work now :)
|
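The commands used for "plan 3" are not shown in the thread. As a speculative sketch of what swapping a node's eth0 for a new veth end could look like, the Python below builds `ip` and `nsenter` argument lists without running them. The function name, the interface names, and the addresses in the usage note are all hypothetical; the node PID would be the container's init process as seen from the host.

```python
from typing import List


def replace_eth0_commands(
    node_pid: int,
    host_end: str,            # hypothetical host-side veth name
    node_ip_cidr: str,        # the address eth0 had before, e.g. "172.18.0.2/16"
    gateway: str,             # assumed default gateway for the node
    peer_name: str = "kindtmp0",  # temporary name; must fit in 15 characters
) -> List[List[str]]:
    """Build commands that replace a node's eth0 with one end of a new veth."""
    pid = str(node_pid)
    in_ns = ["nsenter", "-t", pid, "-n"]  # run inside the node's network namespace
    return [
        # Create the veth pair on the host and bring the host end up.
        ["ip", "link", "add", host_end, "type", "veth", "peer", "name", peer_name],
        ["ip", "link", "set", host_end, "up"],
        # Move the peer into the node's namespace.
        ["ip", "link", "set", peer_name, "netns", pid],
        # Drop the original eth0, then let the new end take its name and address.
        in_ns + ["ip", "link", "del", "eth0"],
        in_ns + ["ip", "link", "set", peer_name, "name", "eth0"],
        in_ns + ["ip", "addr", "add", node_ip_cidr, "dev", "eth0"],
        in_ns + ["ip", "link", "set", "eth0", "up"],
        in_ns + ["ip", "route", "add", "default", "via", gateway],
    ]
```

For example, `replace_eth0_commands(4242, "kind-veth0", "172.18.0.2/16", "172.18.0.1")` yields eight commands to run as root on the host; the host end would then be attached to whatever switch the emulator provides.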
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
I will let @aojea decide if we try to wire this into kindnetd etc or leave it to external extension. |
I demoed this at KubeCon EU 2021 and created a plugin using the kind API. If there is more traction, we can consider moving the feature to the core, but I don't have the feeling this should be part of it right now. /close |
@aojea: Closing this issue. In response to this:
A typical kind cluster has pretty stable networking, locally:
Whereas a real-world cluster (for example, a high-performance VMC cluster running on EC2 hardware) has a very different performance profile...
It would be nice to be able to disrupt the network bandwidth and throughput on kind clusters so that they match those of clouds. In especially congested clouds, you can even see iperf values that are 10x lower than this at peak times (I don't have an example on hand, but if someone can run iperf in a GCE cluster with 20 parallel conformance tests running, I bet you'll be able to see this).
What would you like to be added:
Why is this needed:
Kind is increasingly used to simulate realistic clusters.
stress and so on are commonly used: https://kubernetes.slack.com/archives/CN0K3TE2C/p1597351797024800