Add partial support for IPv6 only mode #4450
Conversation
Hmm. I'm not sure why CI failed but it does not look related.
Does this work? Last I heard the in-cluster apiserver endpoint was ipv4 only so you can't actually run a cluster with only ipv6 service cidrs.
Works at which level, you mean? When started without Flannel, an error message constantly gets printed to the console and kubectl says that the node is in "NotReady" state, but the apiserver is up to the level that it responds to commands. Here is an example showing that the apiserver is really listening on IPv6:

```
$ curl -k https://[::1]:6443
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {
},
"status": "Failure",
"message": "Unauthorized",
"reason": "Unauthorized",
"code": 401
}
```

Also, the configuration I'm currently looking for is one where connections inside the cluster, and between pods in a multi-cluster setup, are IPv6 only, but most likely there will anyway be a load balancer exposing services to IPv4 clients as well. In theory my custom version of RKE2 https://github.com/olljanat/rke2/releases/tag/v1.22.3-olljanat1 should now support this too, but I didn't have time to test it yet.
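For completeness, the same check can be repeated with credentials to show that the IPv6 endpoint also serves authenticated requests. This is only a sketch and assumes the default k3s data directory layout; paths may differ on a customized install:

```bash
# Assumes the default k3s data dir; adjust if k3s was started with a custom --data-dir
TLS_DIR=/var/lib/rancher/k3s/server/tls

# Same request as above over the IPv6 loopback, this time authenticating with the admin client certificate
curl -k \
  --cert "$TLS_DIR/client-admin.crt" \
  --key "$TLS_DIR/client-admin.key" \
  https://[::1]:6443/api/v1/nodes
```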
Hmm. I guess the issue you are referring to is that on an IPv6 cluster there is no port NAT at all, so services like Calico, CoreDNS, etc. need to be configured to use port 6443 instead of 443; otherwise they cannot connect to the apiserver. With Calico the solution is to set an environment variable. However, for those reasons this flag is proposed as experimental for now.
Yes, the apiserver certainly binds to ipv6 just fine, but the in-cluster Endpoint list for
I don't think flannel supports ipv6 only. Another CNI plugin would need to be used, or we should contribute that support to upstream flannel.
It looks to be that in the IPv4 world it is kube-proxy which does that listening.

Example of how to make it work

Here is an example of how to make the apiserver listen on port 443 in a way that works both inside and outside of containers, without kube-proxy needing to be in the middle of it.

Basic setup

```bash
# Start K3s with needed parameters:
k3s server --service-cidr=fd:1::/108 --cluster-cidr=fd:2::/64 --ipv6-only --flannel-backend=none --https-listen-port=443 --disable=metrics-server --disable-network-policy --disable traefik
# Remove the ip6tables rule added by kube-proxy
# NOTE! kube-proxy will add this back when you start pods, so you need to remove it again then
ip6tables -D KUBE-SERVICES -d fd:1::1/128 -p tcp -m comment --comment "default/kubernetes:https has no endpoints" -m tcp --dport 443 -j REJECT --reject-with icmp6-port-unreachable
# Add IPv6 node address to loopback adapter
ip addr add fd:1::1/64 dev lo
# Test that the apiserver responds over IPv6:
curl -k https://[fd:1::1]:443/api/v1/nodes/foo
```

CNI installation

Deploy Calico as described in their guide: https://docs.projectcalico.org/networking/ipv6#enable-ipv6-only

Test deployment

Deploy something; here is an example with an ASP.NET Core hello world application.
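The original manifest is not reproduced above; a minimal stand-in with hypothetical names, assuming the public .NET sample image, could look like this:

```bash
# Hypothetical stand-in for the hello world deployment; names and image are illustrative only
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-aspnet
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-aspnet
  template:
    metadata:
      labels:
        app: hello-aspnet
    spec:
      containers:
      - name: web
        image: mcr.microsoft.com/dotnet/samples:aspnetapp  # public ASP.NET Core sample image
        ports:
        - containerPort: 80   # the sample's listen port depends on the image version
---
apiVersion: v1
kind: Service
metadata:
  name: hello-aspnet
spec:
  selector:
    app: hello-aspnet
  ports:
  - port: 80
    targetPort: 80
EOF
```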
Test inside of pod

Start a shell inside the pod and see how things look like there:

```
$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fd12:a9f4:7ee5:7d08:75f2:ae9a:a7e3:4043 prefixlen 128 scopeid 0x0<global>
inet6 fe80::c9e:61ff:fe6b:f903 prefixlen 64 scopeid 0x20<link>
ether 0e:9e:61:6b:f9:03 txqueuelen 0 (Ethernet)
RX packets 9 bytes 906 (906.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 9 bytes 770 (770.0 B)
TX errors 0 dropped 1 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
$ ping kubernetes.default.svc
PING kubernetes.default.svc(kubernetes.default.svc.cluster.local (fd:1::1)) 56 data bytes
64 bytes from kubernetes.default.svc.cluster.local (fd:1::1): icmp_seq=1 ttl=64 time=0.041 ms
$ curl -k https://kubernetes.default.svc:443/api/v1/
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {
},
"status": "Failure",
"message": "Unauthorized",
"reason": "Unauthorized",
"code": 401
}
```
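From outside the pod, the same thing can be cross-checked by looking at the ClusterIP family of the default kubernetes Service (plain kubectl, nothing specific to this PR):

```bash
# The CLUSTER-IP column should show the IPv6 service address (fd:1::1 in this setup)
kubectl get svc kubernetes -n default -o wide
```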
Yes, and as described above kube-proxy is also problematic, which is why for example Cilium with a kube-proxy-free setup would be a good option: https://docs.cilium.io/en/v1.9/gettingstarted/kubeproxy-free/ However, those are outside the scope of this PR. This just aims to allow users to play with an IPv6-only config and find out where the limits are, without needing to build a custom version of K3s. Also, my long-term target is to get this working with RKE2, which is why it needs to be solved here first.
FYI. Marked this as draft as I figured out an intermediate solution: I can use a MutatingAdmissionWebhook to force services in non-default/non-system namespaces to use IPv6 only, as described in https://kubernetes.io/docs/concepts/services-networking/dual-stack/#services For the record, the problematic places in kube-proxy look to be:
Also, at least in theory it should be possible to use https://github.com/kubernetes-sigs/ip-masq-agent here to solve the problem at least partly. EDIT: I just noticed that the Calico documentation contains info on how to make the Kubernetes control plane operate on IPv6 only: https://docs.projectcalico.org/networking/ipv6-control-plane
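For reference, the per-Service approach from the dual-stack documentation linked above looks roughly like this; the Service name and selector are made up for illustration:

```bash
# Hypothetical single-stack IPv6 Service, following the upstream dual-stack documentation
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: my-ipv6-service
spec:
  ipFamilyPolicy: SingleStack
  ipFamilies:
  - IPv6
  selector:
    app: my-app
  ports:
  - port: 80
EOF
```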
Are you perhaps aware whether upstream kubernetes (e.g. the kube-proxy folks) is trying to add ipv6-only support?
I expected that to be the case based on https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/506-ipv6 but I'm not actually sure, as I'm still new to the Kubernetes world.
Thanks
FYI. It looks like on K8s v1.21 and v1.22 the dual-stack code is broken in a way that its feature gate needs to be disabled on IPv6-only nodes. However, it looks like v1.23 will promote it to stable, which is why the logic is changed so that as long
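As an illustration of the workaround mentioned at the start of this comment, the dual-stack feature gate can be switched off through k3s's component argument pass-through flags. This is only a sketch, assuming the IPv6DualStack gate name used by those Kubernetes versions:

```bash
# Sketch: explicitly disable the dual-stack feature gate on a K8s v1.21/v1.22 IPv6-only node
k3s server \
  --kube-apiserver-arg=feature-gates=IPv6DualStack=false \
  --kube-controller-manager-arg=feature-gates=IPv6DualStack=false \
  --kube-proxy-arg=feature-gates=IPv6DualStack=false \
  --kubelet-arg=feature-gates=IPv6DualStack=false
```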
Force-pushed from f1651b4 to 9307a80
@brandond @manuelbuil I managed to get this working at least to the level that I can use vcluster with loft-sh/vcluster#209 to spin up either IPv4-only or IPv6-only K3s clusters on top of a dual-stack host cluster (I use RKE2 as the host cluster). Kube-proxy also now creates the correct ip6tables rules for the Kubernetes API. However, I don't currently have a pure IPv6 lab, so that scenario might still need some work (at least it needs to be tested), but I would prefer to leave that for the next PR as the vcluster scenario is very useful.
How did you fix the kube-proxy issue?
The key thing with that one is that the default node-ip is IPv6, so this rule applies: https://github.com/kubernetes/kubernetes/blob/c1153d3353bd4f4b68d85245d53d2745586be474/cmd/kube-proxy/app/server_others.go#L178-L180 However, all those other bindings also need to use IPv6 instead of IPv4, otherwise other issues appear.
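A quick way to see which family kube-proxy ended up programming (not part of this PR, just a debugging aid) is to compare the rule counts in the two iptables backends:

```bash
# On an IPv6-only node the KUBE-* chains should be populated in ip6tables, not iptables
ip6tables-save | grep -c '^-A KUBE'
iptables-save  | grep -c '^-A KUBE'
```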
I have just realized that github.com still does not support ipv6, so I guess you did not download the binary using
I used these DNS servers on my IPv6 only lab to avoid that issue: https://www.nat64.net/
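For anyone reproducing this, pointing the node at a public DNS64/NAT64 resolver is just a resolv.conf change. The addresses below are placeholders from the documentation range; the actual resolver addresses are published at https://www.nat64.net/:

```bash
# Placeholder addresses only -- substitute the DNS64 resolvers listed at https://www.nat64.net/
cat > /etc/resolv.conf <<'EOF'
nameserver 2001:db8::64:1
nameserver 2001:db8::64:2
EOF
```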
Thanks for the PR. I'm actually thinking that creating a utils function, executing it at the beginning in pkg/cli/server/server.go, and saving the result in a variable could be very useful. You could save a lot of lines by just checking that value, which would tell you whether we are in an ipv4, dual-stack or ipv6 scenario.
BTW, why does the title say:
Because Flannel and the network policy controller need to be disabled and a separate CNI plugin has to be used. Also, I did most of the testing with vcluster, which disables a lot of other features too, so it might be that there are still some missing changes for those features: https://github.com/loft-sh/vcluster/blob/49258d6242885262fb8b6ee67f7c215f9a3ff67d/charts/k3s/values.yaml#L40-L49
I'll test it next week on AWS and see if it works. Hopefully, in the next weeks/months, we can provide ipv6 support to flannel and the kube-router network policy controller, let's see!
Codecov Report
@@ Coverage Diff @@
## master #4450 +/- ##
==========================================
- Coverage 12.03% 11.51% -0.53%
==========================================
Files 135 135
Lines 9179 9248 +69
==========================================
- Hits 1105 1065 -40
- Misses 7838 7960 +122
+ Partials 236 223 -13
I tested today on Ubuntu 20 on AWS using nat64 (server: https://www.nat64.net/) to avoid github and dockerhub not working well with ipv6. I deployed a control-plane node and an agent. control-plane config:
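The exact config files are not reproduced above; a purely hypothetical control-plane config covering the flags discussed in this PR, written to k3s's default config file location, might look roughly like this:

```bash
# Hypothetical sketch only -- not the exact config used in this test; all addresses are placeholders
mkdir -p /etc/rancher/k3s
cat > /etc/rancher/k3s/config.yaml <<'EOF'
node-ip: "2001:db8:1::10"            # placeholder IPv6 node address
cluster-cidr: "2001:db8:42::/56"     # placeholder pod CIDR
service-cidr: "2001:db8:43::/112"    # placeholder service CIDR
flannel-backend: "none"              # Flannel does not support IPv6 only yet
disable-network-policy: true
disable: "traefik"
EOF
```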
agent config:
Deployed calico via the tigera operator using the config:
I enabled ports in the security groups of AWS for ipv6 (e.g. 5473 for typha, 6443 for k3s...) and also disabled the source/destination check (required when there is no encapsulation).

TESTS
The 4 tests work! 🥳
Is there any chance this PR will soon be validated for a next release? I'd like to build a PoC on an IPv6-only k3s cluster. It would be great if #284 could be closed.
Automatically switch to IPv6 only mode if first node-ip is IPv6 address
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
Force-pushed from 9307a80 to 966f4d6
@j-landru thanks for the reminder. It has been waiting on my todo list for a long time. @manuelbuil, updated now; hopefully it covers all the changes you requested. Note that I don't have an IPv6-only lab currently, so I was not able to test this any more, but I did my best to make sure the logic works the same as earlier; I just moved most of the code to utils.
Code looks good! I'll try to test it again as soon as I find some time (Monday at latest). Note that we are already on code freeze, so this feature will need to be part of the March release.
I tested ipv6-only, dual-stack and ipv4-only and things work! Of course using the nat64 as explained above and Cilium as the CNI plugin :). +1 from my side, but note that this would need to be merged in the March release of k3s.
Proposed Changes
Upstream IPv6 support work is proceeding along two different tracks.
Dual-stack was recently moved to stable (kubernetes/enhancements#2962), which was handled here by #3212.
In addition, IPv6 only has been in beta for a couple of years already: kubernetes/enhancements#1138 and https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/506-ipv6
This PR adds an experimental flag --ipv6-only which lets the user bypass the IPv4 cluster-cidr and service-cidr requirement added by #3212.

This PR adds new logic which switches to IPv6 only mode automatically when (example invocations below):
a) host does not have IPv4 address
b) only IPv6 service CIDR is given
c) only IPv6 host-ip is given
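As an illustration, here is a sketch of the three triggers using the flags shown elsewhere in this conversation; the addresses are placeholders from the IPv6 documentation range:

```bash
# b) only an IPv6 service CIDR is given
k3s server --service-cidr=2001:db8:43::/112 --cluster-cidr=2001:db8:42::/56 \
  --flannel-backend=none --disable-network-policy

# c) only an IPv6 host-ip is given
k3s server --node-ip=2001:db8:1::10 --flannel-backend=none --disable-network-policy

# a) a host without any IPv4 address needs no extra flags; the switch happens automatically
```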
Types of Changes
Verification
Linked Issues
#284 #2123 #4389
User-Facing Change
Further Comments
Currently neither Flannel nor the network policy controller supports IPv6, so those are forced to be disabled.
The focus of this PR is to add enough logic to allow IPv6 only to be used with vcluster, after support is added on that side in loft-sh/vcluster#209.