
ipvs problem #790

Closed
yannick opened this issue Jan 4, 2020 · 4 comments


yannick commented Jan 4, 2020

I'm running kops with kube-proxy in IPVS mode.
After some debugging I found out that pod-to-pod communication is faulty, at least for bigger TCP packets, which leads to all sorts of funny effects.
E.g. getting just a "not found" HTTP/1.1 reply works, but sending any file bigger than somewhere between 1.2 and 1.5 kB doesn't work.
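(A size threshold around 1.2–1.5 kB is suggestive of an MTU/fragmentation problem. A quick way to probe the effective path MTU between two pods is a don't-fragment ping; the pod IP below is a placeholder, and this assumes a `ping` binary is available inside the container:)

```shell
# Probe the path MTU toward another pod (10.0.1.23 is a placeholder pod IP).
# -M do sets the Don't Fragment bit; -s is the ICMP payload size, so a
# 1472-byte payload plus 28 bytes of IP/ICMP headers makes a 1500-byte packet.
ping -c 3 -M do -s 1472 10.0.1.23

# If that fails but a smaller payload succeeds, packets above that size are
# being dropped or black-holed somewhere on the path:
ping -c 3 -M do -s 1372 10.0.1.23
```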

Should an IPVS setup actually work?

AWS_VPC_K8S_CNI_EXTERNALSNAT=false seems to make it work, so do I need to add my VPC to AWS_VPC_K8S_CNI_EXCLUDE_SNAT_CIDRS?
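(For reference, both of these are environment variables on the `aws-node` DaemonSet. A sketch of how they are typically set, with variable names taken from the amazon-vpc-cni-k8s docs and a placeholder CIDR value:)

```shell
# Per the amazon-vpc-cni-k8s docs, setting EXTERNALSNAT=true tells the CNI
# that an external SNAT device (e.g. a NAT gateway) is in use, so the
# node-level SNAT iptables rules are removed:
kubectl -n kube-system set env daemonset aws-node AWS_VPC_K8S_CNI_EXTERNALSNAT=true

# Alternatively, keep node-level SNAT but skip it for specific destination
# CIDRs, e.g. a peered VPC (10.1.0.0/16 is a placeholder):
kubectl -n kube-system set env daemonset aws-node AWS_VPC_K8S_CNI_EXCLUDE_SNAT_CIDRS=10.1.0.0/16
```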

I could send a full aws-cni-support.tar.gz if needed.

Additionally, load balancers of type NLB seem not to work.


yannick commented Jan 6, 2020

I made NLBs work by disabling IPVS. However, when externalTrafficPolicy: Local is set it doesn't work (is that a leftover from #75?), independent of AWS_VPC_K8S_CNI_EXTERNALSNAT.
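(For context, the setup being described looks roughly like the Service below. This is a hypothetical manifest, with placeholder names and ports; `externalTrafficPolicy: Local` makes kube-proxy route only to pods on the receiving node, which preserves the client source IP but means nodes without a ready pod fail the NLB health check:)

```shell
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: my-ingress                 # placeholder name
  annotations:
    # Ask the in-tree AWS cloud provider for an NLB instead of a classic ELB:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local     # only route to local, ready pods
  selector:
    app: my-ingress                # placeholder selector
  ports:
    - port: 80
      targetPort: 8080
EOF
```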


mogren commented Jan 10, 2020

Hi @yannick, thanks for reporting the issue. You are right that the CNI plugin doesn't really work with IPVS, but there is another one that should, over at lyft/cni-ipvlan-vpc-k8s. Do you run a lot of services, and is that the reason you need IPVS?

AWS_VPC_K8S_CNI_EXTERNALSNAT=false disables SNAT and the use of the connmark, so I think you could be correct that it's related to #75 in some way, but I don't know how yet.

AWS_VPC_K8S_CNI_EXCLUDE_SNAT_CIDRS kind of does the same thing as the EXTERNALSNAT env variable, except that SNAT is disabled for just the CIDR ranges in that list (and the VPC CIDRs, of course), instead of disabling all SNATing. The main reason to set it is to not SNAT traffic that goes to peered VPCs.


yannick commented Jan 11, 2020

Hi Claes, and thanks for the insights.

> Do you run a lot of services and is that the reason you need ipvs?

Not yet, but I have the urge to settle on the most performant solution. But since I naively assumed it works, I wasted a few hours. I'll try to do a PR for the docs once I understand it better; unfortunately I have not really looked at the CNI API yet.
I now have a working setup (terraform+kops+nlb->local+ambassador+cert-manager+fluxcd+ecr) with working routing to e.g. Kafka/RDS/Elasticsearch in another VPC. I'll try to publish it soon.

IMO the "classic" (pre-k8s) setup, now probably best implemented as NLB+health checks -> NodePort -> LB/Ing -> (mostly the local) pod, is still very powerful but surprisingly tricky to set up.
Of course it implies a few things and works only at certain scales.
One of the problems I saw is that the NLB targets all nodes, even if the LB/router only runs on a subset of nodes. I wonder if that could be targeted through labels/taints on the nodes. Then again, a few ever-failing health checks are no drama.

yannick closed this as completed Jan 11, 2020

mogren commented Jan 11, 2020

Thanks @yannick, that seems like a nice setup.

If you use ALB for ingress, we just merged kubernetes-sigs/aws-load-balancer-controller#1088 yesterday to enable weighted load balancing, so that should be available in the next release.
