Return existing reservation if podRef matched - rebase #383
base: master
Conversation
Force-pushed from 9163494 to 005481a.
Commit: "in case allocation already found for pod in this range we will return it, avoiding IP leak" (Co-authored-by: Arjun Baindur <15885540+xagent003@users.noreply.github.com>)
Force-pushed from 005481a to 00ebc0e.
@maiqueb this is the rebase we talked about; please help with the review as well.
Would you elaborate on this? Why wasn't this needed before, but is now?
Because after this patch, if a podRef already has an allocation in the pool, the existing allocation is returned; the test cases always passed the same podRef, so the same allocation was always returned. For example, the whereabouts_test.go test case "allocates and releases addresses on ADD/DEL with a Kubernetes backend" wants to allocate all IPs in the range, but it was passing the same podRef, so it only allocated one IP and the test failed. I updated the test case to use a unique podRef for each allocation so that the full range gets allocated. Also note that I tested in an EKS cluster and everything worked as expected.
One question: when this behavior happens, is it essentially the same pod or a different one? What is being re-created: the pod (since it is a StatefulSet, the "new" pod will have the same name) or the container in the pod? Are the UIDs the same? Assuming it is the pod that is re-created, I am worried because this feature seems to be implementing persistent IPs for StatefulSets. I am currently working on something similar, but I need it for other workload types (KubeVirt virtual machines). I'm trying to implement this feature in a generic way, allowing it to work for different kinds of workloads (StatefulSets, virtual machines, etc.). FWIW, I've looked at the code, and the solution looks good. @dougbtv, could you chime in from your perspective?
It's a new pod with the same name; we missed the CMD DEL because the node where the original pod was running crashed. So yes, in a sense it's a persistent IP for the StatefulSet, as long as no CMD DEL arrives for it. How I tested: on AWS EKS with Auto Scaling Groups, I crashed the EC2 instance and waited for the ASG to delete the crashed instance and create a new one, so a fresh instance was chosen to run the stateful pod; since the pod has the same name, we allocate the same IP as before. Previously in this case a new IP was allocated, so the "same" pod had two allocations but was only using one, and since the pod was running, the ip-reconciler didn't pick up this scenario, so the original IP was leaked. Do this enough times and you run out of allocations.
This is a rebase of #296: return the existing reservation when an ADD comes from a pod that already has one, thus avoiding an IP leak. The issue is reproducible when a StatefulSet pod is recovered from a failed node where the DEL was missed.
Fixes #291
Special notes for your reviewer:
During testing we ran into kubernetes/kubernetes#70674, so we had to update ipam.go to handle the resourceVersion metadata of the overlappingrangeipreservations object.
We needed to update whereabouts_test to use unique pod names in some of the test cases so that the test suite completes successfully.