Use GC rather than refcounting for VNID policy rules #14560
Conversation
[test][testextended][extended:networking]
Code looks good
Force-pushed from 16cb244 to cc57200.
Good stuff Dan. Minor comments.
pkg/sdn/plugin/multitenant.go (outdated)

	otx := mp.node.oc.NewTransaction()
-	otx.DeleteFlows("table=80, reg0=%d, reg1=%d", vnid, vnid)
+	otx.DeleteFlows("table=80, reg1=%d", vnid)
Why did you loosen this up? (I'm not really concerned by the change, I just want to understand the reason). Will it make it harder to match this up with the AddFlow that we do? I assume it corresponds with the one on the new line 138.
This function is supposed to be removing all flows that locally deliver traffic with a certain VNID. Any flow with "reg1=${vnid}" meets that definition. As it happens, multitenant doesn't add any reg1 rules that don't also check reg0, but maybe it would in the future. (I was also trying to merge multitenant:RemoveVNIDFlows() and networkpolicy:RemoveVNIDFlows() together, but that doesn't quite work because they still need to update their internal "vnid is in use" state. But maybe the OVS parts of them could be abstracted out later.)
Ok, cool. Thanks
pkg/sdn/plugin/ovscontroller.go (outdated)

	}

	// A VNID is checked by policy if there is a table 80 rule comparing reg1 to it.
	if field, exists := parsed.FindField("reg1"); exists {
Shouldn't we be checking the table here? Or is there a reason not to?
Yes, we should
Force-pushed from cc57200 to 61c99e8.
flake #14573, [test]
LGTM, thanks Dan
Flaked on #14573 [testextended][extended:networking]
[test]
flake #14597 [test] although it's probably just going to fail again
Generally LGTM, just one question.
pkg/sdn/plugin/node.go (outdated)

			}
		}
	}

	go kwait.Forever(node.syncVNIDRules, 24*time.Hour)
So this will sync once a day; seems like on a large cluster we could build up quite a list of stale VNID rules. Should we do it like once an hour or 3 or something instead?
Well, it's not clear to me how fast stale rules would build up, or how much of a problem they actually are (in terms of OVS performance). So it's hard to say whether once a day is too infrequent. But yeah, that was basically "random number". Anyone else want to vote?
I guess I could make it log how many stale entries it removes and then we can see what different clusters end up doing and adjust it in future releases.
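A minimal sketch of that idea, combined with the hourly period proposed below; `findStaleVNIDs` and `deleteVNIDFlows` are hypothetical placeholders, not the PR's actual functions:

```go
// Hypothetical sketch: run the VNID rule GC hourly and log how many
// stale rule sets it removed, so we can see what different clusters
// end up doing before tuning the period.
go kwait.Forever(func() {
	stale := node.findStaleVNIDs() // placeholder: scan OVS flows for unused VNIDs
	for _, vnid := range stale {
		node.deleteVNIDFlows(vnid) // placeholder: DeleteFlows for that VNID
	}
	glog.Infof("VNID rule GC removed %d stale rule sets", len(stale))
}, time.Hour)
```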
Or we could trigger a GC every N "adds"...
Can leaving those stale rules active be a security problem, allowing pods of different namespaces to connect to each other?
Hm... no... the fact that they're "stale" means there are no pods with those VNIDs on this node, so the rules will never match.
And if the NetNamespace was deleted and a new NetNamespace was created with the same VNID, the old rules would get deleted when the NetNamespace was deleted, so that couldn't be a problem either.
I vote for hourly, with a log saying how many were deleted. We can add the "after N" later if we decide it makes sense.
What happens if I create a pod with IP 10.1.11.17 in namespace A and VNID 2000, and then delete the pod? After that I start making pods in namespace B and VNID 3000 until the IP 10.1.11.17 is reused.
As I understand it, at that moment there are 2 sets of rules in the system: one for 10.1.11.17 in VNID 2000 and one for 10.1.11.17 in VNID 3000. How does that affect packets from or to address 10.1.11.17? Does it break the boundaries between namespaces A and B?
No. Note that pod/VNID rules are very different in 1.5 than in earlier releases.
The rules referring to IP addresses aren't the ones that are being left behind. The stale rules only refer to VNIDs. So in your example, after creating the first pod, there will be two rules, one that says "if the destination IP is 10.1.11.17 then the destination VNID is 2000" and one that says "if the source VNID is 2000 and the destination VNID is 2000 then allow the traffic". After deleting the pod, the first rule will be deleted but the second will not. If there are no longer any pods with VNID 2000 on the node, then the VNID rule is now useless, but it's also harmless.
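To make the two rule kinds concrete, here is a fragment-style sketch in the transaction API quoted earlier in the thread; the table numbers, priorities, and actions are assumptions for illustration, not the plugin's exact flows:

```go
otx := mp.node.oc.NewTransaction()
// Per-pod rule (deleted along with the pod): destination IP 10.1.11.17
// means destination VNID 2000. Table and actions are assumed.
otx.AddFlow("table=70, ip, nw_dst=%s, actions=load:%d->NXM_NX_REG1[], goto_table:80", "10.1.11.17", 2000)
// Per-VNID policy rule (the kind that can go stale): source VNID 2000
// to destination VNID 2000 is allowed.
otx.AddFlow("table=80, priority=100, reg0=%d, reg1=%d, actions=output:NXM_NX_REG2[]", 2000, 2000)
_ = otx.EndTransaction() // commit; error handling elided
```

After the pod is deleted, only the first rule goes away; the second lingers until the GC notices no local pods use VNID 2000.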
Will this be backported to 1.5? We're currently stuck at 1.4 with our Origin clusters :)
I was thinking about this in the middle of the night: isn't there a problem where we grab the list of rules and start to process them? If there was a stale entry for VNID X, but before we get around to deleting it a new pod in VNID X starts, won't we race and delete the now-needed X rule?
Force-pushed from 85bcff9 to 8ee380e.
@knobunc yes, clearly that's right. Not sure how I convinced myself there was no race... Reorganized the code a bit so that the entire "sync" process now occurs under the multiTenantPlugin/networkPolicyPlugin lock. So now either (a) SyncVNIDRules will do all of its DeleteFlows before EnsureVNIDRules does its AddFlow, so the flow will get deleted but then re-added, or (b) EnsureVNIDRules will do its AddFlow before SyncVNIDRules calls DumpFlows, so the flow will not be seen as stale. (But of course you should double-check my logic since I got it wrong last time...)
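A rough sketch of that ordering argument, assuming a single plugin-level mutex; the field and method bodies here are illustrative, not the PR's exact code:

```go
// Both operations hold the plugin lock for their full duration, so
// exactly one of the two orderings described above applies.
func (mp *multiTenantPlugin) EnsureVNIDRules(vnid uint32) {
	mp.lock.Lock()
	defer mp.lock.Unlock()
	// ... AddFlow for vnid, and record vnid as in use ...
}

func (mp *multiTenantPlugin) SyncVNIDRules() {
	mp.lock.Lock()
	defer mp.lock.Unlock()
	// DumpFlows and all DeleteFlows happen under the lock: a concurrent
	// EnsureVNIDRules either runs entirely after this (re-adding any
	// flow deleted here) or entirely before (so its VNID is already
	// marked in use and its flow is not treated as stale).
	// ... DumpFlows, compute stale VNIDs, DeleteFlows ...
}
```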
The locking looks right to me.
LGTM
[merge]
continuous-integration/openshift-jenkins/merge Waiting: You are in the build queue at position: 32
Evaluated for origin merge up to 8ee380e
test flake #14176
Force-pushed from 8ee380e to 5bbe41e.
Evaluated for origin testextended up to 5bbe41e
Evaluated for origin test up to 5bbe41e
continuous-integration/openshift-jenkins/testextended SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin_extended/628/) (Base Commit: ee9f436) (Extended Tests: networking)
continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/2232/) (Base Commit: ee9f436)
Clean test but some concerns from eng about the correctness. We also have a 3.5 backport being screamed for. Merging by hand to get maximum QA time.
Any update on this being backported to 1.5?
Filed #14801
We currently count how many pods/services there are of each VNID on each node so we can delete VNID-related rules when they are no longer needed. But it's hard to guarantee our counts stay in sync, especially in the face of pod creation errors and the like. It would be safer to just periodically scan the OVS flows and remove any that are no longer being used.
(At the moment, we also don't bother to initialize the count correctly on restart if there are existing pods, so just restarting, creating a pod, and then deleting it can cause the VNID rules to get deleted. I'm not sure if this is the problem the reporters are seeing or not.)
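In outline, the periodic scan could look like the following; `parseFlow`, `fieldVNID`, and `vnidInUse` are hypothetical placeholders, while `DumpFlows`, `FindField`, and the table 80 `DeleteFlows` pattern come from the snippets quoted above:

```go
// Hedged outline of the GC pass; placeholder helpers, not the PR's code.
func (node *OsdnNode) syncVNIDRules() {
	flows, err := node.oc.DumpFlows()
	if err != nil {
		return
	}
	stale := map[uint32]bool{}
	for _, flow := range flows {
		parsed := parseFlow(flow) // hypothetical flow parser
		// A VNID is policy-checked if a table 80 rule compares reg1 to it.
		if parsed.table != 80 {
			continue
		}
		if field, exists := parsed.FindField("reg1"); exists {
			if vnid := fieldVNID(field); !node.vnidInUse(vnid) {
				stale[vnid] = true
			}
		}
	}
	otx := node.oc.NewTransaction()
	for vnid := range stale {
		otx.DeleteFlows("table=80, reg1=%d", vnid)
	}
	_ = otx.EndTransaction() // commit; error handling elided
}
```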
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1454948
Fixes #14092
@openshift/networking PTAL