Use GC rather than refcounting for VNID policy rules #14560

danwinship · 2017-06-09T20:26:23Z

We currently count how many pods/services there are of each VNID on each node so we can delete VNID-related rules when they are no longer needed. But it's hard to guarantee our counts stay in sync, especially in the face of pod creation errors and the like. It would be safer to just periodically scan the OVS flows and remove any that are no longer being used.

(At the moment, we also don't bother to initialize the count correctly on restart if there are existing pods, so just restarting, creating a pod, and then deleting it can cause the VNID rules to get deleted. I'm not sure if this is the problem the reporters are seeing or not.)

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1454948
Fixes #14092

@openshift/networking PTAL

danwinship · 2017-06-09T20:26:38Z

[test][testextended][extended:networking]

imcsk8 · 2017-06-10T02:05:19Z

Code looks good

knobunc

Good stuff Dan. Minor comments.

knobunc · 2017-06-12T13:46:36Z

pkg/sdn/plugin/multitenant.go


 otx := mp.node.oc.NewTransaction()
- otx.DeleteFlows("table=80, reg0=%d, reg1=%d", vnid, vnid)
+ otx.DeleteFlows("table=80, reg1=%d", vnid)


Why did you loosen this up? (I'm not really concerned by the change, I just want to understand the reason). Will it make it harder to match this up with the AddFlow that we do? I assume it corresponds with the one on the new line 138.

This function is supposed to be removing all flows that locally deliver traffic with a certain VNID. Any flow with "reg1=${vnid}" meets that definition. As it happens, multitenant doesn't add any reg1 rules that don't also check reg0, but maybe it would in the future. (I was also trying to merge multitenant:RemoveVNIDFlows() and networkpolicy:RemoveVNIDFlows() together, but that doesn't quite work because they still need to update their internal "vnid is in use" state. But maybe the OVS parts of them could be abstracted out later.)

Ok, cool. Thanks

knobunc · 2017-06-12T13:58:41Z

pkg/sdn/plugin/ovscontroller.go

+ }
+
+ // A VNID is checked by policy if there is a table 80 rule comparing reg1 to it.
+ if field, exists := parsed.FindField("reg1"); exists {


Shouldn't we be checking the table here? Or is there a reason not to?

Yes, we should

danwinship · 2017-06-12T17:14:56Z

flake #14573, [test]

knobunc

LGTM, thanks Dan

knobunc · 2017-06-12T17:18:53Z

Flaked on #14573 [testextended][extended:networking]

knobunc · 2017-06-12T17:52:31Z

[test]

danwinship · 2017-06-12T17:54:57Z

flake #14597 [test] although it's probably just going to fail again

dcbw

Generally LGTM, just one question.

dcbw · 2017-06-12T19:44:59Z

pkg/sdn/plugin/node.go

 }
 }
 }

+ go kwait.Forever(node.syncVNIDRules, 24*time.Hour)


So this will sync once a day; seems like on a large cluster we could build up quite a list of stale VNID rules. Should we do it like once an hour or 3 or something instead?

Well, it's not clear to me how fast stale rules would build up, or how much of a problem they actually are (in terms of OVS performance). So it's hard to say whether once a day is too infrequent. But yeah, that was basically "random number". Anyone else want to vote?

I guess I could make it log how many stale entries it removes and then we can see what different clusters end up doing and adjust it in future releases.

Or we could trigger a GC every N "adds"...

Can leaving those stale rules active be a security problem. Alowing pods of different namespace to connect to each other ?

Hm... no... the fact that they're "stale" means there are no pods with those VNIDs on this node, so the rules will never match.

And if the NetNamespace was deleted and a new NetNamespace was created with the same VNID, the old rules would get deleted when the NetNamespace was deleted, so that couldn't be a problem either.

I vote for hourly, with a log saying how many were deleted. We can add the "after N" later if we decide it makes sense.

What happens if I create a pod with ip 10.1.11.17 in namespace A and VNID 2000. I then delete the pod. After that I start making pods in namespace B and VNID 3000 until the ip 10.1.11.17 is reused.
As I understand it at that momenent there are 2 sets of rules in the system.
one for 10.1.11.17 in VNID 2000 and one for 10.1.11.17 in VNID 3000. How does that affect packets from or to address 10.1.11.17. Does it break the boundries between namespace A and B?

No. Note that pod/VNID rules are very different in 1.5 than in earlier releases.

The rules referring to IP addresses aren't the ones that are being left behind. The stale rules only refer to VNIDs. So in your example, after creating the first pod, there will be two rules, one that says "if the destination IP is 10.1.11.17 then the destination VNID is 2000" and one that says "if the source VNID is 2000 and the destination VNID is 2000 then allow the traffic". After deleting the pod, the first rule will be deleted but the second will not. If there are no longer any pods with VNID 2000 on the node, then the VNID rule is now useless, but it's also harmless.

mhutter · 2017-06-13T12:10:38Z

Will this be backported to 1.5?

We're currently stuck at 1.4 with our Origin clusters :)

knobunc · 2017-06-13T13:33:58Z

I was thinking about this in the middle of the night and there isn't a problem where we grab the list of rules and start to process it. If there was a stale entry for VNID X but before we get around to deleting it, a new pod in VNID X starts, will we not race and delete the now needed X rule?

danwinship · 2017-06-13T14:24:35Z

@knobunc yes, clearly that's right. Not sure how I convinced myself there was no race...

Reorganized the code a bit so that the entire "sync" process now occurs under the multiTenantPlugin/networkPolicyPlugin lock. So now either (a) SyncVNIDRules will do all of its DeleteFlows before EnsureVNIDRules does its AddFlow, so the flow will get deleted but then re-added, or (b) EnsureVNIDRules will do its AddFlow before SyncVNIDRules calls DumpFlows, so the flow will not be seen as stale.

(But of course you should double-check my logic since I got it wrong last time...)

knobunc

The looking looks right to me.

pravisankar

LGTM

pravisankar · 2017-06-13T16:04:56Z

[merge]

openshift-bot · 2017-06-13T16:06:20Z

continuous-integration/openshift-jenkins/merge Waiting: You are in the build queue at position: 32

openshift-bot · 2017-06-13T16:06:20Z

Evaluated for origin merge up to 8ee380e

pravisankar · 2017-06-13T17:50:03Z

test flake #14176
re[test]

openshift-bot · 2017-06-14T18:38:34Z

Evaluated for origin testextended up to 5bbe41e

openshift-bot · 2017-06-14T18:38:46Z

Evaluated for origin test up to 5bbe41e

openshift-bot · 2017-06-14T20:22:05Z

continuous-integration/openshift-jenkins/testextended SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin_extended/628/) (Base Commit: ee9f436) (Extended Tests: networking)

openshift-bot · 2017-06-14T20:48:27Z

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/2232/) (Base Commit: ee9f436)

eparis · 2017-06-14T21:45:25Z

Clean test but some concerns from eng about the correctness. We also have a 3.5 backport being screamed for. Merging by hand to get maximum QA time.

aliscott · 2017-06-21T08:32:22Z

Any update on this being backported to 1.5?

danwinship · 2017-06-21T14:13:22Z

Filed #14801

danwinship force-pushed the vnid-refcounting branch from 16cb244 to cc57200 Compare June 10, 2017 14:43

knobunc requested changes Jun 12, 2017

View reviewed changes

danwinship force-pushed the vnid-refcounting branch from cc57200 to 61c99e8 Compare June 12, 2017 15:28

knobunc requested review from pravisankar and dcbw June 12, 2017 17:14

knobunc approved these changes Jun 12, 2017

View reviewed changes

dcbw reviewed Jun 12, 2017

View reviewed changes

openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 13, 2017

danwinship force-pushed the vnid-refcounting branch 2 times, most recently from 85bcff9 to 8ee380e Compare June 13, 2017 14:19

knobunc approved these changes Jun 13, 2017

View reviewed changes

pravisankar approved these changes Jun 13, 2017

View reviewed changes

openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 13, 2017

Use GC rather than refcounting for VNID policy rules

5bbe41e

danwinship force-pushed the vnid-refcounting branch from 8ee380e to 5bbe41e Compare June 14, 2017 18:33

eparis merged commit ed504a4 into openshift:master Jun 14, 2017

danwinship mentioned this pull request Jun 21, 2017

[1.5] Use GC rather than refcounting for VNID policy rules #14801

Merged

danwinship deleted the vnid-refcounting branch October 19, 2017 19:04

Use GC rather than refcounting for VNID policy rules #14560

Use GC rather than refcounting for VNID policy rules #14560

Conversation

danwinship commented Jun 9, 2017

danwinship commented Jun 9, 2017

imcsk8 commented Jun 10, 2017

knobunc left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

danwinship commented Jun 12, 2017

knobunc left a comment

Choose a reason for hiding this comment

knobunc commented Jun 12, 2017

knobunc commented Jun 12, 2017

danwinship commented Jun 12, 2017

dcbw left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mhutter commented Jun 13, 2017

knobunc commented Jun 13, 2017

danwinship commented Jun 13, 2017

knobunc left a comment

Choose a reason for hiding this comment

pravisankar left a comment

Choose a reason for hiding this comment

pravisankar commented Jun 13, 2017

openshift-bot commented Jun 13, 2017 • edited Loading

openshift-bot commented Jun 13, 2017

pravisankar commented Jun 13, 2017 • edited by danwinship Loading

openshift-bot commented Jun 14, 2017

openshift-bot commented Jun 14, 2017

openshift-bot commented Jun 14, 2017

openshift-bot commented Jun 14, 2017

eparis commented Jun 14, 2017

aliscott commented Jun 21, 2017

danwinship commented Jun 21, 2017

openshift-bot commented Jun 13, 2017 •

edited

Loading

pravisankar commented Jun 13, 2017 •

edited by danwinship

Loading