Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use GC rather than refcounting for VNID policy rules #14560

Merged
merged 1 commit into from
Jun 14, 2017

Conversation

danwinship
Copy link
Contributor

We currently count how many pods/services there are of each VNID on each node so we can delete VNID-related rules when they are no longer needed. But it's hard to guarantee our counts stay in sync, especially in the face of pod creation errors and the like. It would be safer to just periodically scan the OVS flows and remove any that are no longer being used.

(At the moment, we also don't bother to initialize the count correctly on restart if there are existing pods, so just restarting, creating a pod, and then deleting it can cause the VNID rules to get deleted. I'm not sure if this is the problem the reporters are seeing or not.)

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1454948
Fixes #14092

@openshift/networking PTAL

@danwinship
Copy link
Contributor Author

[test][testextended][extended:networking]

@imcsk8
Copy link
Contributor

imcsk8 commented Jun 10, 2017

Code looks good

Copy link
Contributor

@knobunc knobunc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff Dan. Minor comments.


otx := mp.node.oc.NewTransaction()
otx.DeleteFlows("table=80, reg0=%d, reg1=%d", vnid, vnid)
otx.DeleteFlows("table=80, reg1=%d", vnid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you loosen this up? (I'm not really concerned by the change, I just want to understand the reason). Will it make it harder to match this up with the AddFlow that we do? I assume it corresponds with the one on the new line 138.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is supposed to be removing all flows that locally deliver traffic with a certain VNID. Any flow with "reg1=${vnid}" meets that definition. As it happens, multitenant doesn't add any reg1 rules that don't also check reg0, but maybe it would in the future. (I was also trying to merge multitenant:RemoveVNIDFlows() and networkpolicy:RemoveVNIDFlows() together, but that doesn't quite work because they still need to update their internal "vnid is in use" state. But maybe the OVS parts of them could be abstracted out later.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, cool. Thanks

}

// A VNID is checked by policy if there is a table 80 rule comparing reg1 to it.
if field, exists := parsed.FindField("reg1"); exists {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we be checking the table here? Or is there a reason not to?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should

@danwinship
Copy link
Contributor Author

flake #14573, [test]

Copy link
Contributor

@knobunc knobunc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks Dan

@knobunc
Copy link
Contributor

knobunc commented Jun 12, 2017

Flaked on #14573 [testextended][extended:networking]

@knobunc
Copy link
Contributor

knobunc commented Jun 12, 2017

[test]

@danwinship
Copy link
Contributor Author

flake #14597 [test] although it's probably just going to fail again

Copy link
Contributor

@dcbw dcbw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM, just one question.

}
}
}

go kwait.Forever(node.syncVNIDRules, 24*time.Hour)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this will sync once a day; seems like on a large cluster we could build up quite a list of stale VNID rules. Should we do it like once an hour or 3 or something instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's not clear to me how fast stale rules would build up, or how much of a problem they actually are (in terms of OVS performance). So it's hard to say whether once a day is too infrequent. But yeah, that was basically "random number". Anyone else want to vote?

I guess I could make it log how many stale entries it removes and then we can see what different clusters end up doing and adjust it in future releases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we could trigger a GC every N "adds"...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can leaving those stale rules active be a security problem. Alowing pods of different namespace to connect to each other ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... no... the fact that they're "stale" means there are no pods with those VNIDs on this node, so the rules will never match.

And if the NetNamespace was deleted and a new NetNamespace was created with the same VNID, the old rules would get deleted when the NetNamespace was deleted, so that couldn't be a problem either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote for hourly, with a log saying how many were deleted. We can add the "after N" later if we decide it makes sense.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if I create a pod with ip 10.1.11.17 in namespace A and VNID 2000. I then delete the pod. After that I start making pods in namespace B and VNID 3000 until the ip 10.1.11.17 is reused.
As I understand it at that momenent there are 2 sets of rules in the system.
one for 10.1.11.17 in VNID 2000 and one for 10.1.11.17 in VNID 3000. How does that affect packets from or to address 10.1.11.17. Does it break the boundries between namespace A and B?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Note that pod/VNID rules are very different in 1.5 than in earlier releases.

The rules referring to IP addresses aren't the ones that are being left behind. The stale rules only refer to VNIDs. So in your example, after creating the first pod, there will be two rules, one that says "if the destination IP is 10.1.11.17 then the destination VNID is 2000" and one that says "if the source VNID is 2000 and the destination VNID is 2000 then allow the traffic". After deleting the pod, the first rule will be deleted but the second will not. If there are no longer any pods with VNID 2000 on the node, then the VNID rule is now useless, but it's also harmless.

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 13, 2017
@mhutter
Copy link

mhutter commented Jun 13, 2017

Will this be backported to 1.5?

We're currently stuck at 1.4 with our Origin clusters :)

@knobunc
Copy link
Contributor

knobunc commented Jun 13, 2017

I was thinking about this in the middle of the night and there isn't a problem where we grab the list of rules and start to process it. If there was a stale entry for VNID X but before we get around to deleting it, a new pod in VNID X starts, will we not race and delete the now needed X rule?

@danwinship danwinship force-pushed the vnid-refcounting branch 2 times, most recently from 85bcff9 to 8ee380e Compare June 13, 2017 14:19
@danwinship
Copy link
Contributor Author

@knobunc yes, clearly that's right. Not sure how I convinced myself there was no race...

Reorganized the code a bit so that the entire "sync" process now occurs under the multiTenantPlugin/networkPolicyPlugin lock. So now either (a) SyncVNIDRules will do all of its DeleteFlows before EnsureVNIDRules does its AddFlow, so the flow will get deleted but then re-added, or (b) EnsureVNIDRules will do its AddFlow before SyncVNIDRules calls DumpFlows, so the flow will not be seen as stale.

(But of course you should double-check my logic since I got it wrong last time...)

Copy link
Contributor

@knobunc knobunc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The looking looks right to me.

Copy link

@pravisankar pravisankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pravisankar
Copy link

[merge]

@openshift-bot
Copy link
Contributor

openshift-bot commented Jun 13, 2017

continuous-integration/openshift-jenkins/merge Waiting: You are in the build queue at position: 32

@openshift-bot
Copy link
Contributor

Evaluated for origin merge up to 8ee380e

@pravisankar
Copy link

pravisankar commented Jun 13, 2017

test flake #14176
re[test]

@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 13, 2017
@openshift-bot
Copy link
Contributor

Evaluated for origin testextended up to 5bbe41e

@openshift-bot
Copy link
Contributor

Evaluated for origin test up to 5bbe41e

@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/testextended SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin_extended/628/) (Base Commit: ee9f436) (Extended Tests: networking)

@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/2232/) (Base Commit: ee9f436)

@eparis
Copy link
Member

eparis commented Jun 14, 2017

Clean test but some concerns from eng about the correctness. We also have a 3.5 backport being screamed for. Merging by hand to get maximum QA time.

@eparis eparis merged commit ed504a4 into openshift:master Jun 14, 2017
@aliscott
Copy link

Any update on this being backported to 1.5?

@danwinship
Copy link
Contributor Author

Filed #14801

@danwinship danwinship deleted the vnid-refcounting branch October 19, 2017 19:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants