[Multicast] Use group as flow actions for multicast traffic #3508
/test-all |
Codecov Report
@@ Coverage Diff @@
## main #3508 +/- ##
===========================================
- Coverage 64.50% 49.65% -14.86%
===========================================
Files 278 391 +113
Lines 39500 55277 +15777
===========================================
+ Hits 25481 27447 +1966
- Misses 12026 25711 +13685
- Partials 1993 2119 +126
|
/test-multicast-e2e |
6becda3
to
995fb39
Compare
/test-all |
67ec708
to
0f7e808
Compare
/test-all |
/test-multicast-e2e |
0f7e808
to
1abbd7d
Compare
1abbd7d
to
6db7759
Compare
/test-all |
In the commit message:
Add local pod receivers into an OpenFlow "all" type group for each
multicast group, and use such group in the flow action for multicast
traffic. Remove pod from group buckets if the pod has left the
multicast group or is deleted before leaving multicast group.
"all" type group -> type "all" group
such group in the flow action -> such groups in the flow actions
Remove pod -> Remove a Pod
pod -> Pod
leaving multicast group -> leaving the multicast group
// leave event so that antrea can remove the corresponding interface from local multicast receivers on OVS. This function
// should be called if the removed Pod receiver fails to send IGMP leave message before deletion.
func (c *Controller) removeLocalInterface(ifConfig *interfacestore.InterfaceConfig) {
	groupStatuses := c.getGroupMemberStatusesByPod(ifConfig.InterfaceName)
Could there be a race condition where a Pod is recreated with the same interface name? Maybe the chance (a new interface is created and has even joined groups) is too low to consider. Just asking.
The group cache is written by only one worker, so here we only construct the event and let that writer worker modify the cache.
In theory, the race condition would happen when a Pod with the same name and Namespace is created before the old Pod's interface is completely removed from the in-memory interface store. That requires receiving the event to take longer than preparing a new Pod. But the event is triggered in memory, so I don't think this race condition exists.
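The single-writer arrangement described in this reply can be sketched as follows. This is only an illustration under stated assumptions: `groupEvent`, `groupCache`, `runWorker`, and `leaveEventsFor` are hypothetical names invented for the example and are not the identifiers used in the PR. The deletion path only builds leave events, and the one worker that owns the cache applies them.

```go
package main

import "fmt"

// groupEvent is a hypothetical membership change for one interface.
type groupEvent struct {
	group string // multicast group address
	iface string // interface name of the Pod receiver
	join  bool   // true for join, false for leave
}

// groupCache is only ever touched by the worker that owns it.
type groupCache struct {
	members map[string]map[string]bool
}

// apply mutates the cache for a single event.
func (c *groupCache) apply(e groupEvent) {
	if e.join {
		if c.members[e.group] == nil {
			c.members[e.group] = map[string]bool{}
		}
		c.members[e.group][e.iface] = true
		return
	}
	delete(c.members[e.group], e.iface) // no-op if the group is unknown
	if len(c.members[e.group]) == 0 {
		delete(c.members, e.group)
	}
}

// runWorker is the single writer: it drains events until the channel is
// closed and then returns the cache it built.
func runWorker(events <-chan groupEvent) *groupCache {
	cache := &groupCache{members: map[string]map[string]bool{}}
	for e := range events {
		cache.apply(e)
	}
	return cache
}

// leaveEventsFor builds one leave event per joined group for a deleted
// interface. It never touches the cache itself.
func leaveEventsFor(iface string, groups []string) []groupEvent {
	out := make([]groupEvent, 0, len(groups))
	for _, g := range groups {
		out = append(out, groupEvent{group: g, iface: iface, join: false})
	}
	return out
}

func main() {
	events := make(chan groupEvent, 8)
	events <- groupEvent{group: "225.1.2.3", iface: "pod-a", join: true}
	events <- groupEvent{group: "225.1.2.3", iface: "pod-b", join: true}
	for _, e := range leaveEventsFor("pod-a", []string{"225.1.2.3"}) {
		events <- e
	}
	close(events)
	cache := runWorker(events)
	fmt.Println("remaining members:", len(cache.members["225.1.2.3"]))
}
```

Because every mutation goes through one goroutine, the cache needs no lock in this sketch; ordering between a synthetic leave and a later join is decided by the order in which events enter the channel.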
6db7759
to
1f3ffd8
Compare
@@ -189,8 +189,16 @@ type OFBridge struct {
	multipartReplyChs map[uint32]chan *openflow13.MultipartReply
}

func (b *OFBridge) CreateGroupTypeAll(id GroupIDType) Group {
CreateGroupTypeAll(id GroupIDType) and CreateGroup(id GroupIDType) confuse me. Why not CreateGroupTypeAll(id GroupIDType) and CreateGroupTypeSelect(id GroupIDType), or just CreateGroup(id GroupIDType, groupType ofctrl.GroupType)?
CreateGroup is an existing function that uses type "select" when creating an OpenFlow group, and it is called by AntreaProxy. I don't want to change existing code unrelated to this change, which is why I didn't rename it. @hongliangl Would you share your view: do you think renaming the existing function CreateGroup to CreateGroupTypeSelect is better?
I think we can make another PR to rename CreateGroup to CreateGroupTypeSelect, since we now have another type of group.
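To make the naming options in this thread concrete, here is a minimal sketch of how the two explicit constructors could share one parameterized helper. It is an assumption-laden illustration, not the Antrea or ofnet API: `GroupType`, `Group`, `Bridge`, and `createGroup` are hypothetical stand-ins defined only for this example.

```go
package main

import "fmt"

// GroupType is a hypothetical enum for the two group types discussed above.
type GroupType int

const (
	GroupTypeAll GroupType = iota
	GroupTypeSelect
)

func (t GroupType) String() string {
	if t == GroupTypeAll {
		return "all"
	}
	return "select"
}

// Group is a stand-in for an OpenFlow group handle.
type Group struct {
	ID   uint32
	Type GroupType
}

// Bridge is a stand-in for the bridge object that owns groups.
type Bridge struct {
	groups map[uint32]*Group
}

func NewBridge() *Bridge {
	return &Bridge{groups: map[uint32]*Group{}}
}

// createGroup is the single parameterized helper; it returns the cached
// group if the ID is already known.
func (b *Bridge) createGroup(id uint32, t GroupType) *Group {
	if g, ok := b.groups[id]; ok {
		return g
	}
	g := &Group{ID: id, Type: t}
	b.groups[id] = g
	return g
}

// CreateGroupTypeAll and CreateGroupTypeSelect are thin, explicitly named
// wrappers, so call sites state the group type without passing a parameter.
func (b *Bridge) CreateGroupTypeAll(id uint32) *Group {
	return b.createGroup(id, GroupTypeAll)
}

func (b *Bridge) CreateGroupTypeSelect(id uint32) *Group {
	return b.createGroup(id, GroupTypeSelect)
}

func main() {
	b := NewBridge()
	mcast := b.CreateGroupTypeAll(1)
	svc := b.CreateGroupTypeSelect(2)
	fmt.Println(mcast.ID, mcast.Type, svc.ID, svc.Type)
}
```

With symmetric names, a later rename of the existing CreateGroup only touches its callers, which matches the suggestion to handle it in a separate PR.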
/test-multicast-e2e |
After checking the test code and running several rounds on e2e testbed,
|
e52a6e0
to
0c43c42
Compare
98fcbe9
to
bdb43b6
Compare
/test-all |
/test-networkpolicy |
bdb43b6
to
87629c0
Compare
/test-all |
pkg/util/channel/channel.go
Outdated
@@ -37,7 +37,20 @@ type Subscriber interface {

 type Notifier interface {
 	// Notify sends an event to the channel.
-	Notify(string) bool
+	Notify(string, EventType, ...string) bool
The channel is designed to be generic. Having EventType would limit its usage scenarios.
Since string cannot meet your requirement, it could use a more generic type: interface{}. A specific channel's consumer should convert it to a specific type, just like the generic workqueue and store interfaces.
The type could be defined in pkg/agent/types/event.go:
type CNIEvent struct {
	PodNamespace string
	PodName      string
	IsAdd        bool
	ContainerID  string
}
Makes sense, I will change it. One question: where would you suggest placing such event types, e.g. "CNIEvent", given that they need to be accessed by different packages? @tnqn
For now, I have put such event types in package pkg/util/channel.
The package for the generic channel is not suitable for concrete event types. I suggested a place in the comment above:
The type could be defined in pkg/agent/types/event.go
And perhaps name it PodUpdate to avoid confusion between it and the real CNI event.
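The design suggested in this thread (a generic channel carrying interface{}, with each consumer converting to its own concrete type) might look roughly like the sketch below. The `PodUpdate` fields are copied from the struct proposed above; `eventChannel`, its non-blocking `Notify`, and `describe` are assumptions made for the example and do not reflect the actual pkg/util/channel code.

```go
package main

import "fmt"

// PodUpdate follows the struct proposed in the review, under the suggested name.
type PodUpdate struct {
	PodNamespace string
	PodName      string
	IsAdd        bool
	ContainerID  string
}

// eventChannel is a hypothetical generic channel: it knows nothing about
// the concrete event types it carries.
type eventChannel struct {
	ch chan interface{}
}

func newEventChannel(size int) *eventChannel {
	return &eventChannel{ch: make(chan interface{}, size)}
}

// Notify sends an event without blocking (an assumption for this sketch)
// and reports whether the event was accepted.
func (c *eventChannel) Notify(e interface{}) bool {
	select {
	case c.ch <- e:
		return true
	default:
		return false
	}
}

// describe is a consumer-side handler: it converts the generic payload to
// the concrete type it expects and rejects anything else.
func describe(e interface{}) (string, error) {
	u, ok := e.(PodUpdate)
	if !ok {
		return "", fmt.Errorf("unexpected event type %T", e)
	}
	action := "deleted"
	if u.IsAdd {
		action = "added"
	}
	return fmt.Sprintf("%s/%s %s", u.PodNamespace, u.PodName, action), nil
}

func main() {
	c := newEventChannel(1)
	c.Notify(PodUpdate{PodNamespace: "default", PodName: "receiver-0", IsAdd: false})
	msg, err := describe(<-c.ch)
	fmt.Println(msg, err)
}
```

Keeping the concrete type outside the channel package means the channel stays reusable, at the cost of a runtime type assertion in every consumer.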
daa38db
to
57a6bc5
Compare
Since this PR switches to using groups for multicast, are the previous configurations like mcast_snooping_enable and mcast-snooping-disable-flood-unregistered still required?
// installedGroups saves the groups which are configured on both OVS and the host.
installedGroups      sets.String
installedGroupsMutex sync.RWMutex
mRouteClient         *MRouteClient
ovsBridgeClient      ovsconfig.OVSBridgeClient
podDeletionCh        chan *interfacestore.InterfaceConfig
Forgot to remove?
removed.
pkg/agent/openflow/multicast.go
Outdated
group := value.(binding.Group)
group.Reset()
if err := group.Add(); err != nil {
	klog.Errorf("Error when replaying cached group %d: %v", id, err)
- klog.Errorf("Error when replaying cached group %d: %v", id, err)
+ klog.ErrorS(err, "Error when replaying cached group", "group", id)
updated.
57a6bc5
to
0de756a
Compare
No, these configurations are not required. I removed the call in my latest update. But do you think we should also remove the settings in OVSDB for the upgrade case? Leaving them in place has no ill effect, because we don't use "normal" actions in OpenFlow entries, so the traffic is forwarded by Antrea flows rather than by the OVS multicast DB cache.
0de756a
to
1ca467a
Compare
I don't think we need to remove the settings for the upgrade case. It was alpha and harmless, as you pointed out.
1ca467a
to
fa7bca7
Compare
/test-all |
LGTM, a minor comment
fa7bca7
to
95758d1
Compare
/test-all |
1. Add local Pod receivers into an OpenFlow type "all" group for each multicast group, and use such groups in the flow actions. Remove a Pod from group buckets if the Pod has left the multicast group or is deleted before leaving the multicast group.
2. Improve multicast e2e tests.
Signed-off-by: wenyingd <wenyingd@vmware.com>
Co-authored-by: Ruochen Shen <src655@gmail.com>
95758d1
to
f50f7ca
Compare
/test-all |
LGTM