-
Notifications
You must be signed in to change notification settings - Fork 387
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ARP/NDP responders to external IP assigner #3318
Conversation
ea5bd69
to
c8a5073
Compare
// See the License for the specific language governing permissions and | ||
// limitations under the License. | ||
|
||
package virtual |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But how if we need both of binding IP to an interface and replying ARP (e.g. to fix Egress IPv6 support)? Should we make binding IP and ARP two configurable feature of a single IP assigner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have merged the assigners into one to fix the Egress issue as well.
1561e81
to
f1f3f5c
Compare
f1f3f5c
to
50f2015
Compare
github.com/mdlayher/arp v0.0.0-20191213142603-f72070a231fc | ||
github.com/mdlayher/ethernet v0.0.0-20190606142754-0394541c37b7 | ||
github.com/mdlayher/ndp v0.0.0-20210831201139-f982b8766fb5 | ||
github.com/mdlayher/raw v0.0.0-20211126142749-4eae47f3d54b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we import those packages as we have implemented some functions in agent/util/arping and agent/util/arping?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feature does not only need to send GARP or NS over the network interface. We also need to listen to ARP and NS requests and process the queries accordingly. Without these packages, we may need to manage the raw sockets, parse the ethernet headers, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit message:
if the IP were assigned to the dummy interface.
were -> is
This avoids the kernel creating local routes for the external IP, which have
"creates a local route", "which has"
@@ -173,3 +215,14 @@ func (a *ipAssigner) AssignedIPs() sets.String { | |||
// Return a copy. | |||
return a.assignedIPs.Union(nil) | |||
} | |||
|
|||
// Run starts the IP assigner. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
starts ARP responder and NDP responder?
if err := a.loadIPAddresses(); err != nil { | ||
return nil, fmt.Errorf("error when loading IP addresses from the system: %v", err) | ||
if ipv4 != nil { | ||
arpResonder, err := responder.NewARPResponder(externalInterface) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As ARP is not always needed, can we pass a boolean to enable ARP responder? We can add another for NDP too.
Or another we do not enable ARP when dummyDeviceName is not "".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see you changed this?
if ipv6 != nil { | ||
ndpResponder, err := responder.NewNDPResponder(externalInterface) | ||
if err != nil { | ||
return nil, fmt.Errorf("failed to create ARP responder for link %s: %v", externalInterface.Name, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ARP -> NDP
if dummyDeviceName != "" { | ||
dummyDevice, err := ensureDummyDevice(dummyDeviceName) | ||
if err != nil { | ||
return nil, fmt.Errorf("error when ensuring dummy device exist: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exist -> exists
ensuring -> checking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ensureDummyDevice creates the dummy device if it doesn't exist, not just checks its existence
return r.iface.Name | ||
} | ||
|
||
func (r *arpResponder) AssignIP(ip net.IP) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably rename to "AddIP", "RemoveIP"? It is not really about assignment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update. thanks.
} | ||
|
||
// GratuitousARP sends an gratuitous ARP packet for the IP. | ||
func (r *arpResponder) Gratuitous(ip net.IP) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about naming it Advertise()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update. thanks.
if err != nil { | ||
return err | ||
} | ||
return r.conn.WriteTo(pkt, ethernet.Broadcast) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably we should send out multiple GARPs. We can do it in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not clear for multiple GARPs
. Could you help to provide more context for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to
antrea/pkg/agent/cniserver/interface_configuration_linux.go
Lines 312 to 326 in 15de4cf
ticker := time.NewTicker(50 * time.Millisecond) | |
defer ticker.Stop() | |
count := 0 | |
for { | |
// Send gratuitous ARP to network in case of stale mappings for this IP address | |
// (e.g. if a previous - deleted - Pod was using the same IP). | |
if err := arping.GratuitousARPOverIface(targetIP, iface); err != nil { | |
klog.Warningf("Failed to send gratuitous ARP #%d: %v", count, err) | |
} | |
count++ | |
if count == 3 { | |
break | |
} | |
<-ticker.C | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pointer. Do we need this for external IPs to send multiple GARPs to the transport interface?
if pkt.Operation != arp.OperationRequest { | ||
return nil | ||
} | ||
if _, ok := r.assignedIPs[pkt.TargetIP.String()]; !ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to protect the map?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is just a set, we can use a thread-safe type. @tnqn might have a recommendation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggested to use sets.String but it's not thread-safe either. It still needs a lock, but it's a common pattern in our code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we need not a lock for other reasons, we can use sync.Map (which should work well for a set that no need to read the value).
50f2015
to
91d822f
Compare
Codecov Report
@@ Coverage Diff @@
## main #3318 +/- ##
==========================================
- Coverage 60.95% 53.62% -7.33%
==========================================
Files 266 374 +108
Lines 26508 51235 +24727
==========================================
+ Hits 16159 27477 +11318
- Misses 8597 21345 +12748
- Partials 1752 2413 +661
Flags with carried forward coverage won't be shown. Click here to find out more.
|
8de0057
to
72f5bf1
Compare
if err := func() error { | ||
a.mutex.Lock() | ||
defer a.mutex.Unlock() | ||
if a.isIPAssigned(ip) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Egress, it's possible that multiple Egresses share same static egress IP, then it could happen that two workers are assigning the same IP concurrently. Without the previous lock, the following code may be executed twice for an IP. Is it ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the responders are thread-safe, do you think it is enough by adding checks of the error codes of netlink.AddrAdd/netlink.AddrDel
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the race condition can happen between AssignIP and UnassignIP. When an IP is released from one Resource and reallocated to another Resource.
worker1 wants to remove an IP while worker2 wants to add it. worker1 starts first.
- worker1 sees IP is assigned and starts removing it without holding the lock.
- worker2 sees IP is assigned and returns directly.
- worker1 removes the IP.
But it seems if worker2 starts first, worker1 may still remove the IP even with the lock. This is not an issue when there is an localIPDetector as removing IP could trigger resync. As it's no longer the case for Service external IP, maybe it needs to track reference number for such case, as well as holding the lock. Then:
If worker1 starts first:
- worker1 sees IP is assigned and starts removing it.
- worker2 waits for the lock.
- worker1 removes IP.
- worker2 gets the lock, and adds the IP back.
If worker2 starts first:
- worker2 sees IP is assigned and increase reference count, returns.
- worker1 sees IP is assigned and decrease reference count, returns.
Reference count can be counted from a set of entities names which are associated with the IP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the reference count mechanism for the agent Service external IP controller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue could still happen without the lock when it's used in EgressController. When Egress A is removed and its IP is allocated to Egress B. worker1 handles removal of A, worker2 handles sync of B.
- worker1 sees IP is assigned and starts removing it without holding the lock.
- worker2 sees IP is already assigned and returns directly.
- worker1 removes the IP, which triggers resync of B.
- worker2 resyncs B, sees IP is already assigned and returns directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense. I revert the changes to make the assign/unassign operation atomic. Thanks!
defer a.mutex.Unlock() | ||
|
||
if !a.assignedIPs.Has(ip) { | ||
if !a.isIPAssigned(ip) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto, there may be a race condition between Assign and Unassign as well.
} | ||
|
||
// Advertise sends an gratuitous ARP packet for the IP. | ||
func (r *arpResponder) Advertise(ip net.IP) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should AddIP
calls advertise
directly since it's by design assigning an IP should advertise it as well, which could make IPAssigner simpler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed accordingly. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then it can be private? And we can remove the IP version check as it's an internal method whose argument has been checked by public methods.
if err != nil { | ||
return err | ||
} | ||
return r.conn.WriteTo(pkt, ethernet.Broadcast) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to
antrea/pkg/agent/cniserver/interface_configuration_linux.go
Lines 312 to 326 in 15de4cf
ticker := time.NewTicker(50 * time.Millisecond) | |
defer ticker.Stop() | |
count := 0 | |
for { | |
// Send gratuitous ARP to network in case of stale mappings for this IP address | |
// (e.g. if a previous - deleted - Pod was using the same IP). | |
if err := arping.GratuitousARPOverIface(targetIP, iface); err != nil { | |
klog.Warningf("Failed to send gratuitous ARP #%d: %v", count, err) | |
} | |
count++ | |
if count == 3 { | |
break | |
} | |
<-ticker.C | |
} |
42e1912
to
3e427cd
Compare
if err := func() error { | ||
a.mutex.Lock() | ||
defer a.mutex.Unlock() | ||
if a.isIPAssigned(ip) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the race condition can happen between AssignIP and UnassignIP. When an IP is released from one Resource and reallocated to another Resource.
worker1 wants to remove an IP while worker2 wants to add it. worker1 starts first.
- worker1 sees IP is assigned and starts removing it without holding the lock.
- worker2 sees IP is assigned and returns directly.
- worker1 removes the IP.
But it seems if worker2 starts first, worker1 may still remove the IP even with the lock. This is not an issue when there is an localIPDetector as removing IP could trigger resync. As it's no longer the case for Service external IP, maybe it needs to track reference number for such case, as well as holding the lock. Then:
If worker1 starts first:
- worker1 sees IP is assigned and starts removing it.
- worker2 waits for the lock.
- worker1 removes IP.
- worker2 gets the lock, and adds the IP back.
If worker2 starts first:
- worker2 sees IP is assigned and increase reference count, returns.
- worker1 sees IP is assigned and decrease reference count, returns.
Reference count can be counted from a set of entities names which are associated with the IP.
} | ||
|
||
// Advertise sends an gratuitous ARP packet for the IP. | ||
func (r *arpResponder) Advertise(ip net.IP) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then it can be private? And we can remove the IP version check as it's an internal method whose argument has been checked by public methods.
klog.InfoS("Assigned IP to ARP responder", "ip", ip, "interface", r.iface.Name) | ||
err := r.Advertise(ip) | ||
if err != nil { | ||
klog.Warningf("failed to advertise for IP %s: %v", ip, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't log error?
return r.iface.Name | ||
} | ||
|
||
func (r *ndpResponder) Advertise(ip net.IP) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
3e427cd
to
848c108
Compare
iface *net.Interface | ||
conn *arp.Client | ||
assignedIPs sets.String | ||
mutex sync.Mutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sync.Map is indeed a simpler choice here, but I know you need a mutex in ndpResponder for other reasons, so maybe you prefer to use the same way for both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I think the mutex here is better for consistency.
848c108
to
83a86d0
Compare
func (a *ipAssigner) Run(ch <-chan struct{}) { | ||
// Start the ARP responder only when the dummy device is not created. The kernel will handle ARP requests | ||
// for IPs assigned to the dummy devices by default. | ||
// TODO: Check arp_ignore sysctl parameter of the transport interface to determin whether to start |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
determin -> determine
the arp_ignore sysctl parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
83a86d0
to
c9a89ae
Compare
@@ -416,21 +437,38 @@ func TestUpdateService(t *testing.T) { | |||
}, | |||
expectError: false, | |||
}, | |||
{ | |||
name: "Service updated external IP and local Node selected but other Service still owns the assigned IP", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good test case
if err := func() error { | ||
a.mutex.Lock() | ||
defer a.mutex.Unlock() | ||
if a.isIPAssigned(ip) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue could still happen without the lock when it's used in EgressController. When Egress A is removed and its IP is allocated to Egress B. worker1 handles removal of A, worker2 handles sync of B.
- worker1 sees IP is assigned and starts removing it without holding the lock.
- worker2 sees IP is already assigned and returns directly.
- worker1 removes the IP, which triggers resync of B.
- worker2 resyncs B, sees IP is already assigned and returns directly.
Apart from assigning IPs to the dummy interface, this commit will also create raw sockets and listen for ARP requests packets (IPv4) and Neighbor Solicitation packets (IPv6). This fixes the issue that Egress cannot work in IPv6 mode as the system would not reply to Neighbor Advertisement from external interfaces if the IP is assigned to the dummy interface. The IP assigner will skip managing the dummy device if dummyDeviceName is empty. This avoids the kernel creating a local route, which has a higher priority than the routes installed by antrea-proxy in proxyAll mode. Signed-off-by: Tianyi Huang <hty690@126.com>
c9a89ae
to
4918998
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
/test-ipv6-all |
/test-integration |
1 similar comment
/test-integration |
/test-e2e |
/test-ipv6-e2e |
Apart from assigning IPs to the dummy interface, this commit will also create raw sockets and listen for ARP requests packets (IPv4) and Neighbor Solicitation packets (IPv6). This fixes the issue that Egress cannot work in IPv6 mode as the system would not reply to Neighbor Advertisement from external interfaces if the IP is assigned to the dummy interface. The IP assigner will skip managing the dummy device if dummyDeviceName is empty. This avoids the kernel creating a local route, which has a higher priority than the routes installed by antrea-proxy in proxyAll mode. Signed-off-by: Tianyi Huang <hty690@126.com>
Apart from assigning IPs to the dummy interface, this commit will also create
raw sockets and listen for ARP requests packets (IPv4) and Neighbor Solicitation
packets (IPv6). This fixes the issue that Egress cannot work in IPv6 mode
as the system would not reply to Neighbor Advertisement from external
interfaces if the IP is assigned to the dummy interface. The IP assigner
will skip managing the dummy device if dummyDeviceName is empty. This avoids
the kernel creating a local route, which has a higher priority than the routes
installed by antrea-proxy in proxyAll mode.
Signed-off-by: Tianyi Huang hty690@126.com