Routes with IPv6 link-local address as nexthop are not propagating to hardware #430
Comments
@zhenggen-xu to review |
To support IPv6 link-local, we do have a PR available. It essentially enables link-local addresses for neighbors and deals with the overlapping cases for ip2me and host routes. There were some issues in the SAI implementation that could cause a crash if we had link-local addresses on VLAN member ports, but those should have been fixed in the recent SAI, so this PR will be rebased and resumed soon. In case you really have the same link-local address for next hops on different ports, I agree that we need to change the m_syncdNextHops map so that it can uniquely identify next hops that share the same IP; currently we don't have this scenario in our DC. The nexthop_ids from the SAI API call, on the other hand, should have taken the interface into account. BTW: This issue should be created against sonic-swss. |
Thanks for the information. I believe this link-local ping issue would also be addressed once the sonic-net/sonic-swss#437 fixes are merged, right? |
Yes, that PR should fix the issue you mentioned above. |
Is there a tentative timeline for when this PR may be merged? |
Description
The routing stack running is FRR.
I am advertising BGP IPv6 routes from a peer node with an IPv6 link-local NEXTHOP address. The routes are learned in the BGP database and added to the Zebra RIB/FIB with the IPv6 link-local address as the next hop.
However, orchagent does not propagate the routes to SAI.
We see the following error from orchagent in the logs.
[ Jan 1 14:56:18.463637 sonic INFO swss#orchagent: :- addRoute: Failed to get next hop fe80::20c:9fff:fe02:203 for 1096::/98 ]
I see 2 issues to be fixed here before the above issue is addressed:
First, neighsyncd ignores the kernel netlink notifications about new IPv6 link-local neighbors, so they are never pushed to NEIGH_TABLE in app db. Only once they are pushed will the next hop table in the neighorch module also contain the link-local IPv6 neighbors.
Once this is fixed, the addRoute operation in orchagent succeeds and the route is eventually pushed to SAI.
However, this first change addresses only half of the problem.
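As a rough illustration of this first issue, the Python sketch below models a neighbor-sync step that drops IPv6 link-local entries and one that keeps them. It is not the neighsyncd code: `sync_neighbors`, the event tuples, and the `(ifname, ip)` table key are hypothetical names chosen for the example, and the second address is a documentation-range placeholder.

```python
import ipaddress


def sync_neighbors(events, allow_link_local):
    """Build a toy neighbor table from (ifname, ip, mac) events.

    Hypothetical model only: the real daemon consumes kernel netlink
    messages and writes to NEIGH_TABLE; this just mimics the filtering.
    """
    neigh_table = {}
    for ifname, ip, mac in events:
        addr = ipaddress.ip_address(ip)
        if addr.is_link_local and not allow_link_local:
            # Behaviour reported in this issue: the entry is skipped,
            # so a later next hop lookup for this address fails.
            continue
        neigh_table[(ifname, str(addr))] = mac
    return neigh_table


events = [
    ("Ethernet0", "fe80::20c:9fff:fe02:203", "00:0c:9f:02:02:03"),
    ("Ethernet0", "2001:db8::1", "00:0c:9f:02:02:04"),
]

filtered = sync_neighbors(events, allow_link_local=False)
complete = sync_neighbors(events, allow_link_local=True)
```

With the filter in place, the link-local neighbor from the log line above never reaches the table, which is consistent with the "Failed to get next hop" error.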
Second, m_syncdNextHops (next hops) in neighorch is indexed only by IPAddress, and m_syncdNextHopGroups (next hop groups) in routeorch is indexed by IPAddresses.
IMO the next hop index should be an {ipaddress + interface name} pair and not just the ipaddress. Otherwise an IPv6 link-local address cannot be used as a nexthop, because it is ambiguous without the interface index. In DC scenarios with IPv6 link-local BGP peering on multiple links between the same 2 nodes, we see an ECMP group with the same link-local nexthop on different links. If we store the NextHopGroup indexed by IPAddresses only, we cannot uniquely add all the link-local nexthops of an ECMP group in this scenario.
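To make the collision concrete, here is a minimal Python sketch comparing a map keyed by IP alone with one keyed by an (IP, interface) pair. The `NextHopKey` type, the interface names, and the `nh-N` identifiers are assumptions made for the example and do not reflect the actual orchagent data structures.

```python
import ipaddress
from typing import NamedTuple


class NextHopKey(NamedTuple):
    """Hypothetical composite key: next hop IP plus outgoing interface."""
    ip: ipaddress.IPv6Address
    ifname: str


# Same link-local next hop learned on two parallel links (assumed names).
LINK_LOCAL = ipaddress.ip_address("fe80::20c:9fff:fe02:203")
neighbors = [(LINK_LOCAL, "Ethernet0"), (LINK_LOCAL, "Ethernet4")]

by_ip = {}           # keyed by IP only, as described in this issue
by_ip_and_port = {}  # keyed by (IP, interface), as proposed

for n, (ip, ifname) in enumerate(neighbors):
    next_hop_id = f"nh-{n}"  # placeholder for an identifier returned by SAI
    by_ip[ip] = next_hop_id  # second entry overwrites the first
    by_ip_and_port[NextHopKey(ip, ifname)] = next_hop_id

# An ECMP group built from composite keys keeps both members.
ecmp_group = frozenset(by_ip_and_port)
```

Keyed by IP only, the second neighbor overwrites the first, so only one member survives; with the composite key both next hops remain distinct and can be placed in the same ECMP group.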
I am planning to make the above 2 changes in the orchagent code in the next few days.
Please let me know if you think this is not the right direction to address these issues.