[action] [PR:3042] Fix the Orchagent crash seen during Port channel OC test cases. (#3042) #3044

Merged
merged 1 commit into from
Feb 9, 2024

Conversation

mssonicbld
Collaborator

The function addNeighbor adds the remote system neighbor against the remote system port and increments the reference count of the remote system port's RIF.
However, when addNextHop adds the next hop, it adds it against the Inband port with the RIF-ID of the remote system port, but increments the RIF reference
count of the Inband port instead of the remote system port. When the neighbor is removed in removeNeighbor, the RIF ref count of the remote system port is decremented.
When the next hop is removed in removeNextHop, the ref count of the remote system port is decremented as well. So if the remote system port has both IPv4 and IPv6 configured,
the ref count is incremented by 2 for the remote system port's RIF (IPv4 and IPv6 neighbors) and by 2 for the Inband port's RIF (IPv4 and IPv6 next hops),
but it is decremented 4 times for the remote system port's RIF. As a result, as soon as either IPv4 or IPv6 is deleted, orchagent sometimes tries to delete the
remote system port's RIF; because the SAI meta layer holds a different ref count, it returns a failure and orchagent crashes.
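
The counting mismatch described above can be reproduced with a small toy model. This is only a sketch of the arithmetic, not orchagent code: the class, the port labels, and the `fixed` flag are hypothetical names introduced for illustration. In particular, `fixed=True` assumes the next hop increment is charged to the remote system port so that adds and removes are symmetric; the description above does not spell out the actual patch.

```python
from collections import Counter

# Hypothetical port labels for this toy model; not real orchagent identifiers.
REMOTE = "remote_system_port"
INBAND = "inband_port"


class RifRefModel:
    """Toy model of per-port RIF reference counts (not orchagent code)."""

    def __init__(self, fixed):
        self.fixed = fixed      # assumption: True models symmetric accounting
        self.refs = Counter()

    def add_neighbor(self):
        self.refs[REMOTE] += 1

    def add_next_hop(self):
        # As described: the next hop uses the remote port's RIF-ID, but the
        # increment lands on the Inband port unless accounting is symmetric.
        self.refs[REMOTE if self.fixed else INBAND] += 1

    def remove_neighbor(self):
        self.refs[REMOTE] -= 1

    def remove_next_hop(self):
        self.refs[REMOTE] -= 1


def run_scenario(fixed):
    """Add IPv4 and IPv6 entries, then remove only the IPv4 ones."""
    model = RifRefModel(fixed)
    for _family in ("ipv4", "ipv6"):
        model.add_neighbor()
        model.add_next_hop()
    model.remove_neighbor()
    model.remove_next_hop()
    return model.refs[REMOTE], model.refs[INBAND]


print(run_scenario(fixed=False))  # (0, 2)
print(run_scenario(fixed=True))   # (2, 0)
```

With the asymmetric accounting, the remote system port's count reaches 0 after only the IPv4 entries are removed, even though the IPv6 neighbor and next hop still use that RIF. That is the point at which a premature RIF delete would be attempted. With symmetric accounting the count stays at 2 until the IPv6 entries are removed too.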

Signed-off-by: saksarav <sakthivadivu.saravanaraj@nokia.com>

@mssonicbld
Collaborator Author

Original PR: #3042

@prsunny prsunny merged commit 788def1 into sonic-net:202205 Feb 9, 2024
10 checks passed