Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[routesync] Fix for stale dynamic neighbor #2553

Merged
merged 2 commits into from
Jan 12, 2023

Conversation

vganesan-nokia
Copy link
Contributor

What I did

Changes are done to delete those route entries from APPL_DB which have next hop on eth0 or docker0 interface.

Why I did it

This is done to fix the issue sonic-net/sonic-buildimage#12442.
For more information about root cause and the solution, please see details below

How I verified it

Tested using the following scenario

  • Establish eBGP sessions and advertise default route (IPv4 or IPv6).
  • Make sure that default routes exists in kernel, APPL_DB and ASIC_DB as expected.
    -Shutdown interfaces such that all the eBGP neighbors that advertised the default route are down.
  • Dump the routes in the kernel and make sure that the defautl route has only one next on the "eth0"
  • Clear the dynamically learned neighbors using "sonic-clearp ndp" or "sonic-clear arp" command.
  • Dump the neighbors in the linux kernel using "ip neigh show" command
  • Dump neighbors from ASIC_DB
  • All the dynamically learned neighbors are cleared from linux kernel, APPL_DB and ASIC_DB when the neighbors are flushed after shutting down the eBGP neighbor interfaces.

Details if related

The root cause of this problem is that some routes learned from eBGP neighbors are not cleared from APPL_DB when the eBGP neighbors go down. The routes which show this issue are those routes which include the next hop on interface "eth0". This consistently occurs for default routes for both IPv4 and IPv6 in any asic instance. Since the route is not deleted from APPL DB (though route is deleted from kernel), the neighbor ref_count is not 0 and hence the neighbor is not deleted from the ASIC_DB.

All asic instances include the default route with next hop on eth0 (with large metric to avoid colliding with default routes learned via eBGP). The APPL_DB has this route with only the eBGP next hops due to a filtering in routesync that filters the kernel route updates that has a next hop on eth0 or docker0. Consequently, the next hop on eth0 is not included in the route entry in APPL_DB and ASIC_DB. When all the eBGP neighbor interfaces are shut down, all eBGP neighbors go down. When all eBGP neighbors go down, the default route is withdrawn. The next hop on eth0 becomes the only next hop for the default route. When kernel sends this update, because of the the above mentioned routesync filter the route update is not sent to APPL_DB. Hence the APPL_DB is left with a stale route entry with whatever next hops it had before the eth0 next hop became only next hop. Since these next hops still have this route referenced, these are not cleared when neighbors are flushed.

The soultion to both of these issues viz., (1) stale route entry in APPL_DB and ASIC_DB and (2) stale neighbor entry in ASIC_DB is to delete the routes from APPL_DB when kernel updates are received with next hop on "eth0" as the only next hop.

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Nov 29, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: vganesan-nokia (69ecd73)

@prsunny
Copy link
Collaborator

prsunny commented Dec 6, 2022

@vganesan-nokia , i think you may want to follow the CLA guidelines to sign.

@vganesan-nokia
Copy link
Contributor Author

/easycla

abdosi
abdosi previously approved these changes Dec 13, 2022
@abdosi
Copy link
Contributor

abdosi commented Dec 13, 2022

@prsunny please check on this.

arlakshm
arlakshm previously approved these changes Dec 14, 2022
@abdosi
Copy link
Contributor

abdosi commented Dec 14, 2022

/azp run Azure.sonic-swss

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@prsunny
Copy link
Collaborator

prsunny commented Dec 15, 2022

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@prsunny
Copy link
Collaborator

prsunny commented Dec 16, 2022

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@arlakshm
Copy link
Contributor

/azp run Azure.sonic-swss

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@prsunny
Copy link
Collaborator

prsunny commented Dec 19, 2022

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@prsunny
Copy link
Collaborator

prsunny commented Dec 23, 2022

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@arlakshm
Copy link
Contributor

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@prsunny
Copy link
Collaborator

prsunny commented Dec 29, 2022

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@prsunny
Copy link
Collaborator

prsunny commented Jan 3, 2023

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@bingwang-ms
Copy link
Contributor

/azp run

@vganesan-nokia
Copy link
Contributor Author

@prsunny: Rebased and force pushed. I still see vstest failures (test_warmreboot.py).

@prsunny
Copy link
Collaborator

prsunny commented Jan 6, 2023

ok, seems like this test is consistently failing on the warmboot, can you please run the test manually?

Signed-off-by: vedganes <veda.ganesan@nokia.com>

This commit fixes the warm restart unit test failure. When the route with
only nh on eth0 or docker0 is removed and if the route is the default
route, orchagent sends "add" black hole route to the syncd. So the ASIC
DB gets n hset message. When this happens during warm restart, the unit
test identifies this as unwanted setting and the unit test fails. To fix
this issues, the route delete is sent only if the warm restart is not in
progress. This is done following the same warm restart handling approach
used for route delete in other palces.
@vganesan-nokia
Copy link
Contributor Author

@prsunny: Fixed unit test failures. Please make arrangements to get this reviewed.

@arlakshm
Copy link
Contributor

@prsunny, can you check the change related to warm reboot.

@prsunny prsunny merged commit 4ebdad1 into sonic-net:master Jan 12, 2023
yxieca pushed a commit that referenced this pull request Jan 12, 2023
* [routesync] Fix for stale dynamic neighbor

The changes are for fixing stale neighbor in the ASIC_DB and data path
when eBGP neighbors are shutdown and neighbors are flushed. The problem
is described in issue: sonic-net/sonic-buildimage#12442
The root cause of this issue is due to not deleing the route from the
ASIC_DB when the route's next hop is on eth0 or docker0 interface. The solution is
to delete the route entry from ASIC_DB instead of just returning when the route's
next hop is on the interface eth0 or docker0


This commit fixes the warm restart unit test failure. When the route with
only nh on eth0 or docker0 is removed and if the route is the default
route, orchagent sends "add" black hole route to the syncd. So the ASIC
DB gets n hset message. When this happens during warm restart, the unit
test identifies this as unwanted setting and the unit test fails. To fix
this issues, the route delete is sent only if the warm restart is not in
progress. This is done following the same warm restart handling approach
used for route delete in other palces.

Signed-off-by: vedganes <veda.ganesan@nokia.com>
dprital added a commit to dprital/sonic-buildimage that referenced this pull request Jan 30, 2023
Update sonic-swss submodule pointer to include the following:
* a2a483d [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  ([sonic-net#2617](sonic-net/sonic-swss#2617))
* 9d1f66b [bfdorch] add local discriminator to state DB ([sonic-net#2629](sonic-net/sonic-swss#2629))
* c54b3d1 Vxlan tunnel endpoint custom monitoring APPL DB table. ([sonic-net#2589](sonic-net/sonic-swss#2589))
* 7f03db2 Fix potential risks ([sonic-net#2516](sonic-net/sonic-swss#2516))
* 383ee68 [refactor]Refactoring sai handle status ([sonic-net#2621](sonic-net/sonic-swss#2621))
* cd95972 Fix issue 13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL ([sonic-net#2619](sonic-net/sonic-swss#2619))
* a01470f Remove TODO comments that are no longer relevant ([sonic-net#2622](sonic-net/sonic-swss#2622))
* d058390 Changed the BFD default detect multiplier to 10x ([sonic-net#2614](sonic-net/sonic-swss#2614))
* d78b528 [MuxOrch] Enabling neighbor when adding in active state ([sonic-net#2601](sonic-net/sonic-swss#2601))
* 4ebdad1 [routesync] Fix for stale dynamic neighbor ([sonic-net#2553](sonic-net/sonic-swss#2553))
* 8857f92 Added new attributes for Vnet and Vxlan ecmp configurations. ([sonic-net#2584](sonic-net/sonic-swss#2584))
* b6bbc3e Revert [voq][chassis]Add show fabric counters port/queue commands (2522) ([sonic-net#2611](sonic-net/sonic-swss#2611))
* 52406e2 Add missing parameter to on_switch_shutdown_request method. ([sonic-net#2567](sonic-net/sonic-swss#2567))
* 4ac9ad9 Increase diff coverage to 80% ([sonic-net#2599](sonic-net/sonic-swss#2599))
* 8a0bb36 Handle Mac address 'none' ([sonic-net#2593](sonic-net/sonic-swss#2593))
* f496ab3 [vstest] Only collect stdout of orchagent_restart_check in vstest ([sonic-net#2597](sonic-net/sonic-swss#2597))
* 1dab495 Avoid aborting orchagent when setting TUNNEL attributes ([sonic-net#2591](sonic-net/sonic-swss#2591))
* 4395cea Fix neighbor doesn't update all attribute ([sonic-net#2577](sonic-net/sonic-swss#2577))

Signed-off-by: dprital <drorp@nvidia.com>
liat-grozovik pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Feb 1, 2023
Update sonic-swss submodule pointer to include the following:
* a2a483d [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  ([#2617](sonic-net/sonic-swss#2617))
* 9d1f66b [bfdorch] add local discriminator to state DB ([#2629](sonic-net/sonic-swss#2629))
* c54b3d1 Vxlan tunnel endpoint custom monitoring APPL DB table. ([#2589](sonic-net/sonic-swss#2589))
* 7f03db2 Fix potential risks ([#2516](sonic-net/sonic-swss#2516))
* 383ee68 [refactor]Refactoring sai handle status ([#2621](sonic-net/sonic-swss#2621))
* cd95972 Fix issue 13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL ([#2619](sonic-net/sonic-swss#2619))
* a01470f Remove TODO comments that are no longer relevant ([#2622](sonic-net/sonic-swss#2622))
* d058390 Changed the BFD default detect multiplier to 10x ([#2614](sonic-net/sonic-swss#2614))
* d78b528 [MuxOrch] Enabling neighbor when adding in active state ([#2601](sonic-net/sonic-swss#2601))
* 4ebdad1 [routesync] Fix for stale dynamic neighbor ([#2553](sonic-net/sonic-swss#2553))
* 8857f92 Added new attributes for Vnet and Vxlan ecmp configurations. ([#2584](sonic-net/sonic-swss#2584))
* b6bbc3e Revert [voq][chassis]Add show fabric counters port/queue commands (2522) ([#2611](sonic-net/sonic-swss#2611))
* 52406e2 Add missing parameter to on_switch_shutdown_request method. ([#2567](sonic-net/sonic-swss#2567))
* 4ac9ad9 Increase diff coverage to 80% ([#2599](sonic-net/sonic-swss#2599))
* 8a0bb36 Handle Mac address 'none' ([#2593](sonic-net/sonic-swss#2593))
* f496ab3 [vstest] Only collect stdout of orchagent_restart_check in vstest ([#2597](sonic-net/sonic-swss#2597))
* 1dab495 Avoid aborting orchagent when setting TUNNEL attributes ([#2591](sonic-net/sonic-swss#2591))
* 4395cea Fix neighbor doesn't update all attribute ([#2577](sonic-net/sonic-swss#2577))

Signed-off-by: dprital <drorp@nvidia.com>
prsunny pushed a commit that referenced this pull request Feb 16, 2023
*Merge remote-tracking branch 'upstream/master' into dash (#2663)
* Modify coppmgr mergeConfig to support preserving copp tables through reboot. (#2548)
* Avoid aborting orchagent when setting TUNNEL attributes (#2591)
* Handle Mac address 'none' (#2593)
* Increase diff coverage to 80% (#2599)
* Add missing parameter to on_switch_shutdown_request method. (#2567)
* Add ZMQ based ProducerStateTable and CustomerStateTable.
* Revert "[voq][chassis]Add show fabric counters port/queue commands (#2522)" (#2611)
* Added new attributes for Vnet and Vxlan ecmp configurations. (#2584)
* added support for monitoring, primary and adv_prefix and overlay_dmac.
* [routesync] Fix for stale dynamic neighbor (#2553)
* [MuxOrch] Enabling neighbor when adding in active state (#2601)
* Changed the BFD default detect multiplier to 10x (#2614)
* Remove TODO comments that are no longer relevant (#2622)
* Fix issue #13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL (#2619)
* [refactor]Refactoring sai handle status (#2621)
* Vxlan tunnel endpoint custom monitoring APPL DB table. (#2589)
* added support for monitoring, primary and adv_prefix. changed filter_mac to overlay_dmac
* Data Structures and code to write APP_DB VNET_MONITOR table entries for custom monitoring of Vxlan tunnel endpoints.
* [bfdorch] add local discriminator to state DB (#2629)
* [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  (#2617)
* [voq][chassis] Remove created ports from the default vlan. (#2607)
* [EVPN]Handling race condition when remote VNI arrives before tunnel map entry (#2642)
*Added check in remote VNI add to ensure vxlan tunnel map is created before adding the remote end point.
* [test_mux] add sleep in test_NH (#2648)
* [autoneg]Fixing adv interface types to be set when AN is disabled (#2638)
* [hash]: Add UT infra. (#2660)
*Added UT infra for Generic Hash feature
*Aligned PBH tests with Generic Hash UT infra
* [sai_failure_dump]Invoking dump during SAI failure (#2644)
* [ResponsePublisher] add pipeline support  (#2511)
* [dash] Fix compilation issue caused by missing include.
@dprital
Copy link
Collaborator

dprital commented May 19, 2023

@vganesan-nokia , I notived that this PR is not included in 202211 branch so I added a label for it. @StormLiangMS , can you please cherry pick it to 202211 ?

StormLiangMS pushed a commit that referenced this pull request May 19, 2023
* [routesync] Fix for stale dynamic neighbor

The changes are for fixing stale neighbor in the ASIC_DB and data path
when eBGP neighbors are shutdown and neighbors are flushed. The problem
is described in issue: sonic-net/sonic-buildimage#12442
The root cause of this issue is due to not deleing the route from the
ASIC_DB when the route's next hop is on eth0 or docker0 interface. The solution is
to delete the route entry from ASIC_DB instead of just returning when the route's
next hop is on the interface eth0 or docker0


This commit fixes the warm restart unit test failure. When the route with
only nh on eth0 or docker0 is removed and if the route is the default
route, orchagent sends "add" black hole route to the syncd. So the ASIC
DB gets n hset message. When this happens during warm restart, the unit
test identifies this as unwanted setting and the unit test fails. To fix
this issues, the route delete is sent only if the warm restart is not in
progress. This is done following the same warm restart handling approach
used for route delete in other palces.

Signed-off-by: vedganes <veda.ganesan@nokia.com>
arfeigin added a commit to arfeigin/sonic-swss that referenced this pull request May 22, 2023
theasianpianist pushed a commit to theasianpianist/sonic-swss that referenced this pull request Jul 21, 2023
)

*Merge remote-tracking branch 'upstream/master' into dash (sonic-net#2663)
* Modify coppmgr mergeConfig to support preserving copp tables through reboot. (sonic-net#2548)
* Avoid aborting orchagent when setting TUNNEL attributes (sonic-net#2591)
* Handle Mac address 'none' (sonic-net#2593)
* Increase diff coverage to 80% (sonic-net#2599)
* Add missing parameter to on_switch_shutdown_request method. (sonic-net#2567)
* Add ZMQ based ProducerStateTable and CustomerStateTable.
* Revert "[voq][chassis]Add show fabric counters port/queue commands (sonic-net#2522)" (sonic-net#2611)
* Added new attributes for Vnet and Vxlan ecmp configurations. (sonic-net#2584)
* added support for monitoring, primary and adv_prefix and overlay_dmac.
* [routesync] Fix for stale dynamic neighbor (sonic-net#2553)
* [MuxOrch] Enabling neighbor when adding in active state (sonic-net#2601)
* Changed the BFD default detect multiplier to 10x (sonic-net#2614)
* Remove TODO comments that are no longer relevant (sonic-net#2622)
* Fix issue #13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL (sonic-net#2619)
* [refactor]Refactoring sai handle status (sonic-net#2621)
* Vxlan tunnel endpoint custom monitoring APPL DB table. (sonic-net#2589)
* added support for monitoring, primary and adv_prefix. changed filter_mac to overlay_dmac
* Data Structures and code to write APP_DB VNET_MONITOR table entries for custom monitoring of Vxlan tunnel endpoints.
* [bfdorch] add local discriminator to state DB (sonic-net#2629)
* [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  (sonic-net#2617)
* [voq][chassis] Remove created ports from the default vlan. (sonic-net#2607)
* [EVPN]Handling race condition when remote VNI arrives before tunnel map entry (sonic-net#2642)
*Added check in remote VNI add to ensure vxlan tunnel map is created before adding the remote end point.
* [test_mux] add sleep in test_NH (sonic-net#2648)
* [autoneg]Fixing adv interface types to be set when AN is disabled (sonic-net#2638)
* [hash]: Add UT infra. (sonic-net#2660)
*Added UT infra for Generic Hash feature
*Aligned PBH tests with Generic Hash UT infra
* [sai_failure_dump]Invoking dump during SAI failure (sonic-net#2644)
* [ResponsePublisher] add pipeline support  (sonic-net#2511)
* [dash] Fix compilation issue caused by missing include.
theasianpianist pushed a commit to theasianpianist/sonic-swss that referenced this pull request Jul 25, 2023
)

*Merge remote-tracking branch 'upstream/master' into dash (sonic-net#2663)
* Modify coppmgr mergeConfig to support preserving copp tables through reboot. (sonic-net#2548)
* Avoid aborting orchagent when setting TUNNEL attributes (sonic-net#2591)
* Handle Mac address 'none' (sonic-net#2593)
* Increase diff coverage to 80% (sonic-net#2599)
* Add missing parameter to on_switch_shutdown_request method. (sonic-net#2567)
* Add ZMQ based ProducerStateTable and CustomerStateTable.
* Revert "[voq][chassis]Add show fabric counters port/queue commands (sonic-net#2522)" (sonic-net#2611)
* Added new attributes for Vnet and Vxlan ecmp configurations. (sonic-net#2584)
* added support for monitoring, primary and adv_prefix and overlay_dmac.
* [routesync] Fix for stale dynamic neighbor (sonic-net#2553)
* [MuxOrch] Enabling neighbor when adding in active state (sonic-net#2601)
* Changed the BFD default detect multiplier to 10x (sonic-net#2614)
* Remove TODO comments that are no longer relevant (sonic-net#2622)
* Fix issue #13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL (sonic-net#2619)
* [refactor]Refactoring sai handle status (sonic-net#2621)
* Vxlan tunnel endpoint custom monitoring APPL DB table. (sonic-net#2589)
* added support for monitoring, primary and adv_prefix. changed filter_mac to overlay_dmac
* Data Structures and code to write APP_DB VNET_MONITOR table entries for custom monitoring of Vxlan tunnel endpoints.
* [bfdorch] add local discriminator to state DB (sonic-net#2629)
* [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  (sonic-net#2617)
* [voq][chassis] Remove created ports from the default vlan. (sonic-net#2607)
* [EVPN]Handling race condition when remote VNI arrives before tunnel map entry (sonic-net#2642)
*Added check in remote VNI add to ensure vxlan tunnel map is created before adding the remote end point.
* [test_mux] add sleep in test_NH (sonic-net#2648)
* [autoneg]Fixing adv interface types to be set when AN is disabled (sonic-net#2638)
* [hash]: Add UT infra. (sonic-net#2660)
*Added UT infra for Generic Hash feature
*Aligned PBH tests with Generic Hash UT infra
* [sai_failure_dump]Invoking dump during SAI failure (sonic-net#2644)
* [ResponsePublisher] add pipeline support  (sonic-net#2511)
* [dash] Fix compilation issue caused by missing include.
theasianpianist pushed a commit to theasianpianist/sonic-swss that referenced this pull request Jul 25, 2023
)

*Merge remote-tracking branch 'upstream/master' into dash (sonic-net#2663)
* Modify coppmgr mergeConfig to support preserving copp tables through reboot. (sonic-net#2548)
* Avoid aborting orchagent when setting TUNNEL attributes (sonic-net#2591)
* Handle Mac address 'none' (sonic-net#2593)
* Increase diff coverage to 80% (sonic-net#2599)
* Add missing parameter to on_switch_shutdown_request method. (sonic-net#2567)
* Add ZMQ based ProducerStateTable and CustomerStateTable.
* Revert "[voq][chassis]Add show fabric counters port/queue commands (sonic-net#2522)" (sonic-net#2611)
* Added new attributes for Vnet and Vxlan ecmp configurations. (sonic-net#2584)
* added support for monitoring, primary and adv_prefix and overlay_dmac.
* [routesync] Fix for stale dynamic neighbor (sonic-net#2553)
* [MuxOrch] Enabling neighbor when adding in active state (sonic-net#2601)
* Changed the BFD default detect multiplier to 10x (sonic-net#2614)
* Remove TODO comments that are no longer relevant (sonic-net#2622)
* Fix issue #13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL (sonic-net#2619)
* [refactor]Refactoring sai handle status (sonic-net#2621)
* Vxlan tunnel endpoint custom monitoring APPL DB table. (sonic-net#2589)
* added support for monitoring, primary and adv_prefix. changed filter_mac to overlay_dmac
* Data Structures and code to write APP_DB VNET_MONITOR table entries for custom monitoring of Vxlan tunnel endpoints.
* [bfdorch] add local discriminator to state DB (sonic-net#2629)
* [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  (sonic-net#2617)
* [voq][chassis] Remove created ports from the default vlan. (sonic-net#2607)
* [EVPN]Handling race condition when remote VNI arrives before tunnel map entry (sonic-net#2642)
*Added check in remote VNI add to ensure vxlan tunnel map is created before adding the remote end point.
* [test_mux] add sleep in test_NH (sonic-net#2648)
* [autoneg]Fixing adv interface types to be set when AN is disabled (sonic-net#2638)
* [hash]: Add UT infra. (sonic-net#2660)
*Added UT infra for Generic Hash feature
*Aligned PBH tests with Generic Hash UT infra
* [sai_failure_dump]Invoking dump during SAI failure (sonic-net#2644)
* [ResponsePublisher] add pipeline support  (sonic-net#2511)
* [dash] Fix compilation issue caused by missing include.
// In this case since we do not want the route with next hop on eth0/docker0, we return.
// But still we need to clear the route from the APPL_DB. Otherwise the APPL_DB and data
// path will be left with stale route entry
if(alsv.size() == 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the context, it seems for del msg, should we check it by
if((nlmsg_type == RTM_DELROUTE) && (alsv.size() == 1))
?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia , do you want to address the comment?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even it is a RTM_NEWROUTE on eth0/docker0, it may run into

if (!warmRestartInProgress)
{
...
m_routeTable.del(destipprefix)
}

then the route is deleted from APPL_DB and ASIC_DB ?
It happened on my real device with default route (0.0.0.0/0) is removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia , do you want to address the comment?

Yes. I'll make a PR to fix this.

vganesan-nokia added a commit to vganesan-nokia/sonic-swss that referenced this pull request Jan 5, 2024
Changes done to stale neighbor fix PR#2553 (sonic-net#2553)
Changes include adding check to delete the stale neighbor from ASIC_DB
and APPL_DB only if the kernel command is RTM_DELROUTE

Signed-off-by: vedganes <veda.ganesan@nokia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

10 participants