Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore neighbor table to kernel during system warm-reboot #2213

Merged
merged 3 commits into from
Nov 10, 2018

Conversation

zhenggen-xu
Copy link
Collaborator

Restore neighbor table to kernel during system warm-reboot

Added a service: "restore_neighbors" to restore neighbor table into
kernel during system warm reboot. The service is started by supervisord
in swss docker when the docker is started.

In case system warm reboot is enabled, it will try to restore the neighbor
table from appDB into kernel through netlink API calls and update the neighbor
table by sending arp/ns requests to all neighbor entries, then it sets the
stateDB flag for neighsyncd to continue the reconciliation process.

-- Added tcpdump python-scapy debian package into orchagent and vs dockers.
-- Added python module: pyroute2 netifaces into orchagent and vc dockers.
-- Workarounded tcpdump issue in the vs docker

Signed-off-by: Zhenggen Xu zxu@linkedin.com

- What I did
Restore neighbor table to kernel during system warm-reboot

- How I did it
Added a service: "restore_neighbors" to restore neighbor table into
kernel during system warm reboot. The service is started by supervisord
in swss docker when the docker is started.

In case system warm reboot is enabled, it will try to restore the neighbor
table from appDB into kernel through netlink API calls and update the neighbor
table by sending arp/ns requests to all neighbor entries, then it sets the
stateDB flag for neighsyncd to continue the reconciliation process.

-- Added tcpdump python-scapy debian package into orchagent and vs dockers.
-- Added python module: pyroute2 netifaces into orchagent and vc dockers.
-- Workarounded tcpdump issue in the vs docker

- How to verify it
1, vs test cases in sonic-swss
2, load the image and tried the restore_neighbors service manually, and it worked.

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

files/scripts/restore_neighbors.py Outdated Show resolved Hide resolved
files/scripts/restore_neighbors.py Outdated Show resolved Hide resolved
files/scripts/restore_neighbors.py Outdated Show resolved Hide resolved
files/scripts/restore_neighbors.py Outdated Show resolved Hide resolved
files/scripts/restore_neighbors.py Outdated Show resolved Hide resolved
files/scripts/restore_neighbors.py Outdated Show resolved Hide resolved
files/scripts/restore_neighbors.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@nikos-github nikos-github left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the reason why we would want to restore neighbor table to the kernel in a warm-restart?

@zhenggen-xu
Copy link
Collaborator Author

@nikos-github This is to support whole system warm reboot case, where we will reboot the kernel, and then the neighbor entries would be gone in kernel. We need restore them into kernel before neighsyncd kick off the reconciliation logic.
In case we don't do this, we may lose the active neighbor entries in appDB and ASICDB thus impact the traffic.

@lguohan
Copy link
Collaborator

lguohan commented Nov 1, 2018

I think we should put restore_neighbors.py in the sonic-swss repo and let debian to install it. in terms of function, it is part of neigborsyncd.

Copy link
Collaborator

@lguohan lguohan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as comments

@jipanyang
Copy link
Collaborator

jipanyang commented Nov 1, 2018

If the arp request/NS packet is being built for each neighbor and the neighbor is supposed to reply, in that case kernel will have up to date neighbor data.

Injecting old data into kernel seems unnecessary and causing problem here(stale entries) ?

Copy link
Collaborator

@nikos-github nikos-github left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhenggen-xu How are you guaranteeing that the entry you are pushing in, is not stale and not overriding a valid/newer one?

Copy link
Collaborator

@nikos-github nikos-github left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a more correct approach to consider is to arp for the neighbors in the DB which are not in the kernel already. You can compare with what you have with what arp puts into the kernel and for anything that you have in the DB but not in the kernel, you can insert/complement and let it age out. Any entry in the kernel that's also in the DB, is guaranteed to be valid and newer against what the DB has.

@zhenggen-xu
Copy link
Collaborator Author

If the arp request/NS packet is being built for each neighbor and the neighbor is supposed to reply, in that case kernel will have up to date neighbor data.

Injecting old data into kernel seems unnecessary and causing problem here(stale entries) ?

There were performance issues where we need wait for the neighbors to reply. So the approach here is that we will keep all updated neighbor entry, and still insert the entries that is not updated to make sure we don't remove them from HW due to the un-replied messages and cause traffic lose, those entries will be expired based on Linux timer.

@zhenggen-xu
Copy link
Collaborator Author

zhenggen-xu commented Nov 6, 2018

I think a more correct approach to consider is to arp for the neighbors in the DB which are not in the kernel already. You can compare with what you have with what arp puts into the kernel and for anything that you have in the DB but not in the kernel, you can insert/complement and let it age out. Any entry in the kernel that's also in the DB, is guaranteed to be valid and newer against what the DB has.

@nikos-github To arp for all neighbors in DB should not cause problems as long as we don't overwrite the new entries with old state. The code is updated so we would not write the old entry to kernel if exists in kernel. Code is moving to sonic-net/sonic-swss#661

@jipanyang
Copy link
Collaborator

There were performance issues where we need wait for the neighbors to reply. So the approach here is that we will keep all updated neighbor entry, and still insert the entries that is not updated to make sure we don't remove them from HW due to the un-replied messages and cause traffic lose, those entries will be expired based on Linux timer.

Could you share the performance data? I thought with scapy replacing arping the performance is not problem any more.

@zhenggen-xu
Copy link
Collaborator Author

I think we should put restore_neighbors.py in the sonic-swss repo and let debian to install it. in terms of function, it is part of neigborsyncd.

Done

@zhenggen-xu
Copy link
Collaborator Author

There were performance issues where we need wait for the neighbors to reply. So the approach here is that we will keep all updated neighbor entry, and still insert the entries that is not updated to make sure we don't remove them from HW due to the un-replied messages and cause traffic lose, those entries will be expired based on Linux timer.

Could you share the performance data? I thought with scapy replacing arping the performance is not problem any more.

The performance is related to receive side after using scapy. If arp/nd entries didn't get back in time, it is not safe to assume it is gone, we may have to repeat to send arp/ns packets and check responses. Anyway, I think we can document this as an improvement bug, we can tackle it later.

Added a service: "restore_neighbors" to restore neighbor table into
kernel during system warm reboot. The service is started by supervisord
in swss docker when the docker is started.

In case system warm reboot is enabled, it will try to restore the neighbor
table from appDB into kernel through netlink API calls and update the neighbor
table by sending arp/ns requests to all neighbor entries, then it sets the
stateDB flag for neighsyncd to continue the reconciliation process.

-- Added tcpdump python-scapy debian package into orchagent and vs dockers.
-- Added python module: pyroute2 netifaces into orchagent and vc dockers.
-- Workarounded tcpdump issue in the vs docker

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
Made changes to makefiles accordingly

Make dockerfile.j2 changes and supervisord config changes

Add python monotonic lib for time access

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
@lguohan lguohan merged commit 51a7661 into sonic-net:master Nov 10, 2018
@zhenggen-xu zhenggen-xu deleted the restore-neigh branch June 7, 2019 19:32
yxieca added a commit to yxieca/sonic-buildimage that referenced this pull request Jul 28, 2022
…rm-common] advance submodule head

linkmgrd:
* e0fe1d4 2022-07-27 | TSA enhancement (sonic-net#98) (HEAD -> 202205) [Jing Zhang]

utilities:
* 7d7e15e 2022-07-18 | [vnet_route_check] Align DB data parse logic with format used by swsscommon API (sonic-net#2268) (HEAD -> 202205) [Volodymyr Samotiy]
* b3d5d18 2022-07-20 | [MultiAsic] sudo reboot command doesn't gracefully stop Asic syncd# on multiasic platform (sonic-net#2258) [Marty Y. Lok]
* 504ebe6 2022-07-08 | Add 'traffic_shift_away' option to config load_minigraph (sonic-net#2240) [tjchadaga]
* 4079e4a 2022-06-20 | Gives cisco-8000 more flexibility to easily add subcommnads under show platform (sonic-net#2213) [Nathan Cohen]
* 46443c6 2022-07-27 | Update db_migrator to support `PORT_QOS_MAP|global` (sonic-net#2205) [bingwang-ms]
* d7fbdd6 2022-05-26 | fix for non-coherent cmis modules (sonic-net#2163) [qinchuanares]
* 79b4439 2022-06-24 | [sfpshow/sfputil] Enhance sfpshow and sfputil to behavior correctly on RJ45 ports (sonic-net#2111) [Kebo Liu]

swss:
* 275f311 2022-07-26 | [DualToR] Handle race condition between tunnel_decap and mux orchestrator (sonic-net#2397) (HEAD -> 202205) [Devesh Pathak]
* 47586e8 2022-07-22 | [EVPN]Fix missing Vlan member update notification in P2MP scenario (sonic-net#2388) [Sudharsan Dhamal Gopalarathnam]
* 7d5c73f 2022-07-19 | [macsecmgr]: Fix cleanup macsec objs if container stop (sonic-net#2376) [Ze Gan]
* c03996f 2022-07-17 | [orchagent]: Enhance initSaiPhyApi (sonic-net#2367) [andywongarista]
* 57890d7 2022-07-27 | Fix for remote system interface not getting created (sonic-net#2364) [skeesara-nokia]
* 1a93ec9 2022-07-13 | Orchagent changes for synchronizing npu/phy device Tx in the data path before enabling transceiver<CMIS compliant> Tx. (sonic-net#2277) [jaganbal-a]

sairedis:
* a4903be 2022-07-20 | Update PN with XPN support (sonic-net#1081) (HEAD -> 202205, github/202205) [Ze Gan]
* 2cb5671 2022-07-27 | Add SAI_OBJECT_TYPE_TUNNEL object to break-before-make list (sonic-net#1075) [Vaibhav Hemant Dixit]

platform-daemon:
* 901c6a1 2022-06-28 | [CMIS]Improved 400G link bring up sequence (sonic-net#254) (HEAD -> 202205) [Prince George]

platform-common:
* f223b3f 2022-07-09 | Support get_port_or_cage_type (sonic-net#288) (HEAD -> 202205) [Stephen Sun]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
yxieca added a commit that referenced this pull request Jul 29, 2022
…rm-common] advance submodule head (#11578)

linkmgrd:
* e0fe1d4 2022-07-27 | TSA enhancement (#98) (HEAD -> 202205) [Jing Zhang]

utilities:
* 7d7e15e 2022-07-18 | [vnet_route_check] Align DB data parse logic with format used by swsscommon API (#2268) (HEAD -> 202205) [Volodymyr Samotiy]
* b3d5d18 2022-07-20 | [MultiAsic] sudo reboot command doesn't gracefully stop Asic syncd# on multiasic platform (#2258) [Marty Y. Lok]
* 504ebe6 2022-07-08 | Add 'traffic_shift_away' option to config load_minigraph (#2240) [tjchadaga]
* 4079e4a 2022-06-20 | Gives cisco-8000 more flexibility to easily add subcommnads under show platform (#2213) [Nathan Cohen]
* 46443c6 2022-07-27 | Update db_migrator to support `PORT_QOS_MAP|global` (#2205) [bingwang-ms]
* d7fbdd6 2022-05-26 | fix for non-coherent cmis modules (#2163) [qinchuanares]
* 79b4439 2022-06-24 | [sfpshow/sfputil] Enhance sfpshow and sfputil to behavior correctly on RJ45 ports (#2111) [Kebo Liu]

swss:
* 275f311 2022-07-26 | [DualToR] Handle race condition between tunnel_decap and mux orchestrator (#2397) (HEAD -> 202205) [Devesh Pathak]
* 47586e8 2022-07-22 | [EVPN]Fix missing Vlan member update notification in P2MP scenario (#2388) [Sudharsan Dhamal Gopalarathnam]
* 7d5c73f 2022-07-19 | [macsecmgr]: Fix cleanup macsec objs if container stop (#2376) [Ze Gan]
* c03996f 2022-07-17 | [orchagent]: Enhance initSaiPhyApi (#2367) [andywongarista]
* 57890d7 2022-07-27 | Fix for remote system interface not getting created (#2364) [skeesara-nokia]
* 1a93ec9 2022-07-13 | Orchagent changes for synchronizing npu/phy device Tx in the data path before enabling transceiver<CMIS compliant> Tx. (#2277) [jaganbal-a]

sairedis:
* a4903be 2022-07-20 | Update PN with XPN support (#1081) (HEAD -> 202205, github/202205) [Ze Gan]
* 2cb5671 2022-07-27 | Add SAI_OBJECT_TYPE_TUNNEL object to break-before-make list (#1075) [Vaibhav Hemant Dixit]

platform-daemon:
* 901c6a1 2022-06-28 | [CMIS]Improved 400G link bring up sequence (#254) (HEAD -> 202205) [Prince George]

platform-common:
* f223b3f 2022-07-09 | Support get_port_or_cage_type (#288) (HEAD -> 202205) [Stephen Sun]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
skbarista pushed a commit to skbarista/sonic-buildimage that referenced this pull request Aug 17, 2022
…rm-common] advance submodule head (sonic-net#11578)

linkmgrd:
* e0fe1d4 2022-07-27 | TSA enhancement (sonic-net#98) (HEAD -> 202205) [Jing Zhang]

utilities:
* 7d7e15e 2022-07-18 | [vnet_route_check] Align DB data parse logic with format used by swsscommon API (sonic-net#2268) (HEAD -> 202205) [Volodymyr Samotiy]
* b3d5d18 2022-07-20 | [MultiAsic] sudo reboot command doesn't gracefully stop Asic syncd# on multiasic platform (sonic-net#2258) [Marty Y. Lok]
* 504ebe6 2022-07-08 | Add 'traffic_shift_away' option to config load_minigraph (sonic-net#2240) [tjchadaga]
* 4079e4a 2022-06-20 | Gives cisco-8000 more flexibility to easily add subcommnads under show platform (sonic-net#2213) [Nathan Cohen]
* 46443c6 2022-07-27 | Update db_migrator to support `PORT_QOS_MAP|global` (sonic-net#2205) [bingwang-ms]
* d7fbdd6 2022-05-26 | fix for non-coherent cmis modules (sonic-net#2163) [qinchuanares]
* 79b4439 2022-06-24 | [sfpshow/sfputil] Enhance sfpshow and sfputil to behavior correctly on RJ45 ports (sonic-net#2111) [Kebo Liu]

swss:
* 275f311 2022-07-26 | [DualToR] Handle race condition between tunnel_decap and mux orchestrator (sonic-net#2397) (HEAD -> 202205) [Devesh Pathak]
* 47586e8 2022-07-22 | [EVPN]Fix missing Vlan member update notification in P2MP scenario (sonic-net#2388) [Sudharsan Dhamal Gopalarathnam]
* 7d5c73f 2022-07-19 | [macsecmgr]: Fix cleanup macsec objs if container stop (sonic-net#2376) [Ze Gan]
* c03996f 2022-07-17 | [orchagent]: Enhance initSaiPhyApi (sonic-net#2367) [andywongarista]
* 57890d7 2022-07-27 | Fix for remote system interface not getting created (sonic-net#2364) [skeesara-nokia]
* 1a93ec9 2022-07-13 | Orchagent changes for synchronizing npu/phy device Tx in the data path before enabling transceiver<CMIS compliant> Tx. (sonic-net#2277) [jaganbal-a]

sairedis:
* a4903be 2022-07-20 | Update PN with XPN support (sonic-net#1081) (HEAD -> 202205, github/202205) [Ze Gan]
* 2cb5671 2022-07-27 | Add SAI_OBJECT_TYPE_TUNNEL object to break-before-make list (sonic-net#1075) [Vaibhav Hemant Dixit]

platform-daemon:
* 901c6a1 2022-06-28 | [CMIS]Improved 400G link bring up sequence (sonic-net#254) (HEAD -> 202205) [Prince George]

platform-common:
* f223b3f 2022-07-09 | Support get_port_or_cage_type (sonic-net#288) (HEAD -> 202205) [Stephen Sun]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants