-
Notifications
You must be signed in to change notification settings - Fork 667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backup STATE_DB PORT_TABLE|Ethernet during warm-reboot #3111
Backup STATE_DB PORT_TABLE|Ethernet during warm-reboot #3111
Conversation
Signed-off-by: Mihir Patel <patelmi@microsoft.com>
…eting unwanted fields during warm-reboot
if not string.match(k, 'FDB_TABLE|') and not string.match(k, 'WARM_RESTART_TABLE|') \ | ||
if string.match(k, 'PORT_TABLE|Ethernet') then | ||
for i, f in ipairs(redis.call('hgetall', k)) do | ||
if i % 2 == 1 then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not clear on what this check looking for - if i % 2 == 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vaibhavhd - This logic will help in selecting the field from FieldValue Pair.
Since the getall
command will return a flat list of field followed by its corresponding value, the above logic will help in selecting the field and delete the corresponding FieldValue Pair
Below snippet has a total of 11 FieldValue pairs.
root@sonic:/home/admin# redis-cli -n 6 hgetall "PORT_TABLE|Ethernet0"
1) "state"
2) "ok"
3) "netdev_oper_status"
4) "up"
5) "admin_status"
6) "up"
7) "mtu"
8) "9100"
9) "CMIS_REINIT_REQUIRED"
10) "false"
11) "NPU_SI_SETTINGS_SYNC_STATUS"
12) "NPU_SI_SETTINGS_DEFAULT"
13) "supported_speeds"
14) "40000,100000"
15) "supported_fecs"
16) "none,rs"
17) "host_tx_ready"
18) "true"
19) "speed"
20) "100000"
21) "fec"
22) "N/A"
root@sonic:/home/admin#
@@ -247,7 +247,17 @@ function backup_database() | |||
# Delete keys in stateDB except FDB_TABLE|*, MIRROR_SESSION_TABLE|*, WARM_RESTART_ENABLE_TABLE|*, FG_ROUTE_TABLE|* | |||
sonic-db-cli STATE_DB eval " | |||
for _, k in ipairs(redis.call('keys', '*')) do | |||
if not string.match(k, 'FDB_TABLE|') and not string.match(k, 'WARM_RESTART_TABLE|') \ | |||
if string.match(k, 'PORT_TABLE|Ethernet') then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zjswhhh do you think keeping port_table intact in state-db will cause any change in mux init logic for dualtor warmboot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it shouldn't.
@@ -247,7 +247,17 @@ function backup_database() | |||
# Delete keys in stateDB except FDB_TABLE|*, MIRROR_SESSION_TABLE|*, WARM_RESTART_ENABLE_TABLE|*, FG_ROUTE_TABLE|* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the comment accordingly.
if not string.match(f, 'host_tx_ready') \ | ||
and not string.match(f, 'NPU_SI_SETTINGS_SYNC_STATUS') \ | ||
and not string.match(f, 'CMIS_REINIT_REQUIRED') then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- From the PR's description it sounds like you need only
host_tx_ready
field. What is the reason for also persisting NPU and CMIS fields? Please add the reasoning as a comment here too. - Have you considered cross branch and in-branch warm-reboots where the 3 fields might get modified (deleted, name change, etc?) in the target image? How will the target image handle a scenario when state db has some filed which is not supported. This question arises from the fact that you are not preserving entire table, but just 3 fields from it. Existing cases preserved entire table.
To prevent us from getting into any unlikely scenarios in # 2 above, this handling can be done by the target image's portorch. In other words, when the device boots into target image, the portsorch initializes these fields afresh after checking that system is undergoing warmboot by checking system warm-restart flag. The downside in this approach is that portorch will have no idea what these fields were set to in the base image.
The argument is same for other fields such as OPER state (netdev_oper_status) and speeds. In the recovery path of warmboot these fields are set afresh based on SAI get calls (I think)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vaibhavhd you may want to review this https://github.com/sonic-net/sonic-utilities/pull/3240/files
Just curious, what was the reason to not back up the entire table? Is it because some of the fields (e.g. |
@longhuan-cisco - Yes, you are correct. Hence, we decided to preserve selected fields which xcvrd/OA cares about and delete other fields from STATE_DB. |
As discussed, I tested the change from this PR, @mihirpat1 @prgeor Could you please continue on this PR for the remaining?
|
@StormLiangMS @yxieca @bingwang-ms please cherry pick this to 202311. Need for warm reboot support for platforms using CMIS optics |
* Backup STATE_DB PORT_TABLE during warm-reboot Signed-off-by: Mihir Patel <patelmi@microsoft.com> * Backing up selected fields from STATE_DB PORT_TABLE|Ethernet* and deleting unwanted fields during warm-reboot --------- Signed-off-by: Mihir Patel <patelmi@microsoft.com>
Cherry-pick PR to 202311: #3352 |
* Backup STATE_DB PORT_TABLE during warm-reboot Signed-off-by: Mihir Patel <patelmi@microsoft.com> * Backing up selected fields from STATE_DB PORT_TABLE|Ethernet* and deleting unwanted fields during warm-reboot --------- Signed-off-by: Mihir Patel <patelmi@microsoft.com>
* Backup STATE_DB PORT_TABLE during warm-reboot Signed-off-by: Mihir Patel <patelmi@microsoft.com> * Backing up selected fields from STATE_DB PORT_TABLE|Ethernet* and deleting unwanted fields during warm-reboot --------- Signed-off-by: Mihir Patel <patelmi@microsoft.com>
@bingwang-ms we need this in 202405 |
@prgeor Seems there is cherry-pick conflict. Please double check |
@bingwang-ms I have removed the 202405 tags since this is already part of 202405. |
What I did
Currently, entire PORT_TABLE in STATE_DB is being deleted during warm-reboot. Due to this,
host_tx_ready
changes to false after warm-reboot which causes the link to remain down.How I did it
Backing up
host_tx_ready
,NPU_SI_SETTINGS_SYNC_STATUS
andCMIS_REINIT_REQUIRED
fields from `STATE_DB PORT_TABLE* during warm-reboot now.How to verify it
Verified that host_tx_ready in STATE_DB PORT_TABLE is retained after warm-reboot and the link remains up. Also, ensured that the keys CMIS_REINIT_REQUIRED and NPU_SI_SETTINGS_SYNC_STATUS are retained after warm-reboot.
Before warm-reboot
After warm-reboot script backs up PORT_TABLE and deletes unwanted fields
After switch boot-up post warm-reboot
Previous command output (if the output of a command-line utility has changed)
New command output (if the output of a command-line utility has changed)