Update SonicDBConfig calls for centralize_database to use C++ APIs #1441

Merged
merged 1 commit into from
Feb 22, 2021

vaibhavhd

What I did

Fixes: sonic-net/sonic-buildimage#6811

The DB backup during warmboot has started failing recently after the changes made in commit (2e1f354#diff-5777288143b5a8ff1e97be074d748fd825f0ee1812f2d7c65d850c893bccefd6)

That change set out to deprecate the SonicDBConfig methods originally implemented in Python in https://github.com/Azure/sonic-py-swsssdk/. The new implementation is based on the hiredis C++ library.

How I did it

The centralize_database script still called the Python API method names instead of the C++ ones. This change updates the calls to the method names now defined in sonic-swss-common: https://github.com/Azure/sonic-swss-common/blob/master/common/dbconnector.h#L64
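
The rename described above can be illustrated with a small self-contained sketch. It does not import the real swsscommon module; `FakeSonicDBConfig`, its hard-coded database table, and the helper `target_endpoint` are hypothetical stand-ins. `getDbPort` appears in this PR's diff, while `getDbHostname` is assumed here by analogy as the counterpart of the old `get_hostname`.

```python
class FakeSonicDBConfig:
    """Hypothetical stand-in that exposes C++-style method names only.

    This is NOT the real SonicDBConfig class; the values are made up.
    """

    _DBS = {"APPL_DB": ("127.0.0.1", 6379)}

    @staticmethod
    def getDbHostname(dbname):
        # Assumed name, by analogy with getDbPort from the diff.
        return FakeSonicDBConfig._DBS[dbname][0]

    @staticmethod
    def getDbPort(dbname):
        return FakeSonicDBConfig._DBS[dbname][1]


def target_endpoint(config, target_dbname):
    """Return (host, port) for the target DB using the C++-style names.

    The old Python-style calls (config.get_hostname, config.get_port)
    do not exist on this stand-in, which mirrors the failure mode
    described in this PR.
    """
    host = config.getDbHostname(target_dbname)
    port = config.getDbPort(target_dbname)
    return host, port


if __name__ == "__main__":
    print(target_endpoint(FakeSonicDBConfig, "APPL_DB"))
```

Calling an old name such as `get_port` on the stand-in raises `AttributeError`, which is the kind of error a caller would hit if it kept using the Python-style names after the switch.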

How to verify it

With the changes, the warmboot now works without an error:

admin@str-s6100-acs-2:~$ sudo warm-reboot -vvv
Thu 18 Feb 2021 08:09:05 AM UTC Saving counters folder before warmboot...
Thu 18 Feb 2021 08:09:08 AM UTC Pausing orchagent ...
Thu 18 Feb 2021 08:09:08 AM UTC Collecting logs to check ssd health before warm-reboot...
Thu 18 Feb 2021 08:09:08 AM UTC Stopping nat ...
Dumping conntrack entries failed
Error response from daemon: Cannot kill container: nat: No such container: nat
Thu 18 Feb 2021 08:09:09 AM UTC Stopped nat ...
Thu 18 Feb 2021 08:09:09 AM UTC Stopping radv service...
Thu 18 Feb 2021 08:09:09 AM UTC Stopped radv service...
Thu 18 Feb 2021 08:09:09 AM UTC Stopping bgp ...
Thu 18 Feb 2021 08:09:15 AM UTC Stopped bgp ...
Thu 18 Feb 2021 08:09:15 AM UTC Stopping sflow ...
Thu 18 Feb 2021 08:09:15 AM UTC Stopped sflow ...
Thu 18 Feb 2021 08:09:18 AM UTC Stopping swss service ...
Thu 18 Feb 2021 08:09:19 AM UTC Stopped swss service ...
Thu 18 Feb 2021 08:09:19 AM UTC Initialize pre-shutdown ...
Thu 18 Feb 2021 08:09:19 AM UTC Requesting pre-shutdown ...
Thu 18 Feb 2021 08:09:20 AM UTC Waiting for pre-shutdown ...
Thu 18 Feb 2021 08:09:29 AM UTC Pre-shutdown succeeded ...
Thu 18 Feb 2021 08:09:29 AM UTC Backing up database ...
Thu 18 Feb 2021 08:09:30 AM UTC Stopping teamd ...
Thu 18 Feb 2021 08:09:37 AM UTC Stopped teamd ...
Thu 18 Feb 2021 08:09:37 AM UTC Stopping syncd ...
Thu 18 Feb 2021 08:09:49 AM UTC Stopped syncd ...
Thu 18 Feb 2021 08:09:49 AM UTC Stopping all remaining containers ...
Thu 18 Feb 2021 08:09:51 AM UTC Stopped all remaining containers ...
Thu 18 Feb 2021 08:09:53 AM UTC updating ssd fw forwarm-reboot
Thu 18 Feb 2021 08:09:53 AM UTC Enabling Watchdog before warm-reboot
Watchdog armed for 180 seconds
Thu 18 Feb 2021 08:09:54 AM UTC Running x86_64-dell_s6100_c2538-r0 specific plugin...
Thu 18 Feb 2021 08:09:54 AM UTC Rebooting with /sbin/kexec -e to SONiC-OS-HEAD.399-7790a74d ...

Also, the test_warm_reboot now passes on the devices:

----------------------------------------------------------- live log call ------------------------------------------------------------
08:26:28 INFO test_reboot.py:reboot_and_check:58: Run warm reboot on DUT
08:26:32 INFO reboot.py:reboot:124: waiting for ssh to drop
08:26:32 INFO reboot.py:execute_reboot_command:108: rebooting with command "warm-reboot"
08:27:30 INFO reboot.py:reboot:145: waiting for ssh to startup
08:27:40 INFO reboot.py:reboot:156: ssh has started up
08:27:40 INFO reboot.py:reboot:158: waiting for switch to initialize
08:27:40 INFO reboot.py:reboot:161: waiting for warmboot-finalizer service to become activating
08:27:53 INFO reboot.py:reboot:170: waiting for warmboot-finalizer service to finish
08:27:54 INFO reboot.py:reboot:172: warmboot finalizer service state activating
08:27:56 INFO reboot.py:reboot:176: warmboot finalizer service state activating
08:32:07 INFO reboot.py:reboot:176: warmboot finalizer service state activating
08:32:17 INFO reboot.py:reboot:176: warmboot finalizer service state activating
08:32:28 INFO reboot.py:reboot:176: warmboot finalizer service state activating
08:32:39 INFO reboot.py:reboot:176: warmboot finalizer service state activating
08:32:49 INFO reboot.py:reboot:176: warmboot finalizer service state activating
08:32:59 INFO reboot.py:reboot:176: warmboot finalizer service state activating
08:33:10 INFO reboot.py:reboot:176: warmboot finalizer service state inactive
08:33:20 INFO reboot.py:reboot:181: warmboot-finalizer service finished
08:33:20 INFO reboot.py:reboot:186: warm reboot finished
08:33:20 INFO reboot.py:reboot:189: DUT up since 2021-02-18 08:27:28
08:33:20 INFO test_reboot.py:check_interfaces_and_services:72: Wait until all critical services are fully started
08:33:20 INFO processes_utils.py:wait_critical_processes:46: Wait until all critical processes are healthy
08:33:20 INFO processes_utils.py:_all_critical_processes_healthy:22: Check critical processes status
08:33:49 INFO test_reboot.py:check_interfaces_and_services:76: Check reboot cause
08:33:49 INFO reboot.py:get_reboot_cause:198: Getting reboot cause from dut str-s6100-acs-2
08:33:51 INFO test_reboot.py:check_interfaces_and_services:84: Wait 300 seconds for all the transceivers to be detected
08:33:51 INFO interface_utils.py:get_port_map:117: Retrieving port mapping from DUT
08:33:54 INFO interface_utils.py:check_interface_status:44: Check interface status using cmd 'show interface'
08:36:11 INFO interface_utils.py:check_interface_status:77: Check interface status using the interface_facts module
08:36:13 INFO test_reboot.py:check_interfaces_and_services:89: Check transceiver status
08:36:13 INFO interface_utils.py:get_port_map:117: Retrieving port mapping from DUT
08:36:15 INFO transceiver_utils.py:check_transceiver_basic:61: Check whether transceiver information of all ports are in redis
08:36:16 INFO test_reboot.py:check_interfaces_and_services:96: Check pmon daemon status
08:36:17 INFO devices.py:get_pmon_daemon_states:636: Pmon daemon state list for this platform is {u'syseepromd': u'RUNNING', u'xcvrd': u'RUNNING', u'thermalctld': u'RUNNING', u'psud': u'RUNNING', u'supervisor-proc-exit-listener': u'RUNNING', u'pcied': u'RUNNING'}
PASSED                    

Previous command output (if the output of a command-line utility has changed)

New command output (if the output of a command-line utility has changed)

@vaibhavhd

Retest this please

@@ -6,19 +6,19 @@ import redis
 import argparse
 
 def centralize_to_target_db(target_dbname):
-    target_dbport = SonicDBConfig.get_port(target_dbname)
-    target_dbhost = SonicDBConfig.get_hostname(target_dbname)
+    target_dbport = SonicDBConfig.getDbPort(target_dbname)

getDbPort

Thanks for the fixes! They are valid. Another option is to use SWIG to translate the names through a single wrapper, as in https://github.com/Azure/sonic-swss-common/blob/9e91e0d891398b468b8087682ae91335791fac51/common/dbconnector.h#L76.

If the functions are used in multiple repos, the renaming will be better.

@vaibhavhd vaibhavhd Feb 18, 2021

I see, thanks for the info!
I was under the impression that we plan to deprecate the Python APIs and stick to the C++ library.
If that understanding is correct, shouldn't we move away from using SWIG wrappers?

> If the functions are used in multiple repos, the renaming will be better.

In that case, we should make changes similar to this PR in the other repos too, so that they use the C++ method names and rely less on translated Python names.

Thoughts?

It's a trade-off: Adding SWIG wrappers means more code to maintain, but allows us to align the naming convention with PEP8, where function/method names should be lowercase_with_underscores, whereas in C++ our function names are camel case.
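
The trade-off can be made concrete with a rough sketch. It uses plain Python delegation rather than SWIG itself, so it only illustrates the shape of the extra code a wrapper layer adds. `CppStyleConfig`, `Pep8Config`, and the returned port value are hypothetical; only the names `getDbPort` and `get_port` come from this PR.

```python
class CppStyleConfig:
    """Hypothetical stand-in for a binding that keeps C++ camel-case names."""

    @staticmethod
    def getDbPort(dbname):
        # Made-up value for illustration only.
        return 6379


class Pep8Config:
    """Thin wrapper exposing a PEP8-style name on top of the C++-style one.

    Every wrapped method is one more definition to maintain, which is
    the cost side of the trade-off.
    """

    @staticmethod
    def get_port(dbname):
        # Delegate to the camel-case implementation.
        return CppStyleConfig.getDbPort(dbname)


if __name__ == "__main__":
    # Both spellings resolve to the same underlying call.
    print(Pep8Config.get_port("APPL_DB") == CppStyleConfig.getDbPort("APPL_DB"))
```

With the wrapper in place, callers in other repos could keep the lowercase_with_underscores names, at the price of keeping the wrapper in sync whenever the C++ API changes.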

@vaibhavhd

@qiluo-msft, the failure seen in the tests seems to be unrelated; it has failed all the recent builds in https://sonic-jenkins.westus2.cloudapp.azure.com/job/common/job/sonic-utilities-build-pr

I have started a separate email thread for the failure.

Can we go ahead and merge this change?

@qiluo-msft

We need to wait for the build fix first, and merge this one after all build checkers pass.

@vaibhavhd

Retest this please

@vaibhavhd vaibhavhd merged commit d79d189 into sonic-net:master Feb 22, 2021
@vaibhavhd vaibhavhd deleted the warmboot-fix branch February 22, 2021 16:41
anand-kumar-subramanian pushed a commit to anand-kumar-subramanian/sonic-utilities that referenced this pull request Mar 2, 2021
The DB backup during warmboot started failing recently after the changes made to deprecate the SonicDBConfig methods originally implemented in Python in https://github.com/Azure/sonic-py-swsssdk/. The new implementation is based on the hiredis C++ library.

How I did it: The centralize_database script still used the Python API method names instead of the C++ ones. This change updates the calls to the method names now defined in sonic-swss-common.
With the new changes, the warm-boot goes ahead without DB save errors.

Successfully merging this pull request may close these issues.

Backing up database fails during warmboot