[FeatureD] Move the Feature related config from Hostcfgd into a new daemon #71

Merged
merged 6 commits into from
Jul 28, 2023


vivekrnv
Contributor

Description

Move feature related config from hostcfgd into a new daemon

tests/featured/featured_test.py::TestFeatureHandler::test_feature_config_parsing PASSED [ 46%]
tests/featured/featured_test.py::TestFeatureHandler::test_feature_config_parsing_defaults PASSED [ 47%]
tests/featured/featured_test.py::TestFeatureHandler::test_feature_resync PASSED [ 48%]
tests/featured/featured_test.py::TestFeatureHandler::test_handler_00_DualTorCase <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 49%]
tests/featured/featured_test.py::TestFeatureHandler::test_handler_01_SingleToRCase <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 50%]
tests/featured/featured_test.py::TestFeatureHandler::test_handler_02_T1Case <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 50%]
tests/featured/featured_test.py::TestFeatureHandler::test_handler_03_SingleToRCase_DHCP_Relay_Enabled <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 51%]
tests/featured/featured_test.py::TestFeatureHandler::test_handler_04_DualTorCaseWithNoSystemCalls <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 52%]
tests/featured/featured_test.py::TestFeatureHandler::test_handler_05_Chassis_Supervisor_PACKET <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 53%]
tests/featured/featured_test.py::TestFeatureHandler::test_handler_06_Chassis_Supervisor_VOQ <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 54%]
tests/featured/featured_test.py::TestFeatureHandler::test_handler_07_Chassis_LineCard_VOQ <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 55%]
tests/featured/featured_test.py::TestFeatureHandler::test_handler_08_Chassis_LineCard_Packet <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 56%]
tests/featured/featured_test.py::TestFeatureHandler::test_handler_09_Chassis_Supervisor_PACKET_multinpu <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 56%]
tests/featured/featured_test.py::TestFeatureHandler::test_handler_10_Chassis_LineCard_VOQ_multinpu <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 57%]
tests/featured/featured_test.py::TestFeatureHandler::test_sync_state_field_00_DualTorCase <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 58%]
tests/featured/featured_test.py::TestFeatureHandler::test_sync_state_field_01_SingleToRCase <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 59%]
tests/featured/featured_test.py::TestFeatureHandler::test_sync_state_field_02_T1Case <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 60%]
tests/featured/featured_test.py::TestFeatureHandler::test_sync_state_field_03_SingleToRCase_DHCP_Relay_Enabled <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 61%]
tests/featured/featured_test.py::TestFeatureHandler::test_sync_state_field_04_DualTorCaseWithNoSystemCalls <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 62%]
tests/featured/featured_test.py::TestFeatureHandler::test_sync_state_field_05_Chassis_Supervisor_PACKET <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 62%]
tests/featured/featured_test.py::TestFeatureHandler::test_sync_state_field_06_Chassis_Supervisor_VOQ <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 63%]
tests/featured/featured_test.py::TestFeatureHandler::test_sync_state_field_07_Chassis_LineCard_VOQ <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 64%]
tests/featured/featured_test.py::TestFeatureHandler::test_sync_state_field_08_Chassis_LineCard_Packet <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 65%]
tests/featured/featured_test.py::TestFeatureHandler::test_sync_state_field_09_Chassis_Supervisor_PACKET_multinpu <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 66%]
tests/featured/featured_test.py::TestFeatureHandler::test_sync_state_field_10_Chassis_LineCard_VOQ_multinpu <- ../../../usr/local/lib/python3.9/dist-packages/pyfakefs/fake_filesystem_unittest.py PASSED [ 67%]
tests/featured/featured_test.py::TestFeatureDaemon::test_advanced_reboot PASSED [ 68%]
tests/featured/featured_test.py::TestFeatureDaemon::test_delayed_service PASSED [ 68%]
tests/featured/featured_test.py::TestFeatureDaemon::test_feature_events PASSED [ 69%]
tests/featured/featured_test.py::TestFeatureDaemon::test_portinit_timeout PASSED [ 70%]

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
syslog.syslog(syslog.LOG_ERR, "error returned by select")
continue

fd = selectable_.getFd()
Contributor

@qiluo-msft qiluo-msft Jul 14, 2023

getFd

The old implementation does not use getFd(). Could you explain why you need it? Is it a new requirement? #Closed

Contributor Author

The old implementation was using ConfigDbConnector. Featured is using SubscriberStateTable.

I keep a map from each SubscriberStateTable's fd to its table, so that when select returns, I can use the fd to find the corresponding table and act accordingly.
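
For readers unfamiliar with the pattern, here is a minimal sketch of an fd-to-table map. It is not the code from this PR: it uses Python's standard `select` module and socket pairs as stand-ins for swsscommon's Select and SubscriberStateTable, and `FakeTable`, `dispatch_once`, and the table names are hypothetical.

```python
import select
import socket


class FakeTable:
    """Hypothetical stand-in for a SubscriberStateTable: wraps a readable fd."""

    def __init__(self, name):
        self.name = name
        # The reader end is what select() watches; the writer simulates DB events.
        self._reader, self._writer = socket.socketpair()

    def getFd(self):
        return self._reader.fileno()

    def publish(self, payload):
        self._writer.send(payload.encode())

    def pop(self):
        return self._reader.recv(4096).decode()


def dispatch_once(fd_to_table, timeout=1.0):
    """Wait for readable fds and return (table name, payload) for each one."""
    readable, _, _ = select.select(list(fd_to_table), [], [], timeout)
    events = []
    for fd in readable:
        table = fd_to_table[fd]  # the fd identifies which table fired
        events.append((table.name, table.pop()))
    return events


feature_tbl = FakeTable("FEATURE")
port_tbl = FakeTable("PORT_TABLE")
fd_to_table = {t.getFd(): t for t in (feature_tbl, port_tbl)}

port_tbl.publish("PortInitDone")
print(dispatch_once(fd_to_table))  # [('PORT_TABLE', 'PortInitDone')]
```

Because the lookup is keyed on the fd, a single loop can serve several tables without a dedicated thread per table.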

@qiluo-msft
Contributor

Please resolve conflict.

@rlhui rlhui requested review from abdosi and judyjoseph July 22, 2023 03:03
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
@dgsudharsan
Contributor

@qiluo-msft Conflict resolved. Can you please check?

@qiluo-msft qiluo-msft merged commit 6767bc7 into sonic-net:master Jul 28, 2023
4 checks passed
qiluo-msft pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Aug 4, 2023
…module (#15815)

### Why I did it

- Hostcfgd handles many tasks, and the Feature table is itself a large and important one that can benefit from separation into a new daemon.
- Currently, Hostcfgd handles the Feature table before other tables, so other tasks such as Aaa and Ntp are delayed. With the split, they can run in parallel.
- After the recent config-reload enhancements, Hostcfgd uses a multi-threading approach to listen for PortInitDone. By splitting the daemon into two, we can avoid having a separate thread by using SubscriberStateTable and Select.

#### Note: 

Depends on host-services PR : sonic-net/sonic-host-services#71
Once the host-services PR is merged, updating the submodule along with this PR should fix the CI problem.

#### How I did it

Refactor the feature-related tasks from hostcfgd into a separate daemon.

#### How to verify it

Unit tests, and tested on a DUT.

```
admin@r-tigris-22:~$ show logging -f | grep featured
Jun 28 22:13:33.870021 r-tigris-22 INFO featured: ConfigDB connect success
Jun 28 22:14:05.638063 r-tigris-22 INFO featured: Updating feature 'radv' systemd config file related to auto-restart ...
Jun 28 22:14:06.169184 r-tigris-22 INFO featured: Feature radv is enabled and started
Jun 28 22:14:06.172343 r-tigris-22 INFO featured: Updating feature 'sflow' systemd config file related to auto-restart ...
Jun 28 22:14:06.844322 r-tigris-22 INFO featured: Feature sflow is stopped and disabled
Jun 28 22:14:06.846761 r-tigris-22 INFO featured: Updating feature 'snmp' systemd config file related to auto-restart ...
Jun 28 22:14:07.129090 r-tigris-22 INFO featured: Feature is snmp delayed for port init
Jun 28 22:14:07.132052 r-tigris-22 INFO featured: Updating feature 'swss' systemd config file related to auto-restart ...
Jun 28 22:14:08.368948 r-tigris-22 INFO featured: Feature swss is enabled and started
Jun 28 22:14:08.369240 r-tigris-22 INFO featured: Updating feature 'syncd' systemd config file related to auto-restart ...
Jun 28 22:14:08.718357 r-tigris-22 INFO featured: Feature syncd is enabled and started
Jun 28 22:14:08.721496 r-tigris-22 INFO featured: Updating feature 'teamd' systemd config file related to auto-restart ...
Jun 28 22:14:09.042495 r-tigris-22 INFO featured: Feature teamd is enabled and started
Jun 28 22:14:09.045441 r-tigris-22 INFO featured: Updating feature 'telemetry' systemd config file related to auto-restart ...
Jun 28 22:14:09.359831 r-tigris-22 INFO featured: Feature is telemetry delayed for port init
Jun 28 22:14:30.740499 r-tigris-22 INFO featured: Updating delayed features after port initialization
Jun 28 22:14:33.914178 r-tigris-22 INFO featured: Feature lldp is enabled and started
Jun 28 22:14:35.536264 r-tigris-22 INFO featured: Feature mgmt-framework is enabled and started
Jun 28 22:14:38.098571 r-tigris-22 INFO featured: Feature snmp is enabled and started
Jun 28 22:14:39.555727 r-tigris-22 INFO featured: Feature telemetry is enabled and started


Jun 28 22:13:33.977011 r-tigris-22 INFO hostcfgd: ConfigDB connect success
Jun 28 22:13:33.993878 r-tigris-22 INFO hostcfgd: Waiting for systemctl to finish initialization
Jun 28 22:13:34.274818 r-tigris-22 INFO hostcfgd: systemctl has finished initialization -- proceeding ...
Jun 28 22:13:34.391623 r-tigris-22 INFO hostcfgd: file size check pass: /etc/pam.d/sshd size is (2139) bytes
Jun 28 22:13:34.427273 r-tigris-22 INFO hostcfgd: file size check pass: /etc/pam.d/login size is (4132) bytes
Jun 28 22:13:34.433390 r-tigris-22 INFO hostcfgd: file size check pass: /etc/nsswitch.conf size is (494) bytes
Jun 28 22:13:34.455110 r-tigris-22 INFO hostcfgd: file size check pass: /etc/nsswitch.conf size is (494) bytes
Jun 28 22:13:34.478882 r-tigris-22 INFO hostcfgd: Found audisp-tacplus PID: 442
Jun 28 22:13:34.482365 r-tigris-22 INFO hostcfgd: cmd - ['service', 'aaastatsd', 'stop']
Jun 28 22:13:36.108569 r-tigris-22 INFO hostcfgd: NtpCfg load ...
Jun 28 22:13:36.108699 r-tigris-22 INFO hostcfgd: ntp server update key 0
Jun 28 22:13:36.108763 r-tigris-22 INFO hostcfgd: ntp server update, restarting ntp-config, ntp servers configured set()
Jun 28 22:14:06.691693 r-tigris-22 INFO hostcfgd: KdumpCfg init ...
Jun 28 22:14:06.691771 r-tigris-22 DEBUG hostcfgd: passw_policies_update - key: POLICIES
Jun 28 22:14:06.691832 r-tigris-22 DEBUG hostcfgd: passw_policies_update - data: {'digits_class': 'true', 'expiration': '180', 'expiration_warning': '15', 'history_cnt': '10', 'len_min': '8', 'lower_class': 'true', 'reject_user_passw_match': 'true', 'special_class': 'true', 'state': 'disabled', 'upper_class': 'true'}
Jun 28 22:14:06.691891 r-tigris-22 DEBUG hostcfgd: modify_conf_file: passw_policies - {'digits_class': True, 'expiration': '180', 'expiration_warning': '15', 'history_cnt': '10', 'len_min': '8', 'lower_class': True, 'reject_user_passw_match': True, 'special_class': True, 'state': 'disabled', 'upper_class': True}
Jun 28 22:14:06.701982 r-tigris-22 DEBUG hostcfgd: Initial hostname: r-tigris-22
Jun 28 22:14:06.702075 r-tigris-22 DEBUG hostcfgd: Initial mgmt interface conf: {('eth0', '10.210.24.108/22'): {'gwaddr': '10.210.24.1'}}
Jun 28 22:14:06.702115 r-tigris-22 DEBUG hostcfgd: Initial mgmt VRF state: 
Jun 28 22:14:06.702177 r-tigris-22 INFO hostcfgd: RSyslogCfg: Initial config: {'config': {'GLOBAL': {'rate_limit_burst': '0', 'rate_limit_interval': '0'}}, 'servers': {}}
Jun 28 22:14:06.709455 r-tigris-22 INFO hostcfgd[39326]: Failed to restart resolv-config.service: Unit resolv-config.service not found.
Jun 28 22:14:06.709560 r-tigris-22 ERR hostcfgd: ['systemctl', 'restart', 'resolv-config'] - failed: return code - 5, output:#012None
admin@r-tigris-22:~$ Connection to r-tigris-22 closed by remote host.
```
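
The ordering visible in the featured log above (snmp and telemetry held back, then started after port initialization) can be sketched roughly as follows. This is a hypothetical illustration rather than featured's actual code: the class and method names are invented, and it leaves out any timeout handling and the systemd calls.

```python
class DelayedFeatureQueue:
    """Hypothetical sketch: hold back 'delayed' features until port init completes."""

    def __init__(self):
        self.port_init_done = False
        self.pending = []  # delayed features waiting for port init
        self.started = []  # features started so far, in order

    def handle_feature(self, name, delayed=False):
        # Delayed features are queued until port initialization is reported.
        if delayed and not self.port_init_done:
            self.pending.append(name)
            return
        self.started.append(name)

    def on_port_init_done(self):
        # Flush every queued feature in the order it was seen.
        self.port_init_done = True
        self.started.extend(self.pending)
        self.pending.clear()


queue = DelayedFeatureQueue()
queue.handle_feature("swss")
queue.handle_feature("snmp", delayed=True)
queue.handle_feature("telemetry", delayed=True)
print(queue.started)  # ['swss']

queue.on_port_init_done()
print(queue.started)  # ['swss', 'snmp', 'telemetry']
```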
sonic-otn pushed a commit to sonic-otn/sonic-buildimage that referenced this pull request Sep 20, 2023
…module (sonic-net#15815)