Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhanced Feature Table state enable/disable for multi-asic platforms. #5358

Merged
merged 7 commits into from
Sep 22, 2020

Conversation

abdosi
Copy link
Contributor

@abdosi abdosi commented Sep 10, 2020

What/Why I did:

Change 1:
Enhanced Feature Table state enable/disable for multi-asic platforms.
In Multi-asic for some features can have service per asic so enhanced
Feature Table Generation to have scope (global or per asic)
of each feature service.

In Single asic platforms all service will run in global/host scope
In Multi-asic services can run in global/host scope (eg snmp) , per_asic scope (eg swss, syncd, etc..) or both (eg lldp)

Without above change all feature/services were getting enable/disable in global scope also
for multi-asic platforms.

Change 2:
Also updated logic to return if any one of systemctl command return failure
and make sure syslog of feature getting enable/disable only come when
all commands are successful.

How I verify:
Verified feature enable/disable on both single and multi-asic platforms.

In Multi-asic for some features we can service per asic so we need to
get list of all services.

Also updated logic to return if any one of systemctl command return failure
and make sure syslog of feature getting enable/disable only come when
all commads are sucessful.

Moved the service list get api from sonic-util to sonic-py-common

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>
Abhishek Dosi added 2 commits September 10, 2020 20:03
Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>
Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>
@tahmed-dev
Copy link
Contributor

@abdosi, @jleveque : I am wondering if the change should be done in the hostcfgd. Another approach is to template the change into init_cfg.json.j2 and move the template rendering out of sonic_debian_extension.j2 ro rc.local under FIRST_BOOT_FILE flag there. the reason is to keep hostcfg as simple and multi-asic agnostic.

@jleveque
Copy link
Contributor

the reason is to keep hostcfg as simple and multi-asic agnostic.

I agree with this sentiment. I can't think of a reason we need to build init_cfg.json into the image. We could generate it on first boot.

@abdosi
Copy link
Contributor Author

abdosi commented Sep 11, 2020

the reason is to keep hostcfg as simple and multi-asic agnostic.

I agree with this sentiment. I can't think of a reason we need to build init_cfg.json into the image. We could generate it on first boot.

Thanks @tahmed-dev for the quick discussion. @jleveque generating init_cfg.json during boot time is not possible as it depends on some compile time variable. Also to make hostcfgd agnostic of multi-asic we will need to update Feature table to have service instance count for each feature for which we have to read again generated_service_conf file during first boot and can't be just template based.

@jleveque
Copy link
Contributor

@abdosi: Good points. I guess it isn't as easy to move the init_cfg.json generation to first boot as I hoped.

@abdosi
Copy link
Contributor Author

abdosi commented Sep 17, 2020

@jleveque and @tahmed-dev can you please help review this.

@tahmed-dev
Copy link
Contributor

the reason is to keep hostcfg as simple and multi-asic agnostic.

I agree with this sentiment. I can't think of a reason we need to build init_cfg.json into the image. We could generate it on first boot.

Thanks @tahmed-dev for the quick discussion. @jleveque generating init_cfg.json during boot time is not possible as it depends on some compile time variable. Also to make hostcfgd agnostic of multi-asic we will need to update Feature table to have service instance count for each feature for which we have to read again generated_service_conf file during first boot and can't be just template based.

@abdosi what about extending information about services in the init_cfg.json regarding the where the feature should run host/per namespace/both. The hostcfgd can then check the multi_npu flag and act accordingly.

Made init_cfg.json.j2 knowledegable of Feature
service is global scope or per asic scope

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>
@abdosi
Copy link
Contributor Author

abdosi commented Sep 19, 2020

the reason is to keep hostcfg as simple and multi-asic agnostic.

I agree with this sentiment. I can't think of a reason we need to build init_cfg.json into the image. We could generate it on first boot.

Thanks @tahmed-dev for the quick discussion. @jleveque generating init_cfg.json during boot time is not possible as it depends on some compile time variable. Also to make hostcfgd agnostic of multi-asic we will need to update Feature table to have service instance count for each feature for which we have to read again generated_service_conf file during first boot and can't be just template based.

@abdosi what about extending information about services in the init_cfg.json regarding the where the feature should run host/per namespace/both. The hostcfgd can then check the multi_npu flag and act accordingly.

@tahmed-dev Updated accordingly.

# Create feature name suffix depending feature is running in host or namespace or in both
feature_name_suffix_list = (([feature_name] if has_global_scope or not device_info.is_multi_npu() else []) +
([(feature_name + '@' + str(asic_inst)) for asic_inst in range(device_info.get_num_npus())
if has_per_asic_scope and device_info.is_multi_npu()]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make is_multi_npu() a class member?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tahmed-dev
tahmed-dev previously approved these changes Sep 19, 2020
Copy link
Contributor

@tahmed-dev tahmed-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. minor comment regarding the is_multi_npu being class member.

Abhishek Dosi added 2 commits September 19, 2020 19:26
Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>
@abdosi abdosi merged commit 75e4258 into sonic-net:master Sep 22, 2020
@abdosi abdosi deleted the multiasic_featuretable branch September 22, 2020 18:29
abdosi added a commit that referenced this pull request Sep 22, 2020
…#5358)

* Enhanced Feature Table state enable/disbale for multi-asic platforms.
In Multi-asic for some features we can service per asic so we need to
get list of all services.

Also updated logic to return if any one of systemctl command return failure
and make sure syslog of feature getting enable/disable only come when
all commads are sucessful.

Moved the service list get api from sonic-util to sonic-py-common

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

* Make sure to retun None for both service list in case of error.

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

* Return empty list as fail condition

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

* Address Review Comments.

Made init_cfg.json.j2 knowledegable of Feature
service is global scope or per asic scope

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

* Fix merge conflict

* Address Review Comment.

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

Co-authored-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>
Comment on lines +284 to +285
for suffix in feature_suffixes:
start_cmds.append("sudo systemctl unmask {}.{}".format(feature_name_suffix, suffix))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's great work. I know it merged already, but wanted to follow up the story here if we have plan to support enabling/disabling services for particular asics in system?

More specifically, says there are 2 asics in system, we want to disable lldp on asic 1, but not asic 2.

I imagine that to do that, we need

(1) Create entry like FEATURE|lldp|1 with enabled = false and FEATURE|lldp|2 with enabled = true. hostcfgd listens and reacts to the change. It will turn off lldp@1.service but keep lldp@2.service on.

Or (2) Create hostcfgd@1 for asic 1 and hostcfgd@2 for asic 2. hostcfgd@1 listens database@1 CONFIG_DB, and reacts to FEATURE|lldp. Same for hostcfgd@2 which listens database@2 CONFIG_DB. But this one may not work if we want to enable/disable database. If database doesn't exits, hostcfgd@x will have nothing to listen and react.

santhosh-kt pushed a commit to santhosh-kt/sonic-buildimage that referenced this pull request Feb 25, 2021
…sonic-net#5358)

* Enhanced Feature Table state enable/disbale for multi-asic platforms.
In Multi-asic for some features we can service per asic so we need to
get list of all services.

Also updated logic to return if any one of systemctl command return failure
and make sure syslog of feature getting enable/disable only come when
all commads are sucessful.

Moved the service list get api from sonic-util to sonic-py-common

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

* Make sure to retun None for both service list in case of error.

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

* Return empty list as fail condition

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

* Address Review Comments.

Made init_cfg.json.j2 knowledegable of Feature
service is global scope or per asic scope

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

* Fix merge conflict

* Address Review Comment.

Signed-off-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>

Co-authored-by: Abhishek Dosi <abdosi@abdosi-ubuntu-vm0.nwp1qucpfg5ejooejenqshkj3e.cx.internal.cloudapp.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants