[Dynamic Buffer Calc][Mellanox] Bug fixes and enhancements for the lua plugins for buffer pool calculation and headroom checking #1781
This PR contains fixes for 3 bugs and 1 enhancement. Ideally, one PR covers one change, but the fixes depend on one another, which makes them very difficult to split.
7a2a4c3 to d8cfaa3
@neethajohn can you please help to review?
446d92a to da6a40d
- Take the number of lanes instead of the speed into account when determining whether a port has doubled pipeline latency.
  For speeds other than 400G, e.g. 100G, it's possible that some 100G ports have 8 lanes and others have 4 lanes.
  In this case, we need to add "8_lane" to the profile name to indicate whether the profile is for 8-lane ports or normal ports.
  This is for the Mellanox platform only.
- Take advantage of the "set" feature of Lua to represent the profile reference count, which also makes the code more maintainable.
- Take deviation into account when checking the headroom against the limit.
- Take private headroom into account when the shared headroom pool is enabled.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
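For readers unfamiliar with the reference-count change, here is a minimal sketch of the idea in Python rather than Lua. It is not the plugin code: the function name, the PG keys, and the input mapping are hypothetical and only illustrate keeping one counter per profile name in a hashtable.

```python
from collections import defaultdict


def count_profile_references(pg_to_profile):
    """Return a mapping of profile name to the number of PGs referencing it.

    `pg_to_profile` is a hypothetical mapping of PG key to profile name.
    """
    ref_count = defaultdict(int)
    for profile in pg_to_profile.values():
        # One hashtable entry per profile; no scan over a list of profiles.
        ref_count[profile] += 1
    return dict(ref_count)


# Illustrative input only; the keys and values are assumptions.
pgs = {
    "Ethernet0:3-4": "pg_lossless_100000_5m_profile",
    "Ethernet4:3-4": "pg_lossless_100000_5m_profile",
    "Ethernet8:3-4": "pg_lossless_100000_5m_8lane_profile",
}
counts = count_profile_references(pgs)
```

Keying the table by profile name makes each lookup and update a single operation, which is presumably what makes the plugin easier to maintain.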
da6a40d to f3f2c75
Please add sonic-mgmt tests for this 8-lane profile.
…lane port Signed-off-by: Stephen Sun <stephens@nvidia.com>
In almost every test case, there is logic to check whether the buffer profile in …
Can you read the number of lanes, so that this is more generic and not Spectrum3 only?
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Fixed by removing the dependency on the ASIC type.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
The VS test failed due to an environmental issue. It needs a rerun.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Failed in the dynamic port breakout test.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
…a plugins for buffer pool calculation and headroom checking (#1781)

What I did
Bug fixes for buffer pool calculation and headroom checking on Mellanox platforms.
- Test the number of lanes instead of the speed when determining whether special handling is required for a port.
  For speeds other than 400G, e.g. 100G, it's possible that some 100G ports have 8 lanes and others have 4 lanes, which means they cannot share the same buffer profile.
  A suffix _8lane is introduced to indicate it, like pg_lossless_100000_5m_8lane_profile
- Take the private headroom into account when calculating the buffer pool size
- Take deviation into account when checking the headroom against the per-port limit to avoid an inaccurate result in a rare case
- Use a hashtable to record the reference count of a profile in the lua plugin

Signed-off-by: Stephen Sun stephens@nvidia.com

How I verified it
Run regression and manually test

Details if related
- Test the number of lanes instead of the speed when determining whether special handling (double headroom size) is required for a port.
  Originally, it was determined by testing whether the port's speed is 400G, but that is not accurate. A user can configure a port with 8 lanes to 100G. In this case, special handling is still required for a port that is not 400G.
  So we need to adjust the way to do that.
  The variable names are also updated accordingly: xxx_400g => xxx_8lanes
- Take deviation into account when checking the headroom against the per-port limit to avoid an inaccurate result in a rare case.
  There are some deviations that make the accumulative headroom a bit larger than the quantity calculated by the buffer manager. We need to take it into account when calculating the accumulative headroom.
Advance submodule head for sonic-swss

3226163 [BufferOrch] Don't call SAI API for BUFFER_POOL/PROFILE handling in case the op is DEL and the SAI OID is NULL (sonic-net/sonic-swss#1786)
6c88e47 [Dynamic Buffer Calc][Mellanox] Bug fixes and enhancements for the lua plugins for buffer pool calculation and headroom checking (sonic-net/sonic-swss#1781)
e86b900 [MPLS] sonic-swss changes for MPLS (sonic-net/sonic-swss#1686)
4c8e2b5 [Dynamic Buffer Calc] Avoid creating lossy PG for admin down ports during initialization (sonic-net/sonic-swss#1776)
3602124 [VS test stability] Skip flaky test for DPB (sonic-net/sonic-swss#1807)
c37cc1c Support for in-band-mgmt via management VRF (sonic-net/sonic-swss#1726)
1e3a532 Fix config prompt question issue (sonic-net/sonic-swss#1799)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
Advance submodule head for sonic-swss on 202012

bb383be2 [Dynamic Buffer Calc][Mellanox] Bug fixes and enhancements for the lua plugins for buffer pool calculation and headroom checking (sonic-net/sonic-swss#1781)
f949dfe9 [Dynamic Buffer Calc] Avoid creating lossy PG for admin down ports during initialization (sonic-net/sonic-swss#1776)
def0a914 Fix config prompt question issue (sonic-net/sonic-swss#1799)
21f97506 [ci]: Merge azure pipelines from master to 202012 branch (sonic-net/sonic-swss#1764)
a83a2a42 [vstest]: add dvs_route fixture
849bdf9c [Mux] Add support for mux metrics to State DB (sonic-net/sonic-swss#1757)
386de717 [qosorch] Dot1p map list initialization fix (sonic-net/sonic-swss#1746)
f99abdca [sub intf] Port object reference count update (sonic-net/sonic-swss#1712)
4a00042d [vstest/nhg]: use dvs_route fixture to make test_nhg more robust

Signed-off-by: Stephen Sun <stephens@nvidia.com>
#### What I did
To support loading configuration data in yang schema, the `config load` command is enhanced with the below options
- `-t` `--file-format` to specify the file-format. The config file can be `yang` or `config_db` format
- `-r` to restart the services. Currently this option is supported for yang file format only.

#### How I did it
Add the above mentioned cli options.
Add Unit tests

#### How to verify it
Verify the command on VS.
```
admin@vlab-01:~$ sudo config load -y -c yang -r /etc/sonic/yang_cfg.json
Disabling container monitoring ...
Stopping SONiC target ...
Running command: /usr/local/bin/sonic-cfggen -H -Y /etc/sonic/yang_cfg.json -j /etc/sonic/init_cfg.json --write-to-db
Restarting SONiC target ...
Enabling container monitoring ...
Reloading Monit configuration ...
Reinitializing monit daemon
Please note setting loaded from minigraph will be lost after system reboot.To preserve setting, run `config save`.
admin@vlab-01:~$ sudo config load -y -c yang /etc/sonic/yang_cfg.json
Running command: /usr/local/bin/sonic-cfggen -H -Y /etc/sonic/yang_cfg.json -j /etc/sonic/init_cfg.json --write-to-db
Please note setting loaded from minigraph will be lost after system reboot.To preserve setting, run `config save`.
admin@vlab-01:~$ sudo config load
Load config in config_db format from the default config file(s) ? [y/N]: y
Running command: /usr/local/bin/sonic-cfggen -j /etc/sonic/config_db.json --write-to-db
admin@vlab-01:~$ sudo config load -y
Running command: /usr/local/bin/sonic-cfggen -j /etc/sonic/config_db.json --write-to-db
```
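What I did
Bug fixes for buffer pool calculation and headroom checking on Mellanox platforms.
- Test the number of lanes instead of the speed when determining whether special handling is required for a port.
  For speeds other than 400G, e.g. 100G, it's possible that some 100G ports have 8 lanes and others have 4 lanes, which means they cannot share the same buffer profile.
  A suffix `_8lane` is introduced to indicate it, like `pg_lossless_100000_5m_8lane_profile`
- Take the private headroom into account when calculating the buffer pool size
- Take deviation into account when checking the headroom against the per-port limit to avoid an inaccurate result in a rare case
- Use a hashtable to record the reference count of a profile in the lua plugin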
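As a rough illustration of the `_8lane` naming rule, the sketch below builds a profile name from speed, cable length, and lane list. It assumes the port's lanes are available as a comma-separated string; the function and parameter names are hypothetical, and this is not the buffer manager's actual code.

```python
def lossless_profile_name(speed, cable_length, lanes):
    """Build a lossless PG profile name, adding `_8lane` for 8-lane ports.

    `lanes` is assumed to be a comma-separated lane list, e.g. "0,1,2,3".
    """
    lane_count = len(lanes.split(","))
    suffix = "_8lane" if lane_count == 8 else ""
    return f"pg_lossless_{speed}_{cable_length}{suffix}_profile"


# Two 100G ports with different lane counts get different profiles.
name_8lane = lossless_profile_name("100000", "5m", "0,1,2,3,4,5,6,7")
name_4lane = lossless_profile_name("100000", "5m", "0,1,2,3")
```

Because the suffix is part of the name, an 8-lane 100G port and a 4-lane 100G port with the same cable length no longer collide on one profile.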
Signed-off-by: Stephen Sun stephens@nvidia.com
Why I did it
How I verified it
Run regression and manually test
Details if related
Test the number of lanes instead of the speed when determining whether special handling (double headroom size) is required for a port.
Originally, it was determined by testing whether the port's speed is 400G, but that is not accurate. A user can configure a port with 8 lanes to 100G. In this case, special handling is still required even though the port is not 400G.
So the check needs to be based on the number of lanes.
The variable names are also updated accordingly: `xxx_400g` => `xxx_8lanes`
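To make the difference between the two checks concrete, here is a small sketch comparing a speed-based test with a lane-based one. The port dictionary layout and both function names are assumptions for illustration; the real check lives in the Lua plugins and the buffer manager.

```python
def needs_special_handling_by_speed(port):
    """Old approach (illustrative): only 400G ports get special handling."""
    return port["speed"] == "400000"


def needs_special_handling_by_lanes(port):
    """New approach (illustrative): any 8-lane port gets special handling."""
    return len(port["lanes"].split(",")) == 8


# A hypothetical 8-lane port configured to 100G.
port = {"speed": "100000", "lanes": "0,1,2,3,4,5,6,7"}

missed_by_speed_check = not needs_special_handling_by_speed(port)
caught_by_lane_check = needs_special_handling_by_lanes(port)
```

The speed-based test misses this port, while the lane-based test catches it, which is the case described above.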
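Take deviation into account when checking the headroom against the per-port limit to avoid an inaccurate result in a rare case.
There are some deviations that make the accumulative headroom a bit larger than the quantity calculated by the buffer manager. They need to be taken into account when calculating the accumulative headroom.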
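The deviation handling can be sketched as follows. This is only an illustration under assumptions: the function name, the idea of a single additive deviation term, and all numbers are hypothetical, and the actual plugin may compute the deviation differently.

```python
def headroom_within_limit(pg_headrooms, deviation, max_headroom):
    """Check the accumulative headroom of a port against its per-port limit.

    `deviation` models the extra amount by which the real accumulative
    headroom can exceed the sum calculated by the buffer manager.
    """
    accumulative = sum(pg_headrooms) + deviation
    return accumulative <= max_headroom


# Hypothetical numbers chosen so that the sum alone fits under the limit
# but the sum plus the deviation does not.
without_deviation = headroom_within_limit([49152, 49152], 0, 100000)
with_deviation = headroom_within_limit([49152, 49152], 2048, 100000)
```

With these made-up values the check passes when the deviation is ignored and fails once it is included, which is the kind of rare boundary case the fix targets.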