Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Dynamic Buffer Calc] Avoid creating lossy PG for admin down ports during initialization #1776

Merged

Conversation

stephenxs
Copy link
Collaborator

@stephenxs stephenxs commented Jun 9, 2021

What I did

  • Avoid creating lossy PGs for admin down ports during system initialization
  • Make ADMIN_DOWN the default state of a port in dynamic buffer manager.
  • Explicitly remove configured PGs from an admin down port during initialization
    This is to make sure the items will be added to m_ready_list.
  • Use APPL_DB as the source of initial PGs and queues for port ready deducting for warm reboot

Signed-off-by: Stephen Sun stephens@nvidia.com

Why I did it
Lossy PGs for admin down ports should not be applied to APPL_DB.
Originally, we had the logic to remove lossy/lossless PGs when shutting a port. The same logic should be applied when the system is starting. Allocating lossy PG for an admin down port doesn't affect the system negatively except for wasting buffer. So we would like to resolve it.

How I verified it
Manually test

Details if related

  1. A port is admin down if admin_status in CONFIG_DB|PORT|<port> is down or doesn't exist. In case it doesn't exist, buffermgrd doesn't have a chance to read and initialize admin status in its internal data structure. To simplify the flow, we make PORT_ADMIN_DOWN the default state of the port.
  2. There is a logic in orchagent to make sure all configured PGs have been handled before the port initialization can proceed. The idea is to add check whether all configured PGs have been applied to SAI. As no PG will be applied to SAI on admin down ports, removing the PGs will notify the orchagent that the PGs have been handled.

- Don't create lossy PG for admin down ports
- Treat admin down ports as ready during orchagent initialization

Signed-off-by: Stephen Sun <stephens@nvidia.com>
- Check whether the PGs is removed after a port has been shut down
- No lossy/lossless PG added to APPL_DB when a port is admin down

Signed-off-by: Stephen Sun <stephens@nvidia.com>
@stephenxs stephenxs closed this Jun 16, 2021
@stephenxs stephenxs reopened this Jun 16, 2021
@dprital
Copy link
Collaborator

dprital commented Jun 16, 2021

This fix is required also for 202012. Need to add the tag for it,. Thanks.

@stephenxs
Copy link
Collaborator Author

This fix is required also for 202012. Need to add the tag for it,. Thanks.

Done

@liat-grozovik
Copy link
Collaborator

@neethajohn please help to review

orchagent/portsorch.cpp Outdated Show resolved Hide resolved
Explicitly remove PGs from an admin down port when a PG is configured during initialization
This is to make sure the items will be added to m_ready_list

Signed-off-by: Stephen Sun <stephens@nvidia.com>
@stephenxs stephenxs requested a review from prsunny as a code owner June 23, 2021 06:04
@stephenxs stephenxs requested a review from neethajohn June 23, 2021 06:20
@stephenxs
Copy link
Collaborator Author

Vs test failed due to environmental issue: “installing dependence “

neethajohn
neethajohn previously approved these changes Jun 23, 2021
@liat-grozovik
Copy link
Collaborator

@neethajohn should we wait for additional reviewer or can merge?

@neethajohn
Copy link
Contributor

@stephenxs , can you please update the PR description? We do not have the 'isPortReady' change now

@stephenxs
Copy link
Collaborator Author

@stephenxs , can you please update the PR description? We do not have the 'isPortReady' change now

Thanks for reminding. Done.

@stephenxs
Copy link
Collaborator Author

stephenxs commented Jun 26, 2021

The latest commit introduces an issue during warm-reboot.
We need to use items in APPL DB as the initial buffer PGs and queues for port ready deducting.
Check the comments in the initBufferReadyLists for details.

@stephenxs stephenxs marked this pull request as draft June 26, 2021 10:00
…_ref

This is to make sure admin down ports can be ready in warmreboot in dynamic model

Signed-off-by: Stephen Sun <stephens@nvidia.com>
@stephenxs stephenxs marked this pull request as ready for review June 27, 2021 05:51
@stephenxs stephenxs requested a review from neethajohn June 27, 2021 05:52
Copy link
Collaborator

@liat-grozovik liat-grozovik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephenxs can you please confirm warmboot sonic-mgmt test is passing with this change as well?

orchagent/bufferorch.cpp Outdated Show resolved Hide resolved
cfgmgr/buffermgrdyn.h Show resolved Hide resolved
Signed-off-by: Stephen Sun <stephens@nvidia.com>
@stephenxs
Copy link
Collaborator Author

stephenxs commented Jun 28, 2021

@stephenxs can you please confirm warmboot sonic-mgmt test is passing with this change as well?

@liat-grozovik Passed on Spectrum and Spectrum 3 platforms. I didn't test it on Spectrum 2 platform as there isn't an available test bed.
This is not platform independent so I think it's safe for warm reboot.

@stephenxs
Copy link
Collaborator Author

The vs test failures have been fixed by PR #1807

@liat-grozovik liat-grozovik merged commit 4c8e2b5 into sonic-net:master Jun 28, 2021
@stephenxs stephenxs deleted the dont-create-lossy-pg-admin-down-boot branch June 28, 2021 22:22
qiluo-msft pushed a commit that referenced this pull request Jun 29, 2021
…ring initialization (#1776)

- What I did
Avoid creating lossy PGs for admin down ports during system initialization
Make ADMIN_DOWN the default state of a port in dynamic buffer manager.
Explicitly remove configured PGs from an admin down port during initialization
This is to make sure the items will be added to m_ready_list.
Use APPL_DB as the source of initial PGs and queues for port ready deducting for warm reboot
Signed-off-by: Stephen Sun stephens@nvidia.com

- Why I did it
Lossy PGs for admin down ports should not be applied to APPL_DB.
Originally, we had the logic to remove lossy/lossless PGs when shutting a port. The same logic should be applied when the system is starting. Allocating lossy PG for an admin down port doesn't affect the system negatively except for wasting buffer. So we would like to resolve it.

- How I verified it
Manually test
liat-grozovik pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Jun 29, 2021
Advance submodule head for sonic-swss

3226163 [BufferOrch] Don't call SAI API for BUFFER_POOL/PROFILE handling in case the op is DEL and the SAI OID is NULL (sonic-net/sonic-swss#1786)
6c88e47 [Dynamic Buffer Calc][Mellanox] Bug fixes and enhancements for the lua plugins for buffer pool calculation and headroom checking (sonic-net/sonic-swss#1781)
e86b900 [MPLS] sonic-swss changes for MPLS (sonic-net/sonic-swss#1686)
4c8e2b5 [Dynamic Buffer Calc] Avoid creating lossy PG for admin down ports during initialization (sonic-net/sonic-swss#1776)
3602124 [VS test stability] Skip flaky test for DPB (sonic-net/sonic-swss#1807)
c37cc1c Support for in-band-mgmt via management VRF (sonic-net/sonic-swss#1726)
1e3a532 Fix config prompt question issue (sonic-net/sonic-swss#1799)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
liat-grozovik pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Jun 29, 2021
Advance submodule head for sonic-swss on 202012

bb383be2 [Dynamic Buffer Calc][Mellanox] Bug fixes and enhancements for the lua plugins for buffer pool calculation and headroom checking (sonic-net/sonic-swss#1781)
f949dfe9 [Dynamic Buffer Calc] Avoid creating lossy PG for admin down ports during initialization (sonic-net/sonic-swss#1776)
def0a914 Fix config prompt question issue (sonic-net/sonic-swss#1799)
21f97506 [ci]: Merge azure pipelines from master to 202012 branch (sonic-net/sonic-swss#1764)
a83a2a42 [vstest]: add dvs_route fixture
849bdf9c [Mux] Add support for mux metrics to State DB (sonic-net/sonic-swss#1757)
386de717 [qosorch] Dot1p map list initialization fix (sonic-net/sonic-swss#1746)
f99abdca [sub intf] Port object reference count update (sonic-net/sonic-swss#1712)
4a00042d [vstest/nhg]: use dvs_route fixture to make test_nhg more robust

Signed-off-by: Stephen Sun <stephens@nvidia.com>
carl-nokia pushed a commit to carl-nokia/sonic-buildimage that referenced this pull request Aug 7, 2021
Advance submodule head for sonic-swss

3226163 [BufferOrch] Don't call SAI API for BUFFER_POOL/PROFILE handling in case the op is DEL and the SAI OID is NULL (sonic-net/sonic-swss#1786)
6c88e47 [Dynamic Buffer Calc][Mellanox] Bug fixes and enhancements for the lua plugins for buffer pool calculation and headroom checking (sonic-net/sonic-swss#1781)
e86b900 [MPLS] sonic-swss changes for MPLS (sonic-net/sonic-swss#1686)
4c8e2b5 [Dynamic Buffer Calc] Avoid creating lossy PG for admin down ports during initialization (sonic-net/sonic-swss#1776)
3602124 [VS test stability] Skip flaky test for DPB (sonic-net/sonic-swss#1807)
c37cc1c Support for in-band-mgmt via management VRF (sonic-net/sonic-swss#1726)
1e3a532 Fix config prompt question issue (sonic-net/sonic-swss#1799)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
raphaelt-nvidia pushed a commit to raphaelt-nvidia/sonic-swss that referenced this pull request Oct 5, 2021
…ring initialization (sonic-net#1776)

- What I did
Avoid creating lossy PGs for admin down ports during system initialization
Make ADMIN_DOWN the default state of a port in dynamic buffer manager.
Explicitly remove configured PGs from an admin down port during initialization
This is to make sure the items will be added to m_ready_list.
Use APPL_DB as the source of initial PGs and queues for port ready deducting for warm reboot
Signed-off-by: Stephen Sun stephens@nvidia.com

- Why I did it
Lossy PGs for admin down ports should not be applied to APPL_DB.
Originally, we had the logic to remove lossy/lossless PGs when shutting a port. The same logic should be applied when the system is starting. Allocating lossy PG for an admin down port doesn't affect the system negatively except for wasting buffer. So we would like to resolve it.

- How I verified it
Manually test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants