[database-chassis][lagid] Initialize SYSTEM_LAG_IDS_FREE_LIST in CHASSIS_APP_DB #20369

Merged
merged 1 commit into from
Nov 22, 2024

Conversation


@mlok-nokia mlok-nokia commented Sep 28, 2024

Why I did it

To fix an issue where the same lagid could be used by two PortChannels on two different linecards. The issue occurs when many linecards are rebooted together, with a 20-second delay between LC reboots.

Work item tracking
  • Microsoft ADO (number only):

How I did it

  1. Modify database.sh to create an initial SYSTEM_LAG_IDS_FREE_LIST in the CHASSIS_APP_DB on the SUP during database-chassis startup.
  2. Modify the database consistency check in swss.sh to append the lagid to the end of SYSTEM_LAG_IDS_FREE_LIST when a lagid is released.
  3. Change lag_id_end to 1023 (not 1024) in chassisdb.conf, since the largest lagid BCM supports is 1023.
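
The free-list behavior described in the steps above can be modeled in a few lines of Python. This is only an in-memory sketch, not the code in database.sh or swss.sh: it assumes IDs are handed out from the head of the list and that the range starts at 1 (the PR only states lag_id_end=1023). The function names `init_free_list`, `allocate`, and `release` are hypothetical.

```python
from collections import deque

LAG_ID_START = 1     # assumption: the start of the range is not stated in this PR
LAG_ID_END = 1023    # from the chassisdb.conf change described above


def init_free_list(start=LAG_ID_START, end=LAG_ID_END):
    """Build the initial free list holding every lag ID in [start, end]."""
    return deque(range(start, end + 1))


def allocate(free_list, in_use):
    """Take the ID at the head of the free list and mark it as in use."""
    if not free_list:
        raise RuntimeError("no free lag IDs left")
    lag_id = free_list.popleft()
    in_use.add(lag_id)
    return lag_id


def release(free_list, in_use, lag_id):
    """Append a released ID to the tail of the free list so it is reused last."""
    if lag_id in in_use:
        in_use.remove(lag_id)
        free_list.append(lag_id)


free_list = init_free_list()
in_use = set()
first = allocate(free_list, in_use)
second = allocate(free_list, in_use)
release(free_list, in_use, first)
third = allocate(free_list, in_use)   # the just-released ID is not handed out again yet
```

Appending released IDs to the tail means a recently freed lagid is the last candidate for reuse, which is one way to keep two linecards from picking up the same ID while stale state is still being cleaned up.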

This PR works with the following two PRs:
sonic-net/sonic-swss#3303
sonic-net/sonic-platform-daemons#542

Given the dependencies, merging the 3 PRs in the following order helps avoid breaking the image run:
First: PR #20369 (this PR)
Second: sonic-net/sonic-swss#3303
Third: sonic-net/sonic-platform-daemons#542

How to verify it

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305
  • 202405

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

…SIS_APP_DB

Signed-off-by: mlok <marty.lok@nokia.com>
@mlok-nokia mlok-nokia marked this pull request as ready for review October 2, 2024 19:42
@mlok-nokia mlok-nokia requested a review from lguohan as a code owner October 2, 2024 19:42
@mlok-nokia
Contributor Author

@arlakshm @judyjoseph Please review these 3 PRs. Thanks

@arlakshm
Contributor

/azpw ms_conflict

@arlakshm
Contributor

/azpw ms_conflict

@rlhui
Contributor

rlhui commented Nov 22, 2024

/azpw ms_conflict

@rlhui rlhui merged commit 36895fb into sonic-net:master Nov 22, 2024
23 checks passed
rlhui pushed a commit to sonic-net/sonic-platform-daemons that referenced this pull request Nov 27, 2024
When an LC is absent for 30 minutes, the database cleanup kicks in. When a LagId is released, it needs to be appended to the SYSTEM_LAG_IDS_FREE_LIST.

This PR works with the following 2 PRs:
sonic-net/sonic-swss#3303
sonic-net/sonic-buildimage#20369

Signed-off-by: mlok <marty.lok@nokia.com>
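
The cleanup described in this commit message can be sketched as follows. This is an illustrative in-memory model, not the chassisd code: the function name `release_lag_ids_for_linecard` and the `hostname|asic|PortChannel` key layout are assumptions made for the example.

```python
def release_lag_ids_for_linecard(lag_id_table, lag_id_set, free_list, lc_hostname):
    """Drop every system LAG owned by lc_hostname and recycle its ID.

    lag_id_table maps a hypothetical "hostname|asic|PortChannel" key to a lag ID,
    lag_id_set holds the IDs currently in use, and free_list is the ordered
    list of IDs available for allocation.
    """
    released = []
    # Collect the keys first so the dict is not mutated while iterating over it.
    owned = [key for key in lag_id_table if key.split("|")[0] == lc_hostname]
    for key in owned:
        lag_id = lag_id_table.pop(key)
        lag_id_set.discard(lag_id)
        free_list.append(lag_id)   # released IDs go to the end of the free list
        released.append(lag_id)
    return released


lag_id_table = {
    "lc1|asic0|PortChannel101": 1,
    "lc2|asic0|PortChannel102": 2,
    "lc1|asic1|PortChannel103": 3,
}
lag_id_set = {1, 2, 3}
free_list = [4, 5, 6]

released = release_lag_ids_for_linecard(lag_id_table, lag_id_set, free_list, "lc1")
```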
mssonicbld pushed a commit to mssonicbld/sonic-platform-daemons that referenced this pull request Nov 30, 2024
…net#542)

mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Nov 30, 2024
…SIS_APP_DB (sonic-net#20369)

Modify database.sh to create an initial SYSTEM_LAG_IDS_FREE_LIST in the CHASSIS_APP_DB on the SUP during database-chassis startup.
Modify the database consistency check in swss.sh to append the lagid to the end of SYSTEM_LAG_IDS_FREE_LIST when a lagid is released.
Change lag_id_end to 1023 (not 1024) in chassisdb.conf, since the largest lagid BCM supports is 1023.

Signed-off-by: mlok <marty.lok@nokia.com>
@mssonicbld
Collaborator

Cherry-pick PR to 202405: #20975

mssonicbld pushed a commit to sonic-net/sonic-platform-daemons that referenced this pull request Nov 30, 2024
mssonicbld pushed a commit that referenced this pull request Dec 1, 2024
…SIS_APP_DB (#20369)

arista-nwolfe added a commit to arista-nwolfe/sonic-buildimage that referenced this pull request Dec 9, 2024
rlhui pushed a commit that referenced this pull request Dec 10, 2024
The following PRs made 1024 incorrect:
#20369
sonic-net/sonic-swss#3303

This fixes:
#21096
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Dec 10, 2024
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Dec 10, 2024
mssonicbld pushed a commit that referenced this pull request Dec 11, 2024
arlakshm pushed a commit to sonic-net/sonic-mgmt that referenced this pull request Dec 21, 2024
… Due to Lag ID Set Changes (#16116)

What is the motivation for this PR?
The Lag ID assignment behavior changed with sonic-net/sonic-buildimage#20369.
The test case expectations and assertions are changed accordingly.

How did you do it?
The test no longer expects SYSTEM_LAG_ID_SET to be the same as in the pre-dump.
Instead, it expects lag IDs to be assigned from SYSTEM_LAG_IDS_FREE_LIST in order.
Added a sanity check of lag_id_set to verify the functionality of PR sonic-net/sonic-buildimage#20369.

How did you verify/test it?
Tested and Verified on a T2 VOQ Chassis
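
A sanity check of the kind mentioned in this commit message might look like the sketch below. The actual sonic-mgmt test is not shown on this page, so this is only one plausible invariant: it assumes every ID in the configured range is either in the in-use set or in the free list, and never in both. The function name `check_lag_id_state` is hypothetical.

```python
def check_lag_id_state(lag_id_set, free_list, start, end):
    """Return a list of problems found; an empty list means the state is consistent."""
    problems = []
    in_use = set(lag_id_set)
    free = set(free_list)
    expected = set(range(start, end + 1))

    if len(free) != len(free_list):
        problems.append("duplicate IDs in free list")
    overlap = in_use & free
    if overlap:
        problems.append(f"IDs both in use and free: {sorted(overlap)}")
    missing = expected - in_use - free
    if missing:
        problems.append(f"IDs neither in use nor free: {sorted(missing)}")
    out_of_range = (in_use | free) - expected
    if out_of_range:
        problems.append(f"IDs outside the range: {sorted(out_of_range)}")
    return problems


# A small range keeps the example readable; the PR uses an upper bound of 1023.
good = check_lag_id_state({1, 2}, [3, 4, 5, 6, 7, 8], start=1, end=8)
bad = check_lag_id_state({1, 3}, [3, 4, 5, 6, 7, 8], start=1, end=8)
```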
vvolam pushed a commit to vvolam/sonic-platform-daemons that referenced this pull request Jan 3, 2025
…net#542)

mssonicbld added a commit that referenced this pull request Jan 16, 2025
VladimirKuk pushed a commit to Marvell-switching/sonic-buildimage that referenced this pull request Jan 21, 2025
…SIS_APP_DB (sonic-net#20369)

VladimirKuk pushed a commit to Marvell-switching/sonic-buildimage that referenced this pull request Jan 21, 2025
Javier-Tan pushed a commit to Azure/sonic-mgmt.msft that referenced this pull request Jan 24, 2025
… Due to Lag ID Set Changes (#16116)

prgeor pushed a commit to sonic-net/sonic-platform-daemons that referenced this pull request Feb 6, 2025
…evice is in detaching mode (#546)

* Skip logging the warning, if device is in detaching mode

* Add detach_info table and unittests

* Fix unit tests

* Increase code coverage

* Remove unused header import

* Fix dict get values

* Increase code coverage

* Increase test coverage

* [SmartSwitch] Extend implementation of the DPU chassis daemon. (#563)

* Addition of DPU Chassis for thermalctld (#564)

* [stormond] Added new dynamic field 'last_sync_time' to STATE_DB (#535)

* Added new dynamic field 'last_sync_time' that shows when STORAGE_INFO for disk was last synced to STATE_DB

* Moved 'start' message to actual starting point of the daemon

* Added functions for formatted and epoch time for user friendly time display

* Made changes per prgeor review comments

* Pivot to SysLogger for all logging

* Increased log level so that they are seen in syslogs

* Code coverage improvement

* [lag_id] Add lagid to free_list when LC absent for 30 minutes (#542)

When LC is absent for 30 minutes, the database cleanup kicks in. When LagId is released, it needs to be appended to the SYSTEM_LAG_IDS_FREE_LIST

This PR works with the following 2 PRs:
sonic-net/sonic-swss#3303
sonic-net/sonic-buildimage#20369

Signed-off-by: mlok <marty.lok@nokia.com>

* Fixed bug in chassisd causing incorrect number of ASICs in CHASSIS_STATE_DB (#560)

Fixed the bug in chassisd due to which incorrect number of ASICs were being pushed to CHASSIS_STATE_DB.

* thermalctld: Add support for fans on non-CPU modules (#555)

* thermalctld: Add support for fans on non-CPU modules

* Add module fan to unit tests

* Advanced Azure pipeline to Bookworm (#572)

Description
This PR advances the azure pipeline on sonic_platform_daemons from bullseye to bookworm. This fixes the issue where sonic-platform-daemons azp is having some issues due to upgrade to bookworm. See Pipelines - Run 20241210.8 logs for details.

* Take non-CMIS xcvrs out of lpmode in SFF Manager (#565)

Description
Fix non-CMIS transceivers in down state by bringing them out of low power mode in the SFF Manager Task.
This is intended to work together with the change in sonic-net/sonic-buildimage#20886.

Motivation and Context
Non-CMIS transceivers were not functioning correctly when put into Low Power mode. So XCVRD now brings them out of lpmode.

How Has This Been Tested?
Loaded an image containing this change alongside the change from sonic-net/sonic-buildimage#20886 on an Arista chassis containing a Clearwater2 linecard.
Verified that without this image some interfaces were in a down state but with the image all interfaces came up as expected.

* Added SmartSwitch support in chassisd and enabling chassisd  (#467)

Added SmartSwitch support in chassisd and enabling chassisd

* [chassis][psud] Move the PSU parent information generation to the loop run function from the initialization function (#576)

Description
Move the PSU parent information generation to the loop run function from the initialization function

Motivation and Context
Fixes #575

How Has This Been Tested?
Tested on a Cisco chassis: the PHYSICAL_ENTITY_INFO|PSU * entries can be re-inserted after a thermalctld restart.
Also monitored the state DB memory for hours; it works well:

* [chassisd] Address the chassisd crash issue and add UT for it (#573)

Description
On the Nokia platform, the slot name of the Supervisor is the string "A" instead of a number. Converting it with "int" could cause a backtrace. We should use the slot value for any checking without any conversion. This fixes sonic-net/sonic-buildimage#21131

Motivation and Context
Modify _get_module_info not to convert "slot" to a string value. Also modify the code not to convert the slot value to an int for any checking; it now directly uses the value returned by get_slot(). Also add the UT test_moduleupdater_check_slot_string() to validate it.

How Has This Been Tested?
Tested on 202405 branch
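
The failure mode described above is easy to reproduce in isolation. The sketch below is not the chassisd code; `int_conversion_fails` and `is_supervisor_slot` are hypothetical helpers that only illustrate why comparing the slot value as returned is safer than converting it with int().

```python
def int_conversion_fails(slot):
    """Return True if int(slot) raises, as it does for a slot name such as "A"."""
    try:
        int(slot)
    except (TypeError, ValueError):
        return True
    return False


def is_supervisor_slot(slot, supervisor_slot):
    """Compare slot identifiers exactly as returned, with no type conversion."""
    return slot == supervisor_slot


string_slot_breaks_int = int_conversion_fails("A")
numeric_slot_breaks_int = int_conversion_fails("3")
same_slot = is_supervisor_slot("A", "A")
different_slot = is_supervisor_slot(3, "A")
```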


Signed-off-by: mlok <marty.lok@nokia.com>

* Fix a comment

---------

Signed-off-by: mlok <marty.lok@nokia.com>
Co-authored-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>
Co-authored-by: Gagan Punathil Ellath <gpunathilell@nvidia.com>
Co-authored-by: Ashwin Srinivasan <93744978+assrinivasan@users.noreply.github.com>
Co-authored-by: Marty Y. Lok <76118573+mlok-nokia@users.noreply.github.com>
Co-authored-by: Vivek Verma <137406113+vivekverma-arista@users.noreply.github.com>
Co-authored-by: Patrick MacArthur <pmacarthur@arista.com>
Co-authored-by: Peter Bailey <peterbailey@arista.com>
Co-authored-by: rameshraghupathy <43161235+rameshraghupathy@users.noreply.github.com>
Co-authored-by: Jianquan Ye <jianquanye@microsoft.com>