Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[chassis][pmon][chassid] Enhance the chassid module on-line or off-line log messages with physical slot number #530

Merged
merged 2 commits into from
Aug 15, 2024

Conversation

mlok-nokia
Copy link
Contributor

Description

Per MSFT request, we enhance the module on-line and off-line messages with physical slot number

Before enhance:
pmon#chassisd: Module LINE-CARD0 is on-line!
pmon#chassisd: Module LINE-CARD0 went off-line!
pmon#chassisd: Module LINE-CARD0 recovered on-line!
pmon#chassisd: Module FABRIC-CARD6 is on-line!

After Enhance:
pmon#chassisd: Module LINE-CARD0 (Slot 1) is on-line!
pmon#chassisd: Module LINE-CARD0 (Slot 1) went off-line!
pmon#chassisd: Module LINE-CARD0 (Slot 1) recovered on-line!
pmon#chassisd: Module FABRIC-CARD6 (Slot 7) is on-line!

This change is required by 202205 branch also.

Motivation and Context

Per MSFT's request -- In order to easily identify a module within a chassis, we need to report the physical slot number whenever we raised “module” related syslogs.

How Has This Been Tested?

Tested with Master branch and 202205 branch.

Additional Information (Optional)

…ne with physical slot num

Signed-off-by: mlok <marty.lok@nokia.com>
@mlok-nokia
Copy link
Contributor Author

@gechiang @judyjoseph This PR enhances the chassis module on-line and off-line message with physical slot info. Please review it. Thanks

@mlok-nokia
Copy link
Contributor Author

mlok-nokia commented Aug 7, 2024

@gechiang @judyjoseph I just found this PR #480 have not be merged to 202205 branch while other PRs which are associated with PR #480 have been merged to 202205. Please merge this PRs first.

@gechiang
Copy link
Contributor

gechiang commented Aug 7, 2024

@gechiang @judyjoseph I just found this PR #480 have not be merged to 202205 branch while other PRs which are associated with PR #480 have been merged to 202205. Please merge this PRs first.

@mlok-nokia the PR you quoted is no longer allowed for public 202205 branch for this submodule as it is a feature enhancement and not a bug fix and 202205 release owner stopped accepting changes. We do have internal build that takes it in as a patch already so we are good.

Copy link
Contributor

@gechiang gechiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add the same change for the 4 syslog related to Midplane connectivity issues at line 455, 457, 459, and 465.
Also on line 556 for module down long time please also add the slot info as well.
Thanks!

@mlok-nokia
Copy link
Contributor Author

mlok-nokia commented Aug 7, 2024

Can you also add the same change for the 4 syslog related to Midplane connectivity issues at line 455, 457, 459, and 465. Also on line 556 for module down long time please also add the slot info as well. Thanks!

@gechiang I just made the change and tested it. Please review it again.

@mlok-nokia
Copy link
Contributor Author

@gechiang @judyjoseph I just found this PR #480 have not be merged to 202205 branch while other PRs which are associated with PR #480 have been merged to 202205. Please merge this PRs first.

@mlok-nokia the PR you quoted is no longer allowed for public 202205 branch for this submodule as it is a feature enhancement and not a bug fix and 202205 release owner stopped accepting changes. We do have internal build that takes it in as a patch already so we are good.

Ok. Thanks.

@mlok-nokia
Copy link
Contributor Author

@gechiang Do we need this PR for 202205 branch?

@rlhui
Copy link

rlhui commented Aug 7, 2024

can we also have corresponding test for this?

@mlok-nokia
Copy link
Contributor Author

can we also have corresponding test for this?

There are already testcases. But not for checking logging message.

@gechiang
Copy link
Contributor

gechiang commented Aug 7, 2024

@gechiang Do we need this PR for 202205 branch?

yes. but we will do it internally as officially public 202205 is closed... So once we have this merged, we will pick it up in our next internal build. For community who also needs this for testing on 202205 MSFT repo will also have to manually patch it in their own internal 202205 based builds...

Copy link
Contributor

@gechiang gechiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Thanks!

@arlakshm
Copy link
Contributor

@kenneth-arista for viz..

@arlakshm
Copy link
Contributor

can we add some unit test?

@mlok-nokia
Copy link
Contributor Author

mlok-nokia commented Aug 12, 2024

can we add some unit test?
There are UTs for this section of code. But it is just not able check the log texts. For this module, adding any new line of code requires to add UT for it. Otherwise, it will fail to UT verification during image build.

@gechiang
Copy link
Contributor

All comments seem to have been answered already. Any other comments? @kenneth-arista any comments?
Thanks!

@rlhui rlhui added the P0 label Aug 14, 2024
Copy link

@kenneth-arista kenneth-arista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me to add slot info to the syslog messages.

@gechiang gechiang requested a review from rlhui August 14, 2024 20:47
@gechiang
Copy link
Contributor

@arlakshm , please help sign off on this PR.
@rlhui , please help merge this PR.
Thanks!

@judyjoseph
Copy link
Contributor

@mlok-nokia a general comment -- there are a few more places in the chassisd script where we just have the Module {} alone, should we add Slot {} there too for uniform logs -- not critical but @gechiang f.y.i

@rlhui rlhui merged commit b3189e3 into sonic-net:master Aug 15, 2024
5 checks passed
@mlok-nokia
Copy link
Contributor Author

@mlok-nokia a general comment -- there are a few more places in the chassisd script where we just have the Module {} alone, should we add Slot {} there too for uniform logs -- not critical but @gechiang f.y.i

It seems like there are only two messages which contains Module {}. I will take a look if it is feasible to add the slot for since one of them complains Module {} with hostname while another is logging for CLI command "config chassis module shutdown "

mssonicbld pushed a commit to mssonicbld/sonic-platform-daemons that referenced this pull request Aug 23, 2024
…ne log messages with physical slot number (sonic-net#530)

* [chassis][pmon][chassid] Enhance the chassid module on-line or off-line with physical slot num

---------

Signed-off-by: mlok <marty.lok@nokia.com>
@mssonicbld
Copy link
Collaborator

Cherry-pick PR to 202405: #534

mssonicbld pushed a commit that referenced this pull request Aug 24, 2024
…ne log messages with physical slot number (#530)

* [chassis][pmon][chassid] Enhance the chassid module on-line or off-line with physical slot num

---------

Signed-off-by: mlok <marty.lok@nokia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

8 participants