[Mellanox] Always restart thermalctld on Mellanox platform when it exits #24

Closed
Junchao-Mellanox wants to merge 1 commit from restart_thermalctld_for_mlnx


@Junchao-Mellanox Junchao-Mellanox commented Sep 15, 2020

- Why I did it

On the Mellanox platform, part of thermalctld's job is to handle user-space thermal policies for events such as fan or PSU removal. It works together with the kernel thermal algorithm to make sure the switch does not overheat.

Recently, we found that commit sonic-net@cbc75fe changed the autorestart configuration of thermalctld in supervisord, so it is no longer restarted automatically after being killed. This PR makes sure that thermalctld is always restarted on the Mellanox platform when it is killed.

- How I did it

  1. Add a variable "always_restart_thermalctld" to pmon_daemon_control.json.
  2. In docker-pmon.supervisord.conf.j2, check the variable "always_restart_thermalctld" and set the autorestart configuration for thermalctld accordingly.
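
The two steps above can be sketched roughly as follows. This is an illustration only, not the actual template change: the helper names are hypothetical, the fallback value `unexpected` is an assumption about what thermalctld uses when the flag is absent, and the real logic lives in the Jinja2 template rather than in Python. Only the variable name "always_restart_thermalctld" and the supervisord `autorestart` option come from this PR.

```python
import json


def thermalctld_autorestart(daemon_control: dict) -> str:
    """Pick the supervisord autorestart value for thermalctld.

    Hypothetical helper mirroring the check the template is described
    as doing on "always_restart_thermalctld".
    """
    # Assumption: a missing flag means the platform does not opt in.
    if daemon_control.get("always_restart_thermalctld", False):
        # supervisord restarts the process on every exit.
        return "true"
    # Assumption: otherwise keep a non-unconditional restart policy.
    return "unexpected"


def render_thermalctld_section(daemon_control: dict) -> str:
    """Render only the lines relevant to this PR, not the full section."""
    return "\n".join([
        "[program:thermalctld]",
        f"autorestart={thermalctld_autorestart(daemon_control)}",
    ])


# Example contents of pmon_daemon_control.json for a platform that opts in.
mellanox_control = json.loads('{"always_restart_thermalctld": true}')
# A platform that does not define the variable at all.
other_control = json.loads("{}")

print(render_thermalctld_section(mellanox_control))
print(render_thermalctld_section(other_control))
```

Keeping the decision behind a per-platform flag means platforms that do not set the variable keep whatever restart behavior they had before.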

- How to verify it

Manual test

- Which release branch to backport (provide reason below if selected)

Depends on which release branches sonic-net@cbc75fe is merged into.

  • 201811
  • 201911
  • 202006

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

@Junchao-Mellanox Junchao-Mellanox changed the title Always restart thermalctld on Mellanox platform when it exits [Mellanox] Always restart thermalctld on Mellanox platform when it exits Sep 15, 2020
@Junchao-Mellanox
Owner Author

sonic-net#5375

@Junchao-Mellanox Junchao-Mellanox deleted the restart_thermalctld_for_mlnx branch December 15, 2020 01:42
Junchao-Mellanox pushed a commit that referenced this pull request Jul 1, 2021
Advance submodule update with the following changes:
4475750 Config reload fix (#29)
cf60d5e [ci]: add proper azp (#26)
f0fbfe7 [CI] Set up CI with Azure Pipelines (#25)
879d7bd Include port default fec configuration to be included in ZTP configuration (#24)
a6ae955 Add a pre-defined plugin to download a list of files (#23)
6f0305b [MultiDB] Add multidb support to sonic-ztp (#16)
Junchao-Mellanox pushed a commit that referenced this pull request Mar 9, 2022
ce72b0d Longxiang Lyu Thu Feb 24 06:05:12 2022 Put handler member functions as virtual in base (#30)
ef59e4f Jing Zhang Fri Feb 25 11:38:28 2022 Incrementing tolerance on mux state inconsistency (#27)
2d12892 Longxiang Lyu Wed Feb 16 03:32:06 2022 Rename LinkManagerStateMachine to ActiveStandbyStateMachine (#26)
f38634c Jing Zhang Thu Feb 17 17:23:56 2022 Update log level for mux probing and mux state chance (#23)
a8434dd Jing Zhang Thu Feb 17 17:21:01 2022 Handle xcvrd crashing scenarios (#22)
2ebdb2b Longxiang Lyu Mon Feb 14 13:26:07 2022 [make] Enable make extra includes (#24)
Junchao-Mellanox pushed a commit that referenced this pull request Mar 14, 2022
Changes:

Update submodule branch to 202012
[sonic-linkmgrd][202012] submodule update

a8ddff5 Jing Zhang Fri Feb 25 11:38:28 2022 Incrementing tolerance on mux state inconsistency (#27)
a3f78a3 Jing Zhang Thu Feb 17 17:23:56 2022 Update log level for mux probing and mux state chance (#23)
05156fb Jing Zhang Thu Feb 17 17:21:01 2022 Handle xcvrd crashing scenarios (#22)
74529ef Longxiang Lyu Mon Feb 14 13:26:07 2022 [make] Enable make extra includes (#24)

sign-off: Jing Zhang zhangjing@microsoft.com