psu daemon doesn't update PSU FAN information #136
Comments
@jleveque raised this issue since it is common for all vendors. |
@keboliu, @Junchao-Mellanox: Can you please comment on this? |
Will check. |
Hi @aravindmani-1, I checked the latest master, it looks ok:
In psud, this function https://github.com/Azure/sonic-platform-daemons/blob/e6c786bd02e6a253fcf67dec895e7c24e940c788/sonic-psud/scripts/psud#L585 is going to set the PSU fan led status, could you check whether this function is called? For the led status "Updating", we might need to adjust system health code to align with this. |
Hi @Junchao-Mellanox , Currently, it is expected for Dell platforms to return None for psu fan led status. |
I am a little bit confused. System health does not check the PSU fan LED status, so why do you need its value to be "True"? The LED status should be "green", "red", or "Updating". Maybe you are talking about another field? And if you want system health to skip the check for the PSU, you can simply change the system health configuration to something like: {
"services_to_ignore": [],
"devices_to_ignore": ["psu"],
"user_defined_checkers": [],
"polling_interval": 60,
"led_color": {
"fault": "amber",
"normal": "green",
"booting": "orange_blink"
}
} |
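To illustrate how a `devices_to_ignore` entry like the one above could be consulted, here is a minimal sketch. The configuration keys are copied from the comment; the helper `is_ignored` is hypothetical and is not the actual system health implementation.

```python
import json

# Configuration copied from the comment above.
CONFIG_TEXT = """
{
    "services_to_ignore": [],
    "devices_to_ignore": ["psu"],
    "user_defined_checkers": [],
    "polling_interval": 60,
    "led_color": {
        "fault": "amber",
        "normal": "green",
        "booting": "orange_blink"
    }
}
"""


def is_ignored(device_category, config):
    """Hypothetical helper: True if the category is listed in devices_to_ignore."""
    return device_category in config.get("devices_to_ignore", [])


config = json.loads(CONFIG_TEXT)
print(is_ignored("psu", config))  # the "psu" category is listed
print(is_ignored("fan", config))  # the "fan" category is not listed
```

Whether ignoring "psu" also suppresses the check for the PSU fan entry depends on how the real checker maps devices to categories, which this sketch does not model.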
The system health daemon checks for Fan status in the below piece of code: Whereas in the psu daemon, we update the fan status to "Updating" / "N/A". The issue will be seen on platforms where thermalctld is not enabled. Logs:
You can see that the status is set to "Updating"; the system health daemon reads this field and logs an error message ("PSU Fan is broken"). |
I see. You are talking about "status", not "led_status". I suppose your approach is good to me: set the presence value to False by default. |
Thanks @Junchao-Mellanox . |
- Initialize self.presence and other variables in PsuStatus dunder init to False instead of True. - Import datetime module. - Discussions related to this issue can be seen in #136
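One plausible reading of why the default matters, as a rough sketch: if a status holder only reports a change when the new value differs from the stored one, then starting from True hides the first "present" reading, while starting from False surfaces it. The class below is hypothetical and only mirrors the idea; it is not the actual `PsuStatus` code.

```python
class StatusSketch:
    """Hypothetical stand-in for a PSU status holder (not the real PsuStatus)."""

    def __init__(self, initial_presence=False):
        # Assumption: the fix corresponds to starting from False here.
        self.presence = initial_presence

    def set_presence(self, presence):
        """Store the new value and return True only if it changed."""
        if presence == self.presence:
            return False
        self.presence = presence
        return True


old_style = StatusSketch(initial_presence=True)
new_style = StatusSketch(initial_presence=False)

# First reading from a PSU that is physically present:
print(old_style.set_presence(True))  # no change is reported
print(new_style.set_presence(True))  # the change is reported
```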
@aravindmani-1: Thank you for your help. I have merged #137. Once confirmed that the issue is resolved, please close this issue. |
@jleveque : Fix is not merged in master image yet. |
Submodule update here: sonic-net/sonic-buildimage#6352 |
…et#136) sonic-platform-base: Changes to introduce APIs for modular chassis for power-consumption and supplied HLD: sonic-net/SONiC#646 PSUd APIs for power requirement calculations get_maximum_supplied_power() - per PSU get_status_master_led() - get master psu led status. Class method. set_status_master_led() - set master psu led status. Class method. get_maximum_consumed_power(self) - per consumer API. Consumers are modules, Fans
Issue:
Steps to reproduce
Logs:
root@sonic:~# redis-cli -n 6 hgetall "FAN_INFO|PSU1 Fan"
1) "led_status"
2) "None"
To fix this issue:
root@sonic:/# redis-cli -n 6 hgetall "FAN_INFO|PSU1 Fan"
1) "presence"
2) "True"
3) "status"
4) "Updating"
5) "direction"
6) "exhaust"
7) "speed"
8) "67"
9) "timestamp"
10) "20201223 01:48:55"
11) "led_status"
12) "None"
One more issue is that the PSU fan status values written by the psu daemon are "Updating" / "N/A".
In the system health daemon, the expected values for the PSU fan status are "True" / "False".
Thermalctld also updates the PSU fan status value to "True" / "False".
So, we need your input on whether we can change the PSU fan status in the psu daemon to return "True" / "False" instead of "Updating" / "N/A".
If this is not done, a "PSU Fan is broken" error will be logged in redis-db for the system health table.
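The mismatch described above can be sketched as follows. The field values are taken from the redis output in this issue; the function `fan_status_ok` is a hypothetical simplification that assumes the checker treats only the string "True" as healthy, and it is not the system health daemon's actual code.

```python
def fan_status_ok(fields):
    """Hypothetical check: only the string "True" counts as a healthy fan."""
    return fields.get("status") == "True"


# Entry as written by the psu daemon (values from the redis output above).
psud_entry = {
    "presence": "True",
    "status": "Updating",
    "direction": "exhaust",
    "speed": "67",
    "timestamp": "20201223 01:48:55",
    "led_status": "None",
}

# Same entry with the "True"/"False" style status that thermalctld writes.
thermalctld_style_entry = dict(psud_entry, status="True")

print(fan_status_ok(psud_entry))               # "Updating" is not "True"
print(fan_status_ok(thermalctld_style_entry))  # "True" passes the check
```

Under this assumption, any placeholder such as "Updating" or "N/A" is indistinguishable from a real fault, which matches the "PSU Fan is broken" symptom reported here.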