Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[issue] Manul fan control is flaky #25

Open
huangalang opened this issue May 18, 2023 · 8 comments
Open

[issue] Manul fan control is flaky #25

huangalang opened this issue May 18, 2023 · 8 comments

Comments

@huangalang
Copy link

hello users:

we are noticing there's a chance that set pwm value via FanSensor property will fail
and it's easy to reproduce the issue

/test log as bellow/

===================================================================
root@test-machine:/sys/bus/i2c/devices/15-002f/hwmon/hwmon4# busctl set-property xyz.openbmc_project.FanSensor /xyz/openbmc_project/control/fanpwm/fan0_pwm xyz.openbmc_project.Control.FanPwm Target t 75
root@test-machine:/sys/bus/i2c/devices/15-002f/hwmon/hwmon4# cat pwm1
75

root@test-machine:/sys/bus/i2c/devices/15-002f/hwmon/hwmon4# busctl set-property xyz.openbmc_project.FanSensor /xyz/openbmc_project/control/fanpwm/fan0_pwm xyz.openbmc_project.Control.FanPwm Target t 90
root@test-machine:/sys/bus/i2c/devices/15-002f/hwmon/hwmon4# cat pwm1
75 -> issue

it should not happen because usually resp is updated properly via following line
https://github.com/openbmc/dbus-sensors/blob/master/src/PwmSensor.cpp#L132

but we found when the issue happens , the program will match following condition checking
https://github.com/openbmc/dbus-sensors/blob/master/src/PwmSensor.cpp#L127

and when it happens , the pwm value can be passed down to sysfs attribute
does anyone come accross the same issue?

thanks

@chaul-ampere
Copy link
Contributor

I see that you seemed to be using an i2c fan controller driver. In the past, we noticed that the driver shows in /sys/*/pwm1 the current PWM that the fan is running at, instead of the PWM that we have echoed to it (two different registers for the purposes). So pwm1 in sysfs does not display right away what we just set (it changes with delay step by step, specially with large gap). This when combined with dbus-sensors fansensor mechanism of setting and checking causes an issue that when you set to Target a PWM that you have just set before the current Target value, it will fail to write to sysfs. If you echo directly to pwm1 a value a lot larger or smaller than the current pwm, and cat it and see that it increases or decreases slowly towards the goal, then we are facing the same prob.

@huangalang
Copy link
Author

@chaul-ampere

yes we are using i2c fan controller driver (MAX31790ATI)
not quite understand what you described , in our case , we haven't run into the problem setting pwm via sysfs directlybut if we set pwm via dbus property , the setting will be blocked in the condition checking in following link
https://github.com/openbmc/dbus-sensors/blob/master/src/PwmSensor.cpp#L127

it seems that our case is a different story from yours
however, thanks for your sharing , we will try to run the scenario you described and see what will happen

@chiuyikai24
Copy link

I see that you seemed to be using an i2c fan controller driver. In the past, we noticed that the driver shows in /sys/*/pwm1 the current PWM that the fan is running at, instead of the PWM that we have echoed to it (two different registers for the purposes). So pwm1 in sysfs does not display right away what we just set (it changes with delay step by step, specially with large gap). This when combined with dbus-sensors fansensor mechanism of setting and checking causes an issue that when you set to Target a PWM that you have just set before the current Target value, it will fail to write to sysfs. If you echo directly to pwm1 a value a lot larger or smaller than the current pwm, and cat it and see that it increases or decreases slowly towards the goal, then we are facing the same prob.

It appears to exhibit some similarities, and we are currently conducting continuous experimentation to determine if we can replicate the same conditions.
Additionally, I would like to inquire if you have any reference or recommended solutions for this issue?

@chaul-ampere
Copy link
Contributor

I see that you seemed to be using an i2c fan controller driver. In the past, we noticed that the driver shows in /sys/*/pwm1 the current PWM that the fan is running at, instead of the PWM that we have echoed to it (two different registers for the purposes). So pwm1 in sysfs does not display right away what we just set (it changes with delay step by step, specially with large gap). This when combined with dbus-sensors fansensor mechanism of setting and checking causes an issue that when you set to Target a PWM that you have just set before the current Target value, it will fail to write to sysfs. If you echo directly to pwm1 a value a lot larger or smaller than the current pwm, and cat it and see that it increases or decreases slowly towards the goal, then we are facing the same prob.

It appears to exhibit some similarities, and we are currently conducting continuous experimentation to determine if we can replicate the same conditions. Additionally, I would like to inquire if you have any reference or recommended solutions for this issue?

If you have the same prob as ours (we are using MAX31790 i2c fan controller), then the Target property of "/xyz/openbmc_project/control/fanpwm/x" is showing the PWM at which the fans are currently running (from sysfs pwm1). Because dbus-sensors alr has the Value property under "/xyz/openbmc_project/sensors/fan_pwm/x" paths to indicate that, we don't want Target to have the same meaning. We added a patch in the driver to change to using the other register MAX31790_REG_PWMOUT to let sysfs pwm1 show what we echoed to it, instead of MAX31790_REG_PWM_DUTY_CYCLE.

@chiuyikai24
Copy link

I see that you seemed to be using an i2c fan controller driver. In the past, we noticed that the driver shows in /sys/*/pwm1 the current PWM that the fan is running at, instead of the PWM that we have echoed to it (two different registers for the purposes). So pwm1 in sysfs does not display right away what we just set (it changes with delay step by step, specially with large gap). This when combined with dbus-sensors fansensor mechanism of setting and checking causes an issue that when you set to Target a PWM that you have just set before the current Target value, it will fail to write to sysfs. If you echo directly to pwm1 a value a lot larger or smaller than the current pwm, and cat it and see that it increases or decreases slowly towards the goal, then we are facing the same prob.

It appears to exhibit some similarities, and we are currently conducting continuous experimentation to determine if we can replicate the same conditions. Additionally, I would like to inquire if you have any reference or recommended solutions for this issue?

If you have the same prob as ours (we are using MAX31790 i2c fan controller), then the Target property of "/xyz/openbmc_project/control/fanpwm/x" is showing the PWM at which the fans are currently running (from sysfs pwm1). Because dbus-sensors alr has the Value property under "/xyz/openbmc_project/sensors/fan_pwm/x" paths to indicate that, we don't want Target to have the same meaning. We added a patch in the driver to change to using the other register MAX31790_REG_PWMOUT to let sysfs pwm1 show what we echoed to it, instead of MAX31790_REG_PWM_DUTY_CYCLE.

Thank you, it does indeed appear to be the problem as you described.
Here's the test log:

busctl set-property xyz.openbmc_project.FanSensor /xyz/openbmc_project/control/fanpwm/fan0_pwm xyz.openbmc_project.Control.FanPwm Target t 49
cat pwm1 ->185
cat pwm1 ->120
cat pwm1 -> 54
cat pwm1 -> 49

https://github.com/AspeedTech-BMC/linux/blob/aspeed-master-v5.15/drivers/hwmon/max31790.c#L110

Thank you for providing your excellent suggestion. It is indeed true that there is a delay issue in reading the register values for fan control in MAX31790. Following your advice, we were using MAX31790_REG_PWMOUT for reading, which effectively resolved this problem. We have also successfully tested it on the machine.

@chaul-ampere
Copy link
Contributor

I'm glad it helped. It's our temporary solution and we have not yet figured out what to do more about this issue or should the choice between the two regs be configurable in the device tree, so we just left it there waiting for progress from outside.

@chiuyikai24
Copy link

I'm glad it helped. It's our temporary solution and we have not yet figured out what to do more about this issue or should the choice between the two regs be configurable in the device tree, so we just left it there waiting for progress from outside.

Currently, this approach would be the best temporary solution. I believe it is more closely related to the fan control of the IC itself, but we can only wait and see if there are any external updates. Thank you once again.

@huangalang
Copy link
Author

@chaul-ampere
thanks for your advice , as you described that we need to either modify driver behaviour of setting/getting pwm value or modify dbus-sensor , fan sensor behaviour , you approach it by modifying the driver , however we are not sure
that's the right solution for us , we got to think it through

thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants