-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Surface fan module #144
Surface fan module #144
Conversation
Thanks for figuring all that out! I don't have time right now to look into this in detail, but I'll try to do that on the weekend. I'll also try to check if this works on my SB2, but I doubt that the interface exists there since the FAN0/PNP0C0B device is not present on that. Regarding your implementation: For a proper upstreamable module, we will need to figure something out to avoid the Apart from that, I have one very small comment on your code, but I only had a quick glance yet. Again, I'll try to have a look on the weekend and maybe see if I can figure out why it sometimes won't set up correctly. |
Absolutely no rush! I'd really like to control the fan above the 40 degrees point, so I'm going to be exploring some more around that, but it may involve figuring out how to get IRPMon to log during boot, I may just try to hack up that driver and hardcode it for that uart bus. I'm not a Windows developer though so I expect that'll take some time.
Oh interesting... I expected them to have a rule that is something like; if there is a resource in a common bus, the custom driver should still claim that. If we don't want the clash, I think we can easily remove the ACPI bus stuff, in which case I think we can make this into a ssam client driver instead (also, kudo's for the docs!).
👍 no pressure, I'll try to change it to not use the ACPI bus anymore. |
Ok, well, I didn't quite get this working yet. I feel the setup should be very similar to the kernel/drivers/platform/surface/surface_fan.c Lines 242 to 257 in 593ced7
But this seems to never enter the probe, current commit 593ced7 is this
I did a brief survey of strings in some of the |
Unfortunately, that's kind of a thing with the SAM devices. None of them are really auto-discoverable (at least reliably). You need to add a corresponding entry to the device registry. Specifically, you need to add a new
Urgh... that's not a good sign. Essentially, you are correct: PEPs allow overriding/adding ACPI functions with custom drivers. And there currently is no support for this on Linux. (It's a mechanism that Qualcomm uses heavily for their SoCs on Windows and one of the reasons those devices generally don't work out of the box with Linux, and need quite a lot of work). So based on this, I assume the SAM driver is overriding/extending the ACPI spec for the generic fan somehow. So at least it could contain some useful calls. What file did you find the strings in? |
Ah I see! Thanks for that pointer! Can't seem to be able to load that module with an out of tree build as it has some unknown symbol, so I'm building a new kernel now with modifications 🙄 🤞
I tried to find something to look at the available ACPI methods on Windows, but couldn't find anything useful yet. I was hoping there was an 'enable/disable' method accessible through it or something, or even a 'reinit' that we could trigger with irpmon running. But so far haven't even found anything for Windows that allows me to list the acpi devices.
I downloaded the SurfacePro9_Win11_22621_23.103.34762.0.msi file, then used |
It works! 7cbc1fa did exactly what you recommended, after that the fan module correctly registers using the |
I assume it complained while running Also: Nice work! |
Yep, error occurred when running |
Brief end of day update, finally some progress worth sharing. The past few days I've been trying in various ways to get more information from the Windows side of things. The TL;DR of failed attempts:
Like I said in the original description, irpmon claims it can log during boot, but starting it and then rebooting didn't do it for me. I expected nothing told the driver what to trace and was considering forking it and just hardcoding what to record. I first used the procedure described here to get a boot trace to confirm irpmon comes up before the things we're interested in: This showed me that the After some reading in the source code I discovered that it can read settings at boot from the registry. Then, reading the docs on the commandline do point out that flag on the So, the steps to log the UART2 bus with irpmon during windows' boot are the following, use the gui (as administrator) to ensure the driver is setup to start at boot. To enable run the following, written logs go to
Writing happens in chunks, current log may be zero bytes until flushed, or just disable it and reboot. To disable, run:
From the first boot log (50 mb
So But I'm pretty sure something breaks in the
And there's an enormous blob of data in one command... (json is 5 mb);
That payload does contain the magic |
That is very much possible. Maybe I should try and rewrite it from scratch at this point... If I remember correctly, irpmon sometimes also dropped some data however, which didn't make this easier. Based on your description, I assume that it misinterprets/reads the wrong payload length of some command, causing it to |
Yeah, something like that, I saw more than
Those were all seen in the bootlog in the I should have time tomorrow to look into this in more depth. |
Ok, so this is a bit of a let down, but I tried replaying all commands that I recorded as being sent to the fan system when booting Windows. This script, has both the recorded commands as well as the code that should replay them. Unfortunately, I don't achieve the fan control that Windows has above 30C. I wonder if the actual command to enable the override is not sent to the fan category but instead sent to another category. Or perhaps the values we have to send to CID 8 need to change each time. I'm at a loss 😞. I'll probably let this (overriding the fan control) rest for now, there's some other things I want to shift my focus to. We can merge this as is, leave it open in the hopes someone figures it out, or remove the 'set' functionality. |
I guess there could be some subsystem-interplay at work, as you suggest. I assume you played around with the platform profiles and checked whether it behaves differently for e.g. the best-performance one?
Sure, no worries.
I would vote for removing or guarding the |
Yes, that always works 1
Yeah, I tried playing around with many things, including this. This does make me wonder about my original statement;
I'm going to do a more rigorous analysis of this, that should allow us to conclusively state if the fan's speed or ramp is affected by the fan profile as sent down. Footnotes
|
The SAM 33/34 CIDs are D0 exit/entry commands (the request parameters we know are documented here) and generally sent when the device enters or exits some kind of sleep state. I'm not entirely sure what those do, but 15/16 are similar commands and silence SAM/EC events (at least on my SB2). So quite likely something is shut down or silenced on the EC with those as well. So I would say that that part is expected. |
Ok, well, clear difference when doing a properly set-up test.
The log_sensors.py script is used to record data, it also contains capture of the irp log from WIndows when switching between the three profiles, from that we obtain:
Plots with this script. Test 1, fan best
Recorded data sensors_log_platform_profile_performance_fan_best_2023_12_10__13_33.json.gz. I cancelled the
Test2, fan normalAfter roughly a 30 minute cooldown period.
Recorded data; platform_profile_performance_fan_normal_2023_12_10__14_30.json.gz. I cancelled the stress invocation later here, because I was more like; well, it should be able to handle this. Windowsedit; Well, I don't know how to get hard numbers on windows really. In best performance mode, using cpustress with 12 workers at 100% utilisation does not result in the fan going at max RPM in a test duration of 15 minutes. Rebooting from this hot state to linux I did not notice a change in fan speed, fan was going at about 5500 and decreasing. I expect Windows throttles the cpu to stay below the firmware's overheat handling? ComparisonIf we overlay the two fan profiles: We see that the fan: best profile has a much higher plateau, and the first 'overheat event' where the fan spikes to max and the CPU gets drastically throttled is later in the recording. We should probably add sending this to the fan module when we switch profiles from the platform profile module? |
I went with this for now, in 99e90e7 that is implemented, with this as a parameter is allows for easy experimentation. In hwmon we can hide the
For the cooling device the best we can do is giving a permission error I think:
I considered completely removing the registration for the thermal cooling device, but the ACPI fan is also in there and non functional, so it probably doesn't hurt. This way the fan can still be inspected through |
Ok, I'm back with more analysis, I did a bunch of tests with IRPMon logging running on windows. I think Windows does not command the fan directly, it merely sets the fan profile and that makes it appear to do a better job than linux. Basically, I did three tests:
Plotting the FAN rpms for these measurements, and the original two from linux: The fan rise curve is of identical slope in Windows when compare to Linux if the fan is in the 'performance mode / best' performance profile. From the battery test, we only depict the section on AC power here. Markers on the lines denote a message with that information from the IRPMon recording, so relatively sparse. So, to my surprise, neither of the AC-only tests had CID 11 in it. The test that started on the battery did have that command in it, so I did a more thorough analysis of that. Here's an annotated graph of that recording, including the section on battery power:
Based on these findings, I propose the following:
That should make it clear there's only inspection functionality to all systems and users. It's unfortunate we can't offer more control, but by setting the fan profile we should at least be on par with Windows' cooling capacity. |
Nice work!
AFAIK those values are just divided by 100 to get °C. On older devices, those are integrated into ACPI via the SAN driver (directly as I think your plan sounds good. There's the question of how to deal with the platform profile integration though. In particular dealing with the difference between devices that have a fan and ones that do not (ideally, I don't want to rely on querying that by just seeing if the command fails). We might have to handle that via some device property and set that in the registry. |
Oh nice! I may circle back to this when I get around to making a
The last four commits implement this, the module is now very lean. I updated the
Yes, I like not performing the request if we don't know if we have a fan 👍 Took a bit of a cursory look, I don't have a full grasp of how the systems all interact, so I may be totally of the mark here. Initially I thought in But then I read some more and looked at way things are defined in the surface registry, like the platform profile, that |
For upstreaming, running |
Correct. That looks like the most reliable option. I'd suggest And as @StollD said: Running |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of smaller comments.
One thing I also want to mention: In the end (like right at the end after all changes are done/reviewed) it would be great if you could squash the changes into two commits: One for the registry stuff and one for the actual driver + KConfig etc. (with proper upstream-conform commit messages). That's then more or less the package that you can submit upstream.
6f27668 Incorporates changes proposed by
Thanks for the tip! |
Happy to squash and split them out when we're done iterating, until then I'll keep the commits separate for a clear changelog & review. |
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Add an entry for the fan speed function. Add this new entry to the Surface Pro 9 group. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Patchset: surface-sam
Adds a driver that provides read only access to the fan speed for Microsoft Surface Pro devices. The fan speed is always regulated by the EC and cannot be influenced directly. Signed-off-by: Ivor Wanders <ivor@iwanders.net> Link: linux-surface/kernel#144 Patchset: surface-sam
Updated Description / Summary
This pull request has quite a lengthy amount of comments with analysis, updating the description with a short recap to get people from the future up to speed:
Original description:
First of; thanks for the great work to support Linux on these devices, I wouldn't have bought a Surface had it not been for this project.
I ended up compiling a kernel on my new Surface and found that it got quite warm. After seeing a few issues about thermal and fans without real resolutions I decided to investigate this over the course of this week and weekend.
A coarse write up of my analysis can be found in this file, roughly, my current understanding is the following:
Quiet
andOverride
. Default isQuiet
and it boots with that.Override
and commands the fan with CID 11, verified by heating up the tablet on Linux and rebooting into Windows, this makes the fan speed up. Footnote [1].I tried to write a kernel module that does the following:
PNP0C0B
device, but we don't interact with the ACPI bus, this is merely to ensure systems don't think there's two fans. This conflicts with thefan
module that's loaded by default (so that has to be unloaded for this to work).The value set with CID 11 unfortunately gets overwritten as soon as the surface tablet temperature exceeds roughly 40C and the onboard controller kicks in. Big disappointment there, I think the fan profile needs to be switched, but I cannot figure out how to achieve that, I tried sending integers to various CID's to no avail. I tried to setup Irpmon to sniff the commands sent at boot, but I could not get that working. I only discovered this after I basically finished this module. But I'm not sure now how much value there is in providing the functionality to set the value at the moment. As long as the device stays below 40 degrees C we can control the fan, above that we lose control. I'm really not sure what else I can do to try to figure out how to change the profile, hopefully someone else has an idea.
About the actual PR
Footnote [1]: CID 11 captured from Windows.
Footnote [2]: CID 14 together with the platform profile
Tested on Surface Pro 9, i7-1255U, Windows 11 & NixOS 23.11 with kernel from this repo.