FW16 EC slows the CPU down #43
Comments
From #41 (comment)
What is it that you are seeing? How do you know the EC is busy doing "something"? Are you looking at the EC console?
I have an EC CCD plugged in, so I see the console port output of the EC.
I expect that there is some I/O blocking going on in the Linux kernel, and rewriting the Linux kernel code a bit might resolve the problem.
Hey, I've noticed similar symptoms when running a different program (a game, Albion Online, though I do use fw-fanctrl, which uses the EC to apply fan curves). For me, it only occurs in the "balanced" and "performance" profiles; switching to "power saver" suddenly makes the system recover. Can you replicate this? Maybe they're the same issue with different triggers?
I think I have determined the cause. The kernel code in drivers/acpi/ec.c appears to be overly complex for what it actually needs to do, and I think that complexity has made it difficult to optimize.
Could you please elaborate on how you came to this conclusion? That would be an interesting find.
ACPI talks to the EC using I/O ports: command = I/O port 0x66, data = I/O port 0x62. From what I can see, the slowdown happens even when no ectool / HCI commands are being sent to the EC. Interestingly, it looks like there is some sort of work queue between ACPI and the writes to the EC.
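To make the port description above more concrete, here is a minimal sketch that decodes a status byte as it would be read from the command port (0x66). The bit layout is taken from the ACPI specification's embedded controller interface, not from this thread, and `decode_ec_status` is a hypothetical helper name. The sketch only interprets a value; it does not touch the ports itself.

```python
# Port numbers as quoted in the comment above.
EC_COMMAND_PORT = 0x66  # reads return the status byte, writes send a command
EC_DATA_PORT = 0x62     # data byte in either direction

# Status register bits per the ACPI embedded controller interface
# (assumption: the Framework EC follows the standard layout).
EC_STATUS_BITS = {
    0x01: "OBF",      # output buffer full: a byte is waiting for the host
    0x02: "IBF",      # input buffer full: EC has not consumed the last write
    0x08: "CMD",      # last byte written was a command, not data
    0x10: "BURST",    # EC is in burst mode
    0x20: "SCI_EVT",  # an SCI event is pending
    0x40: "SMI_EVT",  # an SMI event is pending
}


def decode_ec_status(status: int) -> list[str]:
    """Return the names of the flags set in an EC status byte."""
    return [name for mask, name in EC_STATUS_BITS.items() if status & mask]
```

A host that finds IBF set has to wait before writing the next byte, which is one place where polling on these ports could plausibly turn into blocking.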
One way to observe the EC slowing down the CPU is by doing:
I think these pauses are being caused by ACPI-to-EC comms blocking all the CPU cores.
Actually, unplugging my laptop while following these steps caused KDE to not notice I'd unplugged it at all.
However, after a reboot, I am unable to reproduce, so maybe that was unrelated?
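As a rough, hypothetical way to look for pauses like these from user space (this is not the procedure referred to above), a short loop can record wake-up timestamps and flag any gap that is much longer than the requested sleep. The function names and thresholds are illustrative assumptions, and the sketch only observes the thread it runs in, so it cannot show on its own that every core was blocked.

```python
import time


def find_stalls(timestamps, expected_interval, tolerance):
    """Return (index, gap) pairs where consecutive samples are further
    apart than expected_interval + tolerance."""
    stalls = []
    for i in range(1, len(timestamps)):
        gap = timestamps[i] - timestamps[i - 1]
        if gap > expected_interval + tolerance:
            stalls.append((i, gap))
    return stalls


def sample(duration_s=5.0, interval_s=0.001):
    """Collect monotonic timestamps, sleeping interval_s between samples."""
    stamps = []
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        stamps.append(time.monotonic())
        time.sleep(interval_s)
    return stamps
```

Calling `find_stalls(sample(), 0.001, 0.005)` while plugging or unplugging the power adapter would list wake-ups that arrived more than about 6 ms after the previous one. Ordinary scheduler jitter also shows up here, so the tolerance needs tuning per system.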
@ngraham20, I'd say it looks more like the undermentioned:
Framework 16, AMD Ryzen 7 7840HS using Radeon 780M. I've also tried this on Linux 6.12.22, 6.13.10, 6.14.1 and now 6.15-rc1.
I'm not sure where to report this, but I thought of you because you mentioned something about the EC slowing down the CPU.
I've recently received my PTM7958 from Framework and things are looking good on average. Using amdgpu_top (0.10.4) to watch the temperatures and power/TDP, one thing I noticed (about 2 weeks before PTM7958) was the throttle status flags showing, and I thought this was due to the processor package overheating because of the liquid thermal issue. But after applying the PTM7958 I'm still seeing this, which is weird because the temperatures are well within the safe range.
I rebooted and the flags were gone (as some others mentioned). I did some tests again (PTM7958 related) and thought "hmm, wonder what frequencies are like on battery", so I unplugged the Framework power adapter and the PROCHOT_CPU and PROCHOT_GPU flags showed again. I plugged the Framework power adapter back in and another flag (EDC_CPU) showed for 1 second while the PROCHOT_CPU and PROCHOT_GPU flags were still showing.
I went away from my FW16 and it auto-suspended in that time while amdgpu_top was still running. When it later resumed, the flags were gone, so I tried this numerous times in each power profile (power save, balanced and performance) via KDE PowerDevil.
The EDC_CPU flag, though, only shows when I plug in (never when I unplug) the Framework power adapter while the CPU is idle and the power profile is set to "power save". Again, "power save" in KDE PowerDevil (which communicates with the power profile daemon) in my case also turns off CPB (Core Performance Boost) and sets the iGPU performance level to "low"; in "balanced" or "performance", CPB is on and the iGPU performance level is higher.
There was one instance (after PTM7958 was applied, but it has happened numerous times before PTM7958 and, again, I thought it was the liquid thermal issue) where the flags were showing, I suspended the system overnight, resumed the next morning and, regardless of the power profile, the processor frequency would not go above about 3.8GHz and the temperatures ranged from 70 to 75 degrees Celsius. After a reboot (suspend/resume didn't fix this), frequencies were good again and I've not been able to trigger this again.
Since applying the PTM7958, I've never seen the processor temperature go above 97.8 degrees Celsius (the average is 90.8-97.8, depending on ambient temperature); before, with liquid thermal, it was hitting 100, sometimes 101, many times.
So it seems changing the power state from AC to DC or DC to AC enables these flags, and after suspend/resume or reboot the flags are gone. I'm guessing something is not being configured correctly in the firmware/BIOS/EC before or after AC/DC/suspend/resume/reboot? And with those flags enabled, it's possible the firmware/BIOS/EC elsewhere is doing something which may eventually lead to the FTR issue (#41)? The FTR issue for me is low frequency (weeks).
edit: maybe a bug with amdgpu_top, that only just occurred to me, but there are no open issues and I've been using this software on multiple AMD GPUs for some years. When looking at the frequencies, I use btop and htop too and see similar numbers (btop just shows an average of all the cores).
edit 2: it's just done it again. On battery (down to 59%), plugged in, flags showed, suspended, resumed, flags still showed and the processor frequency wouldn't go above 3.8GHz; suspended and resumed again, flags gone and the frequency still won't go above 3.8GHz. This is while I'm in performance mode, with temps no higher than 81 degrees Celsius (weather quite warm today). I'm now wondering if this frequency issue is to do with the battery discharging. I'll find out later and, if so, maybe it's another issue (BIOS or OS?).
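For anyone wanting to log the frequency ceiling described above without relying on amdgpu_top, btop or htop, a small sketch can read the per-core values the kernel exposes through the cpufreq sysfs interface. It assumes a Linux system where `scaling_cur_freq` files are present under `/sys/devices/system/cpu`, and the helper names are hypothetical. It reports frequencies only and knows nothing about the PROCHOT or EDC flags.

```python
from pathlib import Path


def read_cur_freqs_khz(cpu_root="/sys/devices/system/cpu"):
    """Map each cpuN directory name to its current frequency in kHz."""
    freqs = {}
    for path in Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_cur_freq"):
        cpu = path.parent.parent.name  # e.g. "cpu0"
        try:
            freqs[cpu] = int(path.read_text().strip())
        except (OSError, ValueError):
            continue  # skip cores whose value cannot be read or parsed
    return freqs


def max_freq_ghz(freqs_khz):
    """Highest frequency across all cores in GHz, or None if no data."""
    if not freqs_khz:
        return None
    return max(freqs_khz.values()) / 1_000_000
```

Sampling `max_freq_ghz(read_cur_freqs_khz())` once a second under load, before and after an AC/DC change or a suspend/resume cycle, would show whether the ceiling of about 3.8GHz coincides with those transitions.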
Device Information
System Model or SKU
[ ] Framework Laptop 16 (AMD Ryzen™ 7040 Series)
No dGPU.
BIOS VERSION
3.0.5
Windows:
N/A
Linux:
Open a terminal and run the following command
sudo dmidecode --string bios-version
03.05
DIY Edition information
Memory: Manufacturer and SKU
Kingston Fury Impact: Part Number: KF556S40-32
2x making 64GB total.
Storage: Manufacturer and SKU
Model Number: WD_BLACK SN850X 1000GB
Firmware Version: 620361WD
Port/Peripheral information
Standalone Operation
Are you running your mainboard as a standalone device? Is standalone mode enabled in the BIOS?
Describe the bug
As discovered when I was working on #41, it is possible for the EC (embedded controller) to slow the laptop CPU down.
This happens when the EC is busy doing something.
This issue is a placeholder for investigating the link between the CPU and the EC. What is causing the slowdown? The goal is to fix it so that the EC cannot slow the CPU down unless that is intended.
If anyone can provide ideas of where in the Linux kernel to start looking for the link, it would be helpful.
Some user applications, such as real-time audio and digital audio workstations (DAWs), would benefit because they rely on low latency for audio input, real-time processing and output.
Steps To Reproduce
Steps to reproduce the behavior:
Expected behavior
The EC should not slow the CPU down unless intended to, for example when setting "Power Modes" from "Performance" to "Power Saver".
Screenshots
N/A
Operating System (please complete the following information):
uname -a
6.12.7 <- Mainline compiled kernel.
Additional context
Add any other context about the problem here.