Freeze shortly after booting Linux on MSI Z690-A DDR4 WiFi #977
Comments
Have you tried building the standard Dasharo 1.1.3 binary (you can build the binary from source even if you're not a subscriber) instead of rolling something custom like you are currently doing? https://docs.dasharo.com/unified/msi/building-manual/
Just checked, a standard 1.1.3 build behaves identically to both the official 1.1.1 binary and my 1.1.3 CONFIG_PAYLOAD_SEABIOS=y build. (And, since I now have an EFI build again with which to test it, the freeze also reproduces with DTS)
memtest86 (the PassMark one, version 7.3) fully passes, FreeBSD, OpenBSD, and 9front (!) all work. NetBSD and bunnix cause the same freeze-then-reboot as Linux. I would check Windows, but I don't have access to a Windows boot USB. The last log message the NetBSD kernel prints is "uhid6 at uhidev0 reportid 252: input=63, output=63, feature=0", which (per cross-referencing with some OpenBSD kernel logs I happened to see) is the last motherboard port it's probing for, and would be followed by the case's ports. Bunnix prints some messages about loading the kernel and boot modules, clears the screen, and prints a single "[" before freezing. I'll try to see if I can bisect one of these further.
How are you flashing it? Have you tried flashing the standard 1.1.3 Dasharo binary using MSI FlashBIOS? Did you change any options from their defaults in the setup menu (like disabling ME)? You are literally the first person in two years to report freezes when booting Linux; I don't recall any others. And there is nothing wrong on the base hardware side. What devices do you have connected? I only recall a strange major slowdown or hang/freeze during POST with USB flash drives plugged into any of the 4 USB ports that are in the same column on the back of the motherboard, so avoid those. If you have a PS/2 keyboard, try disconnecting it too, since I recall some models being problematic, but PS/2 had gone through about two tweaking passes by the time of 1.1.3.
Full order of flashes:
All of the Dasharo builds had the same freeze, and the MSI BIOS worked both before I flashed anything and when I reflashed it using FlashBIOS. The only thing I have plugged in is the USB drive I'm booting from, and I've tried putting it in various different USB ports, both on the motherboard and on the case. On occasion I've also had my (non-PS/2) keyboard plugged in, but that hasn't affected the freeze. Honestly, the thing that really confuses me is that FreeBSD works.
Aha! Found a suspicious entry in the FreeBSD dmesg ( I didn't see the 12300T on the HCL, could there be some sort of CPU-specific issue with LAPIC initialization?
The entire LGA 1700 lineup is composed of three different dies: Alder Lake C0 (8P + 8E), Alder Lake H0 (6P, no E), Raptor Lake B0 (8P + 16E). Different models with the same die aren't all that much different from each other, and there are examples of all of them working.
cbmem: https://p.d2evs.net/EvYC~E~M8vDtE.txt I investigated a bit more, and found that Are you sure those are the only three dies? The 12300T is 4P + 0E - I couldn't find anything below the 12400 on the HCL, with 6P + 0E.
I was told that there is a MADT-related patch from a mere two weeks ago, written to fix an ESXi issue, that matches your case exactly: Dasharo/coreboot#538. Don't ask me why you're the first one to report freezes booting Linux due to this in about two years, unless it is a recent regression and you were compiling the latest dev branch or something.
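Since the discussion turns on what the MADT advertises for each CPU, here is a minimal sketch of how one could list the processor entries from a raw MADT dump and compare builds. It is not taken from the patch above. It assumes the standard ACPI layout (36-byte table header, 8 bytes of MADT-specific fields, then type/length entries, with type 0 being a Local APIC and type 9 a Local x2APIC). The function name `parse_madt_cpus` is made up for illustration.

```python
import struct

ACPI_HEADER_LEN = 36  # standard ACPI table header
MADT_FIXED_LEN = 8    # local interrupt controller address + flags


def parse_madt_cpus(table: bytes):
    """Return (kind, apic_id, enabled) for each processor entry in a MADT."""
    if table[:4] != b"APIC":
        raise ValueError("not a MADT (signature should be 'APIC')")
    total_len = struct.unpack_from("<I", table, 4)[0]
    end = min(total_len, len(table))
    cpus = []
    off = ACPI_HEADER_LEN + MADT_FIXED_LEN
    while off + 2 <= end:
        etype, elen = struct.unpack_from("<BB", table, off)
        if elen < 2 or off + elen > end:
            break  # malformed entry: stop instead of looping forever
        if etype == 0 and elen >= 8:
            # Processor Local APIC: ACPI UID (1), APIC ID (1), flags (4)
            _uid, apic_id, flags = struct.unpack_from("<BBI", table, off + 2)
            cpus.append(("lapic", apic_id, bool(flags & 1)))
        elif etype == 9 and elen >= 16:
            # Processor Local x2APIC: reserved (2), x2APIC ID (4), flags (4), UID (4)
            _rsvd, apic_id, flags, _uid = struct.unpack_from("<HIII", table, off + 2)
            cpus.append(("x2apic", apic_id, bool(flags & 1)))
        off += elen
    return cpus
```

On a running Linux system the raw table is usually readable as root from /sys/firmware/acpi/tables/APIC, so something like `parse_madt_cpus(open("/sys/firmware/acpi/tables/APIC", "rb").read())` under the stock BIOS and under each Dasharo build would show whether the set of enabled entries differs.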
Built the dasharo-4.21 branch (which has that patch), partial success. Linux still freezes without Update: it also doesn't reboot anymore, it just stays frozen there indefinitely. Not sure what that means. On the bright side, NetBSD now boots, and the FreeBSD dmesg no longer has that MADT message. Oh, and I have no clue either why I'm the first person with this issue. It happened on the official 1.1.1 binaries, and up until you suggested looking into that patch, I'd only been compiling the 1.1.3 release commit, so it wasn't any sort of recent regression on a dev branch. The only thing left that I can think of is some sort of CPU-specific bug, but... at this point I'm kinda stumped as to what that could be. I guess maybe the next step could be to investigate why fixing the MADT caused Linux to freeze earlier?
Telling Grub to use the stock BIOS's DSDT fixes the freeze as well. Since it might be relevant, the output of
Freeze also happens on booting with maxcpus=7 then hotplugging the 8th core (cpu7) after boot, which makes setting up netconsole much easier. dmesg logs (with acpi.debug_level and acpi.debug_layer both set to 0xffffffff), starting from Note that this is, afaict, identical to the equivalent logs from enabling cpu6 (which doesn't cause the freeze): https://p.d2evs.net/J9vntXA0bokmn.txt Also note that the freeze only happens on enabling cpu7, irrespective of which other cores are online. (Oh, and I tried some magic sysrq keys, none of them have any effect after the freeze)
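For anyone wanting to reproduce the hotplug step, a rough sketch of bringing a CPU online through sysfs follows. It relies on the standard Linux interface /sys/devices/system/cpu/cpuN/online and needs root on a real machine. The helper name `set_cpu_online` and its `sysfs_root` parameter are my own additions (the latter only so the code can be pointed at a scratch directory), not something from this thread.

```python
from pathlib import Path


def set_cpu_online(cpu: int, online: bool = True, sysfs_root: str = "/sys") -> str:
    """Write 1/0 to the CPU's sysfs 'online' file and return what it reads back."""
    node = Path(sysfs_root) / "devices/system/cpu" / f"cpu{cpu}" / "online"
    if not node.exists():
        # cpu0 frequently has no 'online' file because it cannot be offlined
        raise FileNotFoundError(f"{node} is missing")
    node.write_text("1" if online else "0")
    return node.read_text().strip()
```

After booting with maxcpus=7, calling `set_cpu_online(7)` as root would be the equivalent of `echo 1 > /sys/devices/system/cpu/cpu7/online`, which is the step that triggers the freeze described above.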
Dasharo version
1.1.3, built locally (with SeaBIOS as the payload rather than EDK II, since I was having trouble compiling EDK II)
Also tried officially-built 1.1.1, with identical results. Would be happy to try any other builds you provide me if that'd help with debugging; I have tools for external flashing, so I'm not worried about bricking anything
Edit: Also tried a standard 1.1.3 build, with identical results
Hardware
MSI PRO Z690-A DDR4 WiFi
i3-12300T
1x 16GiB G.Skill F4-2666C19-16GIS (in slot A2)
No external GPU, tried with an RX580 as well and it didn't change anything
Full DTS logs available if you'd like them, unsure if they have any sensitive information
Symptoms
SeaBIOS/EDK II and Grub work perfectly fine, and are (as far as I can tell) stable. Around 15s (can measure more precisely if that'd be helpful; it's relatively variable in terms of how far the boot gets) after booting to Linux, the display freezes. If I leave it there for long enough, it eventually reboots.
Debugging steps I've tried so far
I've tried reflashing the stock BIOS in order to check if coreboot somehow caused hardware damage. The stock BIOS still works fine.
I've tried a couple different Linux distributions and bootloaders. Alpine, antiX, and DTS all behaved identically. Syslinux under SeaBIOS froze immediately after loading the kernel, whereas Grub (under both SeaBIOS and EDK II) froze later in the boot process.
I've tried booting with init=/bin/sh and debug on the Linux command line, and running dmesg -w before it freezes, both of which should in theory print kernel logs to the screen. No logs appeared during the freeze; I'm pretty sure it's not a kernel panic. I tried to set up netconsole, but wasn't able to get it to work.
I've tried reading cbmem logs from Grub; I wasn't able to figure out a way to exfiltrate them onto a proper computer, but the brief skim I did didn't show anything that looked egregious. If you can point me towards ways to get those logs out, I'd be happy to share them.
Edit: memtest86, FreeBSD, OpenBSD, and 9front work; NetBSD doesn't. See below for details.
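On the netconsole attempt mentioned in the steps above: the receiving side can be as small as a UDP listener on a second machine. This is only a sketch under assumptions. The function names `open_listener` and `read_messages` are hypothetical, and port 6666 is the kernel's documented default netconsole target port, so adjust it to whatever the netconsole= parameter on the test machine actually specifies.

```python
import socket


def open_listener(port: int = 6666, host: str = "0.0.0.0") -> socket.socket:
    """Bind a UDP socket that netconsole datagrams can be sent to."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def read_messages(sock: socket.socket, count: int, timeout: float = 5.0):
    """Collect up to `count` datagrams, stopping early if none arrive in time."""
    sock.settimeout(timeout)
    messages = []
    for _ in range(count):
        try:
            data, _addr = sock.recvfrom(65535)
        except socket.timeout:
            break
        # Kernel log text is normally ASCII; replace anything undecodable
        messages.append(data.decode("utf-8", errors="replace"))
    return messages
```

Printing each message as it arrives, rather than collecting them, would be the more useful variant for a freeze, since the last lines received are the ones that matter.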
Current suspicions
I'm wondering if perhaps something's up with the DRAM? I could see that leading to pseudorandom crashes, and that's the hardware component I'm least convinced other people have tested before.
I could also see this being some sort of watchdog that's somehow not getting kicked properly, but I'm not familiar enough with x86_64 internals to know how to chase down that thought.
I could also try booting other OSes? Given that it's failing while Linux is running, maybe trying out a BSD would lead to something different?