amd64 compiler bug #1111
Comments
Thanks @ncruces, as always, for the detailed explanation of what's going on. This does sound like a very weird bug. I'll look into it in a day or so (wish me luck reproducing it locally on my Windows machine 😄) |
Oh, one thing I'd like you to do, @ncruces, is run wazero's unit tests locally on the machine experiencing the bug. It would be helpful if we could find a failing test case. |
OK, the bad news is that I failed to reproduce it on my Windows machine 😞
The following is the system info:
|
One thing I guess might be related is issue #580 |
@ncruces could you run the following command on your Windows machine so that I can see which variant of x86 processor you have?
|
Wow, lots of leads to follow through. Great, thanks! I'll start with the easy ones,
And
Your suspicion of a CPU issue (which makes a lot more sense) got me thinking: can I run this on Linux on this machine? Bare metal is kinda hard, but I do have WSL installed, so I built the Linux binary and ran it on WSL, and: same issue. So the same Windows and Linux binaries are OK on other computers, and show the same consistent bug on this computer (on both Windows and WSL). More than likely this is indeed a CPU issue. I'll change the thread title to reflect that. |
CPU features above as a diff:
--- ncruces 2023-02-09 10:00:00
+++ mathetake 2023-02-09 10:00:00
@@ -1,37 +1,52 @@
+ADX
AESNI
AVX
-AVXSLOW
+AVX2
+BMI1
+BMI2
CLMUL
CMOV
CMPXCHG8
CX16
ERMS
F16C
FLUSH_L1D
+FMA3
FXSR
FXSROPT
+HLE
HTT
-HYPERVISOR
-IA32_ARCH_CAP
IBPB
LAHF
+LZCNT
MD_CLEAR
MMX
+MOVBE
+MPX
NX
OSXSAVE
POPCNT
RDRAND
+RDSEED
RDTSCP
+RTM
+RTM_ALWAYS_ABORT
+SGX
SPEC_CTRL_SSBD
+SRBDS_CTRL
SSE
SSE2
SSE3
SSE4
SSE42
SSSE3
STIBP
SYSCALL
SYSEE
+VMX
X87
+XGETBV1
XSAVE
+XSAVEC
XSAVEOPT
+XSAVES |
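A note on why the missing LZCNT line in the diff above could matter. LZCNT is encoded as a prefixed BSR, and processors without LZCNT execute that encoding as plain BSR, which returns the index of the highest set bit rather than the count of leading zeros. The sketch below only models the two behaviours with Go's standard library. The function names are hypothetical, and whether this is exactly the code path wazero's compiler takes is an assumption, not something confirmed in this thread.

```go
package main

import (
	"fmt"
	"math/bits"
)

// lzcnt32 models the LZCNT instruction: the number of leading zero bits,
// with 32 returned for a zero input. (Hypothetical helper name.)
func lzcnt32(x uint32) uint32 {
	return uint32(bits.LeadingZeros32(x))
}

// bsr32 models BSR: the index of the highest set bit. The hardware result
// is undefined for a zero input; this model returns 0 as a placeholder.
func bsr32(x uint32) uint32 {
	if x == 0 {
		return 0
	}
	return uint32(bits.Len32(x) - 1)
}

// clzViaBSR32 computes count-leading-zeros using only BSR semantics,
// handling zero explicitly because BSR does not define that case.
func clzViaBSR32(x uint32) uint32 {
	if x == 0 {
		return 32
	}
	return 31 - bsr32(x)
}

func main() {
	for _, x := range []uint32{1, 0x80, 0x8000_0000} {
		fmt.Printf("x=%#x lzcnt=%d bsr=%d clzViaBSR=%d\n",
			x, lzcnt32(x), bsr32(x), clzViaBSR32(x))
	}
}
```

For any nonzero input the two instructions give different numbers (they are related by `31 - bsr`), so code that assumes LZCNT but silently gets BSR would compute consistent but wrong results, which matches the "always the same output" symptom described in this issue.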
OK, so running:
git clone https://github.com/tetratelabs/wazero.git
cd wazero
make test
On Git Bash, with GNU make installed:
Multiple failures, but the big one seems to be related to
PS: The symlink one may affect others. This might have changed recently, but symlinks used to require admin privileges on Windows, so it's expected that they may fail. |
Nice, could you try this branch? https://github.com/tetratelabs/wazero/compare/fixwindows_clz_clt?expand=1 @ncruces |
Yes, that fixes the issue. The only failing tests after it are symlink ones:
And my use case works as well:
But singling out
My CPU is missing |
That condition seems flawed. |
#1112 fixes this everywhere I could test. It introduces a dependency though. The fast, safe, no deps fix is to never use |
I made another attempt on #1112 using a lighter dependency (just one function implemented in asm that, apparently, from the licence, originates within Go itself). With links to documentation for everything else. Feel free to close (#1112) at your convenience (just trying to offer guidance) and ping me to test something (since I have the affected machine). Thanks! |
Describe the bug
While developing github.com/ncruces/go-sqlite3 I found an issue where a unit test was breaking on my Windows machine (but not on GitHub CI). Basically, running the same code produces a different result on that computer, failing the test. I found that switching to the interpreter fixes the issue on that computer. On other platforms, on GitHub CI, and on a friend's Windows computer, I can't repro the bug. The bug is consistent (always the same output), and survived a computer reboot.
To Reproduce
On the affected computer:
git clone https://github.com/ncruces/wazero-sqlite3-bug.git
cd wazero-sqlite3-bug
go run ./cmd
Expected output:
Actual output in affected computer:
Environment (please complete the relevant information):
Additional context
The repo above runs the test in a matrix of OS versions (all the ones available on GitHub), and can't trigger the issue.
The command that runs is not actually the failing unit test. It's just an attempt at a simpler case. The tests that fail are these two. Float-to-text conversions have also gone bad. I've inspected the code that converts float to text in SQLite and nothing in particular stands out.
Not sure what to make of this…