Investigate non-deterministic failures with SIMD spec tests on the x64 backend #2432
Comments
Looks like the issue is related to
|
Looks like another instance of this: https://github.com/bytecodealliance/wasmtime/runs/1476300496?check_suite_focus=true#step:7:1883. It has multiple failures:
|
And another one reported by @alexcrichton in #2470 with |
@abrown Hi, thanks. It's hard to say whether the issue from 2 weeks ago and the one from 3 days ago are the same, but both involve floating point and both involve failures in tests that shouldn't have been impacted by the subject patch. I will look at the failures seen during merging here: https://github.com/bytecodealliance/wasmtime/runs/1476300496?check_suite_focus=true#step:7:1883. I'll try to reproduce them and also see what valgrind reports. |
For reference, previously |
FYI, I am trying to see if I can pinpoint the patch that caused the issue. Basically running:
In a loop like so:
where the cargo test command is in ./run_test.sh. It seems the failure usually occurs within 100 iterations but can take as long as 200-300. |
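The loop described above can be sketched roughly as follows. This is only an illustration, not the script used in this thread: the helper name `first_failure` and the iteration cap of 300 are assumptions, and the only detail taken from the comment is that the test command lives in `./run_test.sh`.

```rust
use std::process::Command;

/// Calls `attempt` up to `max_iters` times and returns the 1-based index of
/// the first iteration that reports failure (`false`), or `None` if all pass.
/// (Hypothetical helper, named here for illustration only.)
fn first_failure<F: FnMut() -> bool>(max_iters: usize, mut attempt: F) -> Option<usize> {
    (1..=max_iters).find(|_| !attempt())
}

fn main() {
    // Assumption: ./run_test.sh exits non-zero when the wrapped test fails.
    // A script that cannot be launched at all is also counted as a failure.
    let outcome = first_failure(300, || {
        Command::new("./run_test.sh")
            .status()
            .map(|status| status.success())
            .unwrap_or(false)
    });

    match outcome {
        Some(i) => println!("first failure on iteration {i}"),
        None => println!("no failure in 300 iterations"),
    }
}
```

Stopping at the first failure keeps the failing run's output at the end of the log, which helps when the failure only shows up once in a few hundred iterations.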
That's some progress! |
@jlb6740 Can you extract from that the actual non-shell-script program invocation that `./run_test.sh` performs and that is failing, and then run it under valgrind to see if something is reading uninitialised memory? |
I've been attempting to both (1) isolate the patch where the issue starts and (2) simplify the reproducer, but both are proving more difficult than anticipated. Some observations: With respect to pinpointing the patch that introduced the issue, I am finding it impossible to reproduce the issue with the script above on any commit before #2365. However, I don't think that patch introduced the bug, since it was seen as early as October here: https://github.com/bytecodealliance/wasmtime/runs/1323686030 With respect to simplifying the reproducer, I am finding that the error almost always occurs in the same tests involving negation or abs, such as: (assert_return (invoke "add-neg" (v128.const f64x2 1.125 1.125) or (assert_return (invoke "f64x2.abs" (v128.const f64x2 0x1.fffffffffffffp+1023 0x1.fffffffffffffp+1023)) But when I isolate just these tests and run them, the problem is not reproduced. It is hard for me to tell whether the issue is in the lowering or somewhere in the wast testing path. I did run valgrind against a277cf, for example, and there is a conspicuous error:
Which I think actually involves these lines here: https://github.com/bytecodealliance/wasmtime/blob/main/crates/wast/src/wast.rs#L507-L514 in head. It is not obvious to me yet how to correct this, though. I think this bug should be understood and removed as a next step. |
Just to be on the safe side .. is this with valgrind 3.16.0 or later? That has the lowest false-positive rate so far; I would recommend against any investigation based on output from a version before 3.16.0. |
So I was nerdsniped by this and chased it for a while -- with valgrind 3.16.1 (Fedora 33) I'm seeing the uninit value in the
So what I can gather from that is that we're reading uninitialized stack somewhere. I spent a while staring at the generated trampoline functions and am not seeing anything obviously wrong, but is it possible that maybe we're calling the wrong trampoline (too many/too few args or returns)? Another oddity I noticed: there is no |
That reduces the chance that it's a false positive. The clincher would however be if you can demonstrate that multiple runs print different values at that point. |
To confirm @jlb6740, are you testing recent master? #2365 had a use-after-free bug if modules/stores were deallocated in the wrong order (later fixed, though). That might explain why valgrind finds issues just after that commit. As for |
@julian-seward1 I have valgrind-3.15.0 installed. I will update to 3.16 or higher. @alexcrichton the valgrind run is based on commit a277cf, which was the last time I could readily reproduce. I've tried later patches, though, up to the point where @cfallin disabled the tests, and could reproduce up until then. Let me rerun the latest valgrind on the latest build with these tests enabled. |
Ah ok in that case I'm not entirely sure what would be causing this unfortunately :( |
@alexcrichton yes, it just needs more investigation. I've confirmed that the use of uninitialized values still occurs with the latest build (e09b940) with valgrind 3.16.1
|
… not being NaN An intermittent failure during SIMD spec tests is described in bytecodealliance#2432. This patch corrects code written in a way that assumes comparing a register with itself for floating-point equality will always return true. This is not true when the register value is NaN, as NaN never compares equal to itself. In this case, as with all ordered comparisons involving NaN, the comparison always returns false. This patch corrects that assumption for SIMD Fabs and Fneg, which seem to be the only instructions generating the failure with bytecodealliance#2432.
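To make the NaN point concrete, here is a small scalar emulation of a two-lane mask built by comparing a value with itself. It is only a sketch under stated assumptions: the function names `fp_eq_mask` and `bits_eq_mask` are hypothetical, the lane contents are made up for illustration, and this is not the code from the Cranelift lowering or from the patch. It only shows why a floating-point self-comparison cannot be relied on to produce an all-ones mask, whereas a comparison of the raw bits can.

```rust
/// Lane-wise floating-point equality: all-ones where a[i] == b[i], else zero.
/// (Hypothetical helper emulating an ordered "equal" SIMD comparison.)
fn fp_eq_mask(a: [f64; 2], b: [f64; 2]) -> [u64; 2] {
    std::array::from_fn(|i| if a[i] == b[i] { u64::MAX } else { 0 })
}

/// Lane-wise equality on the raw bit patterns, which ignores NaN semantics.
/// (Hypothetical helper emulating an integer "equal" SIMD comparison.)
fn bits_eq_mask(a: [f64; 2], b: [f64; 2]) -> [u64; 2] {
    std::array::from_fn(|i| if a[i].to_bits() == b[i].to_bits() { u64::MAX } else { 0 })
}

fn main() {
    // Assumption: the register holds leftover data, and one lane happens to be NaN.
    let leftover = [f64::NAN, 1.125];

    // Comparing the value with itself as floating point leaves a hole in the mask.
    println!("fp self-compare:   {:x?}", fp_eq_mask(leftover, leftover));

    // Comparing the raw bits with themselves is always all-ones.
    println!("bits self-compare: {:x?}", bits_eq_mask(leftover, leftover));
}
```

Because the result depends on whatever bits happen to be in the register, a mask built with the floating-point form would only be wrong on some runs, which is consistent with the intermittent failures reported in this issue.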
…ecodealliance#2470. Enable new tests Re-enables spec tests that were turned off for bytecodealliance#2432 and bytecodealliance#2470 while also enabling tests that now work due to patches pushed in the interim. Currently all SIMD spec tests pass. Testing to ensure this is OK to enable hasn't been super intense, so we should monitor, but the test suite was run 1000 times on 3 separate occasions to try to reproduce the issue and it did not occur. In the past it would have occurred several times with that many runs.
@jlb6740, I am opening this issue to track a failure I noticed with x64 SIMD: https://github.com/bytecodealliance/wasmtime/pull/2428/checks?check_run_id=1421136203#step:7:1582