Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AOT-ReleaseX64 crash divergence #35335

Closed
aartbik opened this issue Dec 6, 2018 · 11 comments
Closed

AOT-ReleaseX64 crash divergence #35335

aartbik opened this issue Dec 6, 2018 · 11 comments
Assignees
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. dartfuzz Found with Dart fuzzing (DartFuzz, libFuzzer, etc.)

Comments

@aartbik
Copy link
Contributor

aartbik commented Dec 6, 2018

Isolate (/b/s/w/ir/tmp/t/dart_fuzzYMYQOP) AOT-ReleaseX64 - JIT-ReleaseIA32: !DIVERGENCE! 1.2:2915825161 (output=false)

fail1:
../../runtime/vm/compiler/aot/precompiler.cc: 2312: error: unreachable code
thread=27829, isolate=isolate(0x56499373c900)
[0x000056499136478c] Unknown symbol
[0x000056499136478c] Unknown symbol
[0x0000564991565872] Unknown symbol
[0x000056499141a1e8] Unknown symbol
[0x000056499141b8fa] Unknown symbol
[0x0000564991416c25] Unknown symbol
[0x0000564991415647] Unknown symbol
[0x00005649914107b4] Unknown symbol
[0x000056499140e273] Unknown symbol
[0x000056499140e135] Unknown symbol
[0x0000564991562f9a] Unknown symbol
[0x00005649911a016e] Unknown symbol
-- End of DumpStackTrace

@aartbik aartbik added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. dartfuzz Found with Dart fuzzing (DartFuzz, libFuzzer, etc.) labels Dec 6, 2018
@aartbik aartbik self-assigned this Dec 6, 2018
@aartbik
Copy link
Contributor Author

aartbik commented Dec 6, 2018

Debug AOT crashes as:

flow_graph_compiler.cc: 951: error: expected: speculative_policy_->AllowsSpeculativeInlining()


Label* FlowGraphCompiler::AddDeoptStub(intptr_t deopt_id,
                                       ICData::DeoptReasonId reason,
                                       uint32_t flags) {
....
    ASSERT(speculative_policy_->AllowsSpeculativeInlining());
....
}

while release AOT hits the error above:

runtime/vm/compiler/aot/precompiler.cc: 2333: error: unreachable code


bool PrecompileParsedFunctionHelper::Compile(CompilationPipeline* pipeline) {
....
       if (!speculative_policy.AllowsSpeculativeInlining()) {
          // Assert that we don't repeatedly retry speculation.
          UNREACHABLE();
        }
....
}

@aartbik
Copy link
Contributor Author

aartbik commented Dec 6, 2018

Bit more details:

Retrying compilation file:///usr/local/google/home/ajcbik/Repo/DART/fuzz.dart_::foo3,
suppressing inlining of deopt_id:2394
../../runtime/vm/compiler/backend/flow_graph_compiler.cc: 947:
error: expected: speculative_policy
->AllowsSpeculativeInlining()

which in both the first attempt and the retry is due to the following (even though temp index changes):

v9346 <- Unbox:2394(v0 T{Null}) T{_Double}

Introduced by LICM:

Before:

B267[target]:2014
    v9328 <- UnboxInt64(v2198 T{int})
    v2304 <- BinaryInt64Op(% [tr], v9328 T{int}, v2188 T{int}) T{int}
    goto:2394 B321
B321[join]:2392 pred(B267, B320) {
      v2308 <- phi(v13, v9382 T{int}) alive T{int}
}
....
B319[target]:2384
    v3979 <- Redefinition(v5942 T{_Double?}) T{_Double}
    v9346 <- Unbox(v8916)                 <--- can not deopt!
    goto:2390 B320

After:

B267[target]:2014
    v9328 <- UnboxInt64(v2198 T{int})
    v2304 <- BinaryInt64Op(% [tr], v9328 T{int}, v2188 T{int}) T{int}
    v9634 <- UnboxedConstant:2394(#10)
    v9636 <- UnboxedConstant:2394(#20)
    v9638 <- UnboxedConstant:2394(#4)
    v9346 <- Unbox:2394(v0 T{Null})                       <---- can deopt!
    v9640 <- UnboxedConstant:2394(#5e-324)
    goto:2394 B321

@aartbik
Copy link
Contributor Author

aartbik commented Dec 6, 2018

The problem is that the CanDeoptimize() property changes due to hoisting. The unwritten contract, of course, is that this does not change.

about to hoist:
 v9346 <- Unbox(v8916)         can_deopt=F env=null
after hoist:
 v9346 <- Unbox:2394(v8916)    can_deopt=T env=non-null

This is due to copying a non-empty environment into an unbox that formerly had no environment. Hence the deopt property changes (due to the way CanDeoptimize() is implemented). In most typical cases, the CanDeoptimize() property does not change, even when the environment changes. So it seems that either a stricter test is needed or the instruction should have had an env to start with!

This particular unbox is generated during representation selection, using FlowGraph::InsertConversion(). Perhaps we miss copying an env there?

@aartbik
Copy link
Contributor Author

aartbik commented Dec 10, 2018

New nightly run has several failures that are probably very related (now the CanDeoptimize() assert is violated for an UnboxInstr, somewhat similar to above). So adding one failure here to make sure we verify this once fixed.

JIT-DebugSIMARM - AOT-DebugX64: !DIVERGENCE! 1.2:3494344839 (output=false)

fail2:
../../runtime/vm/compiler/backend/il.cc: 4617: error: expected: CanDeoptimize()

This seems indeed reflective of the same problem. The representation selection pass introduces

v5337 <- Unbox(v5135)

which may deoptimize, but has no environment, and thus returns false for CanDeoptimize().

@aartbik
Copy link
Contributor Author

aartbik commented Dec 12, 2018

Note, continuing nightly failures with similar error messages. This bug really needs fixing!

@mraleph
Copy link
Member

mraleph commented Dec 13, 2018

I have looked at the graph.

This happens in the unreachable dead code: we have

B266[join]:64 pred(B1505, B1506)
    CheckNull:2004(v0)
    v8928 <- Redefinition(v0) T{Null}

The type of this redefinition is non-Nullable Null - empty set - this should be impossible.

Then this flows into

B320[join]:2386 pred(B318, B319) {
      v2332 <- phi(v729, v8928) alive T{_Double}
}

note that phi was inferred to have T{_Double} because v729 is _Double and v8928 is essentially impossible type.

I think we need to be aggressively deleting impossible code like this.

Alternative fix is to make sure that type propagation does not produce impossible types like this:

diff --git a/runtime/vm/compiler/backend/type_propagator.cc b/runtime/vm/compiler/backend/type_propagator.cc
index c587626f12..791cba021f 100644
--- a/runtime/vm/compiler/backend/type_propagator.cc
+++ b/runtime/vm/compiler/backend/type_propagator.cc
@@ -283,7 +283,7 @@ void FlowGraphTypePropagator::VisitCheckClassId(CheckClassIdInstr* check) {
 void FlowGraphTypePropagator::VisitCheckNull(CheckNullInstr* check) {
   Definition* receiver = check->value()->definition();
   CompileType* type = TypeOf(receiver);
-  if (type->is_nullable()) {
+  if (type->is_nullable() && !type->IsNull()) {
     // Insert redefinition for the receiver to guard against invalid
     // code motion.
     EnsureMoreAccurateRedefinition(check, receiver, type->CopyNonNullable());
@@ -305,7 +305,7 @@ void FlowGraphTypePropagator::CheckNonNullSelector(
   if (target.IsNull()) {
     // If the selector is not defined on Null, we can propagate non-nullness.
     CompileType* type = TypeOf(receiver);
-    if (type->is_nullable()) {
+    if (type->is_nullable() && !type->IsNull()) {
       // Insert redefinition for the receiver to guard against invalid
       // code motion.
       EnsureMoreAccurateRedefinition(call, receiver, type->CopyNonNullable());

@aartbik
Copy link
Contributor Author

aartbik commented Dec 13, 2018

Seeds to check after fix (it is always the AOT one that crashes):

AOT-ReleaseX64 - KBC-CMP-ReleaseX64: !DIVERGENCE! 1.2:3060628533 (output=false)
JIT-MARKSWEEPEVERY-DebugSIMDBC - AOT-ReleaseX64: !DIVERGENCE! 1.2:3495203332 (output=false)
JIT-NOFIELDGUARDS-ReleaseSIMDBC64 - AOT-ReleaseX64: !DIVERGENCE! 1.2:1470560046 (output=false)
AOT-ReleaseX64 - JIT-ReleaseSIMARM: !DIVERGENCE! 1.2:1253864548 (output=false)

dart-bot pushed a commit that referenced this issue Dec 13, 2018
Rationale:
Reporting check-class instead of the proper unbox
yields an incorrect compiler diagnostic, which may
obscure subsequent debugging.

#35335

Change-Id: I99358ef3432e77b4432d9a96755747eaadc27067
Reviewed-on: https://dart-review.googlesource.com/c/87161
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
Reviewed-by: Aart Bik <ajcbik@google.com>
Commit-Queue: Aart Bik <ajcbik@google.com>
@aartbik
Copy link
Contributor Author

aartbik commented Dec 13, 2018

Slava's type prop change fixes all known seeds so far related to AOT crashes:

AOT fuzz1253864548.dart
AOT fuzz1470560046.dart
AOT fuzz1907435290.dart
AOT fuzz2915825161.dart
AOT fuzz3060628533.dart
AOT fuzz307320705.dart
AOT fuzz3321921762.dart
AOT fuzz3494344839.dart
AOT fuzz3495203332.dart
AOT fuzz925136549.dart

dart-bot pushed a commit that referenced this issue Dec 13, 2018
Rationale:
Having a literal null checked by CheckNull with
a subsequent Redefinition resulted in some strange
unboxing that crashed AOT (fix courtesy Slava!).
Note that we still have some ambiguity around
adding/removing environments from instructions
that may deoptimize, but this change fixes all
prior related DartFuzz failures.

#35335

Change-Id: Ifb50d8cddf93e57758b2bbb83ad397ea281e9307
Reviewed-on: https://dart-review.googlesource.com/c/87280
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
Commit-Queue: Aart Bik <ajcbik@google.com>
@aartbik
Copy link
Contributor Author

aartbik commented Dec 13, 2018

\O/

@aartbik aartbik closed this as completed Dec 13, 2018
@aartbik
Copy link
Contributor Author

aartbik commented Jan 16, 2019

Note that simply ignoring null may not have been the right solution. We should remove the following dead code, just to make sure it is never hoisted before the check.

@aartbik aartbik reopened this Jan 16, 2019
@mraleph
Copy link
Member

mraleph commented Jan 16, 2019

As a temporary solution we could still create a redefinition after the CheckNull just not set it's type to impossible type.

dart-bot pushed a commit that referenced this issue Jan 17, 2019
Rationale:
Previously, we avoided introducing redefinitions that
introduced the empty non-nullable null type. This situation
arises when we do a null check on an actual null value
(making all subsequent uses effectively dead code). This
is too simple, however, since it still allows hoisting the
uses before the check. This CL gives a better solution
by introducing redefinitions without a constraining type
(which are not removed and avoid the type). In the long
run perhaps the best solution would be to simply remove
all subsequent uses as dead.

#32167
#34473
#35335

Change-Id: Ib5dd072a9e546f6b91faa52ea08e8c0f6350d7e0
Reviewed-on: https://dart-review.googlesource.com/c/89922
Reviewed-by: Alexander Markov <alexmarkov@google.com>
Commit-Queue: Aart Bik <ajcbik@google.com>
@aartbik aartbik closed this as completed Jan 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. dartfuzz Found with Dart fuzzing (DartFuzz, libFuzzer, etc.)
Projects
None yet
Development

No branches or pull requests

2 participants