Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

If Z3 can't be run, Dafny sometimes hangs instead of reporting an error #2370

Closed
mattmccutchen-amazon opened this issue Jul 6, 2022 · 18 comments · Fixed by #3233
Closed
Assignees
Labels
area: error-reporting Clarity of the error reporting area: new users Getting new users started kind: bug Crashes, unsoundness, incorrect output, etc. If possible, add a `part:` label part: boogie Happens after passing the program to Boogie

Comments

@mattmccutchen-amazon
Copy link

In some cases in which Dafny can't find Z3, Dafny hangs instead of reporting the expected error "Cannot find any prover executable". This can make it confusing for the user to troubleshoot the problem. Failure to find Z3 isn't likely to happen when using a binary release of Dafny that has Z3 bundled at the right path, but it can happen when building Dafny from source or using a custom build system.

Steps to reproduce (tested on my Linux machine):

  1. Download and extract https://github.com/dafny-lang/dafny/releases/download/v3.7.1/dafny-3.7.1-x64-ubuntu-16.04.zip.
  2. Delete the dafny/z3 directory.
  3. Save the example code below to hang-test.dfy.
  4. Run: ./dafny/dafny /compile:0 hang-test.dfy

Example code:

lemma Dummy1(x: int)
  requires x == 1
  ensures x + 1 == 2 {}

lemma Dummy2(x: int)
  requires x == 1
  ensures x + 1 == 2 {}

lemma Dummy3(x: int)
  requires x == 1
  ensures x + 1 == 2 {}

Expected behavior: Dafny exits with an error message, "Cannot find any prover executable".

Actual behavior: Dafny hangs.

Some further testing I did suggests that the hang happens when the number of methods to be verified is greater than the value of the /vcsCores option, which defaults to 1. With the example code above (which has 3 methods), I observe:

  • ./dafny/dafny /compile:0 /vcsCores:3 hang-test.dfy -> Cannot find any prover executable (within a few seconds)
  • ./dafny/dafny /compile:0 /vcsCores:2 hang-test.dfy -> hang

So this may be some kind of deadlock in a thread pool.

@mattmccutchen-amazon
Copy link
Author

I just noticed #1914. The two issues may have the same root cause, in which case it probably makes sense to broaden one of them and mark the other as a duplicate.

@atomb atomb added kind: bug Crashes, unsoundness, incorrect output, etc. If possible, add a `part:` label part: boogie Happens after passing the program to Boogie area: new users Getting new users started area: error-reporting Clarity of the error reporting labels Sep 6, 2022
@robin-aws
Copy link
Member

I just noticed #1914. The two issues may have the same root cause, in which case it probably makes sense to broaden one of them and mark the other as a duplicate.

I suspect they will need separate fixes, since the CLI should print out an error but the language server will need to provide a different notification. Perfectly fine for one PR to fix both though. :)

@atomb atomb self-assigned this Dec 13, 2022
@keyboardDrummer
Copy link
Member

I've had Dafny verification hang when it passed incorrect parameters to Z3, which happened because it passed parameters for Z3 4.8.5, while it was using Z3 4.8.8

@MikaelMayer
Copy link
Member

How long ago did you have that issue @keyboardDrummer ?

@keyboardDrummer
Copy link
Member

Today

@MikaelMayer
Copy link
Member

CLI or language server? If CLI, what was the command line you ran?

@keyboardDrummer
Copy link
Member

The language server, it passed O:model_compress=false to Z3 4.8.8 which then reports a solver error because it expects O:model.compact=false, but Dafny keeps waiting on a Boogie Task that never completes.

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Dec 14, 2022

The language server is supposed to have code to send the right options to Z3, the method HandleZ3Version, but apparently that wasn't working. I had trouble attaching a debugger in time because that code runs on startup.

@MikaelMayer
Copy link
Member

Oh that's concerning.
To attach a debugger, I found that you can add Thread.sleep(15000) at the beginning of Program, which gives you 15 seconds to attach the debugger.

atomb added a commit to atomb/dafny that referenced this issue Dec 20, 2022
* Be more explicit about the path of the executable to invoke.
* Identify the version and set different Z3 options accordingly.

Fixes dafny-lang#2370
Enables a fix for dafny-lang#1477
Interacts with dafny-lang#1914 (which may already be fixed)
atomb added a commit that referenced this issue Jan 9, 2023
* Be more explicit about the path of the executable to invoke.
* Identify the version and set different Z3 options accordingly.

Fixes #2370
Enables a fix for #1477
Interacts with #1914 (which may already be fixed)

By submitting this pull request, I confirm that my contribution is made under
the terms of the MIT license.
atomb added a commit to atomb/dafny that referenced this issue Jan 10, 2023
* Be more explicit about the path of the executable to invoke.
* Identify the version and set different Z3 options accordingly.

Fixes dafny-lang#2370
Enables a fix for dafny-lang#1477
Interacts with dafny-lang#1914 (which may already be fixed)

By submitting this pull request, I confirm that my contribution is made under
the terms of the MIT license.
davidcok pushed a commit to davidcok/dafny that referenced this issue Jan 20, 2023
* Be more explicit about the path of the executable to invoke.
* Identify the version and set different Z3 options accordingly.

Fixes dafny-lang#2370
Enables a fix for dafny-lang#1477
Interacts with dafny-lang#1914 (which may already be fixed)

By submitting this pull request, I confirm that my contribution is made under
the terms of the MIT license.
@atomb
Copy link
Member

atomb commented Apr 11, 2023

This still seems to occur sometimes. We should more thoroughly check the case where a z3 binary exists but can't be executed for some reason (being an incompatible version, having missing shared library dependencies, etc.).

@atomb atomb reopened this Apr 11, 2023
@alex-chew
Copy link
Contributor

Do we currently run any kind of "smoke test" against the specified solver binary, to check that the binary is available and provides a known result, before using it? I see that we run z3 -version but fail silently if we don't get anything back ("might be another solver").

@atomb
Copy link
Member

atomb commented Apr 11, 2023

Do we currently run any kind of "smoke test" against the specified solver binary, to check that the binary is available and provides a known result, before using it? I see that we run z3 -version but fail silently if we don't get anything back ("might be another solver").

Not at the moment. I think it might be good to do that. And I think perhaps we should fail if we don't get anything back from z3 -version, since that should always produce something. I think perhaps we should only support other solvers (e.g., CVC5) if they're explicitly specified with a path to the executable, since using them is very experimental.

@mattmccutchen-amazon
Copy link
Author

Stepping back: How did a failure to run the solver cause Dafny to hang in the first place? Can we change that hang to an error rather than relying solely on up-front smoke tests? No matter how many smoke tests we add, we may overlook some unusual scenario that may affect some user.

@atomb
Copy link
Member

atomb commented Apr 11, 2023

I've looked into this a little, and it's unfortunately a bit tricky to do. From what I can tell, the .NET Process class that we use to invoke the solver hangs if you invoke it with an executable that doesn't exist or can't be executed for some other reason. I would have expected it to throw an exception, but it doesn't seem to. I suspect there's a way to make it work, but it's not as straightforward as it should be.

@mattmccutchen-amazon mattmccutchen-amazon changed the title If Z3 isn't found, Dafny sometimes hangs instead of reporting an error If Z3 can't be run, Dafny sometimes hangs instead of reporting an error Apr 17, 2023
@mattmccutchen-amazon
Copy link
Author

We should more thoroughly check the case where a z3 binary exists but can't be executed for some reason

I updated the title accordingly. The case I'm interested in right now is having broken shared library dependencies since I've run into it in my work at Amazon.

From what I can tell, the .NET Process class that we use to invoke the solver hangs if you invoke it with an executable that doesn't exist or can't be executed for some other reason.

I find it hard to believe that such an important, widely used class would have that kind of fundamental bug; it seems much more likely to me that the problem is elsewhere. So out of curiosity, I debugged Dafny in my current case where Z3 has broken shared library dependencies, and I think I found the root cause of the hang in that case. This line propagates an IOException without releasing checkersSemaphore. Here's the stack trace in case it's of interest:

Broken pipe
   at System.IO.Pipes.PipeStream.WriteCore(ReadOnlySpan`1 buffer)
   at System.IO.Pipes.PipeStream.Write(ReadOnlySpan`1 buffer)
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.IO.StreamWriter.WriteLine(String value)
   at Microsoft.Boogie.SMTLib.SMTLibProcess.Send(String cmd) in REDACTED/boogie/Source/Provers/SMTLib/SMTLibProcess.cs:line 131
   at Microsoft.Boogie.SMTLib.SMTLibInteractiveTheoremProver.Send(String s, Boolean isCommon) in REDACTED/boogie/Source/Provers/SMTLib/SMTLibInteractiveTheoremProver.cs:line 664
   at Microsoft.Boogie.SMTLib.SMTLibProcessTheoremProver.SendThisVC(String s) in REDACTED/boogie/Source/Provers/SMTLib/SMTLibProcessTheoremProver.cs:line 165
   at Microsoft.Boogie.SMTLib.SMTLibInteractiveTheoremProver.FullReset(VCExpressionGenerator generator) in REDACTED/boogie/Source/Provers/SMTLib/SMTLibInteractiveTheoremProver.cs:line 167
   at Microsoft.Boogie.Checker.Target(Program prog, ProverContext ctx, Split split) in REDACTED/boogie/Source/VCGeneration/Checker.cs:line 140
   at VC.CheckerPool.PrepareChecker(Program program, Split split, Checker checker) in REDACTED/boogie/Source/VCGeneration/CheckerPool.cs:line 75
   at VC.CheckerPool.FindCheckerFor(ConditionGeneration vcgen, Split split, CancellationToken cancellationToken) in REDACTED/boogie/Source/VCGeneration/CheckerPool.cs:line 36

The IOException is correctly handled here, but Dafny goes on to verify the next method without checkersSemaphore ever being released. checkersSemaphore is initialized to vcsCores, so if there are more than vcsCores methods, verification of the (vcsCores + 1)-th method will hang waiting for the semaphore, explaining the behavior I saw.

Note: When I tried the new command line format (dafny verify hang-test.dfy), at first it appeared that the problem did not occur, but I was able to trigger it by increasing the number of methods in the input. It seems that dafny verify has a higher default value of vcsCores than the legacy dafny /compile:0.

If I make the following change to the Boogie code, then Dafny no longer hangs:

diff --git a/Source/VCGeneration/CheckerPool.cs b/Source/VCGeneration/CheckerPool.cs
index e4c45182..d51fef17 100644
--- a/Source/VCGeneration/CheckerPool.cs
+++ b/Source/VCGeneration/CheckerPool.cs
@@ -32,8 +32,13 @@ public async Task<Checker> FindCheckerFor(ConditionGeneration vcgen, Split? spli
         checker ??= CreateNewChecker();
       }
 
-      PrepareChecker(vcgen.program, split, checker);
-      return checker;
+      try {
+        PrepareChecker(vcgen.program, split, checker);
+        return checker;
+      } catch (System.IO.IOException) {
+        checkersSemaphore.Release();
+        throw;
+      }
     }
 
     private int createdCheckers;

I'm sharing this as a proof of concept. There may be any number of reasons it may not be the best fix. For example, maybe we should call checker.Close() in the catch block. And it would probably need tests, etc. My testing was based on Boogie v2.16.4 because that's the version that Dafny uses and when I tried to build Dafny against the latest Boogie, there were compile errors. But it looks like the relevant code hasn't changed in the latest Boogie.

Again, this is for the case where Z3 has broken shared library dependencies. I don't know whether there are other ways in which Z3 could fail that would cause the Dafny/Boogie code to fail in a different place.

@atomb
Copy link
Member

atomb commented Apr 18, 2023

Thanks for digging so deeply into debugging the issue! That should be a straightforward fix to apply.

@atomb
Copy link
Member

atomb commented Apr 26, 2023

And @keyboardDrummer ended up fixing it: boogie-org/boogie#719

I'll leave this open until we've updated Dafny to depend on that Boogie version and tested it from the Dafny side, though.

@atomb
Copy link
Member

atomb commented Jan 8, 2024

The Boogie fix was incorporated in 6675361. The problem doesn't appear to exist anymore, in the tests I've tried.

@atomb atomb closed this as completed Jan 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: error-reporting Clarity of the error reporting area: new users Getting new users started kind: bug Crashes, unsoundness, incorrect output, etc. If possible, add a `part:` label part: boogie Happens after passing the program to Boogie
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants