Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solver result caching #1873

Merged
merged 53 commits into from
Jul 27, 2023
Merged
Show file tree
Hide file tree
Changes from 49 commits
Commits
Show all changes
53 commits
Select commit Hold shift + click to select a range
bf22047
quick and dirty solver caching
m-yac May 8, 2023
e6a4513
move files, fix cache loading
m-yac May 8, 2023
d648970
add --cache command-line option
m-yac May 8, 2023
92aee03
save SolverStats when caching solver results
m-yac May 9, 2023
70b9be0
use strings as solver result cache keys since Terms have non-unique h…
m-yac May 9, 2023
072cf1a
save solver result cache on :q
m-yac May 9, 2023
d870ea6
use sha256 for solver result caching, save cache after each REPL command
m-yac May 10, 2023
f05bc40
remove 'x' prefix from solver result cache files
m-yac May 10, 2023
cd5cc1b
improve hash function for SolverCacheKey, reorganize SolverCache.hs
m-yac May 10, 2023
e5dcdda
load and save cached results only when needed
m-yac May 10, 2023
63eed88
add print_solver_cache_stats
m-yac May 10, 2023
9fb0c6d
use prettyprinting functions for showing FiniteValues and FirstOrderV…
m-yac May 11, 2023
37be6ca
factor `sequentToSATQuery` out of SMT backends, cache counterexamples
m-yac May 16, 2023
7e264c5
anonymize local names in satQueryToSolverCacheKey
m-yac May 17, 2023
3cc4d1e
make SolverCache an IORef in TopLevel so entries persist after failure
m-yac May 18, 2023
084dd14
add documentation to SolverCache.hs
m-yac May 19, 2023
fcef4ec
include solver backend versions in solver caching
m-yac May 20, 2023
e5ac59d
use LMDB for solver result caching
m-yac May 23, 2023
7de326a
use CI scripts from input-output-hk/haskell-lmdb
m-yac May 24, 2023
b3b46f8
remove cp of lmdb pkgconfig, add `lib` and `include` to PATH for Windows
m-yac May 24, 2023
03eec2e
cleanup, revert most of 9fb0c6d5b1da9165d425750f443307ca97979b4a
m-yac May 24, 2023
751f54d
get lmdb-simple working with GHC 9.2.7
m-yac May 24, 2023
0719bda
add clean_solver_cache command
m-yac May 24, 2023
dec00ff
try to get LMDB on Windows working on CI
m-yac May 24, 2023
3a77eb8
use Python LMDB bindings for solver caching (TODO docs)
m-yac Jun 20, 2023
c877623
quick clean up to solver caching code
m-yac Jun 20, 2023
d7ae967
replace solver caching CLI option with an env variable
m-yac Jun 20, 2023
9eae671
add docs to solver caching code
m-yac Jun 20, 2023
c36cb0d
add test for solver result caching
m-yac Jun 21, 2023
f173acb
expand tests for solver caching
m-yac Jun 21, 2023
bfb1b79
actually install lmdb on the CI for solver caching tests
m-yac Jun 21, 2023
2354bb2
set -e in test_solver_cache
m-yac Jun 22, 2023
16cfe0e
WIP testing saving LMDB caches on CI
m-yac Jun 22, 2023
897f8ff
WIP more CI debugging
m-yac Jun 27, 2023
c040ab5
remove hex-text dependency, confirm no exns with units in LMDBOptData…
m-yac Jun 27, 2023
c68801d
get SBV version through cabal VERSION macro, not freeze file
m-yac Jun 27, 2023
964219c
fix style issues / typos found by @RyanGlScott
m-yac Jun 27, 2023
b3575ea
add forgotten base case to decodeHex
m-yac Jun 27, 2023
6a5ec67
pip install lmdb in test_solver_cache/test.sh
m-yac Jun 27, 2023
0e2efb2
improve test_solver_cache_stats args, clean up debugging code, add docs
m-yac Jun 28, 2023
bbdfdf0
update the SAW manual with information about solver caching
m-yac Jun 28, 2023
7721cfb
fix/improve getSolverVersion for yices and others
m-yac Jul 18, 2023
fa146c8
solver caching: use lmdb-simple, make env var lazy, save timestamps
m-yac Jul 21, 2023
c426c71
move python interface for SAW solver caches to saw-remote-api
m-yac Jul 25, 2023
aea7c08
update docs for solver caching
m-yac Jul 25, 2023
b29790c
update solver cache tests, fix but in clean_solver_cache
m-yac Jul 25, 2023
9bd0bd3
Merge remote-tracking branch 'origin/master' into solver-caching
m-yac Jul 25, 2023
5f2f2f9
update solver cache testing on CI
m-yac Jul 25, 2023
c28e340
improve layout of docstrings for solver_cache_stats commands
m-yac Jul 25, 2023
bc446cf
make SolverCacheOp a datatype for clarity
m-yac Jul 26, 2023
2fde0df
Merge remote-tracking branch 'origin/master' into solver-caching
m-yac Jul 26, 2023
9c5c394
resolve some comments from @RyanGlScott
m-yac Jul 26, 2023
a960c36
move JSON instances and add associated docs to FiniteValue.hs
m-yac Jul 26, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .gitmodules
Original file line number Diff line number Diff line change
Expand Up @@ -46,3 +46,6 @@
[submodule "deps/language-sally"]
path = deps/language-sally
url = https://github.com/GaloisInc/language-sally
[submodule "deps/lmdb"]
path = deps/lmdb
url = https://github.com/GaloisInc/lmdb.git
32 changes: 22 additions & 10 deletions Setup.hs
Original file line number Diff line number Diff line change
@@ -1,11 +1,12 @@
import Control.Exception
import Control.Monad (unless)
import Data.List (find)
import Distribution.Simple
import Distribution.Simple.BuildPaths (autogenPackageModulesDir)
import Distribution.PackageDescription (emptyHookedBuildInfo, allBuildInfo)
import System.Directory (createDirectoryIfMissing, findExecutable)
import System.Directory
import System.FilePath ((</>))
import System.Process (readProcess)
import System.Process (readProcess, callProcess)
import System.Exit

main = defaultMainWithHooks myHooks
Expand All @@ -17,20 +18,31 @@ myBuild pd lbi uh flags = do

hasGit <- findExecutable "git"

let gitfailure :: SomeException -> IO String
gitfailure _e = return "<non-dev-build> "
let gitfailure :: a -> SomeException -> IO a
gitfailure a _e = return a

desc <- case hasGit of
Just git -> readProcess "git" ["describe", "--always", "--dirty"] ""
`catch` gitfailure
Nothing -> return "<VCS-less build> "
let gitdescribe dir m on_no_exe on_fail = case hasGit of
Just exe -> withCurrentDirectory dir (m <$>
readProcess "git" ["describe", "--always", "--dirty"] "")
`catch` gitfailure on_fail
Nothing -> return on_no_exe

desc <- gitdescribe "." init "<VCS-less build>" "<non-dev-build>"
aig_desc <- gitdescribe "deps/aig" (Just . init) Nothing Nothing
w4_desc <- gitdescribe "deps/what4" (Just . init) Nothing Nothing

writeFile (dir </> "GitRev.hs") $ unlines
[ "module GitRev where"
, "-- | String describing the HEAD of saw-script at compile-time"
, "hash :: String"
, "hash = " ++ show (init desc)
, "hash = " ++ show desc
m-yac marked this conversation as resolved.
Show resolved Hide resolved
, "-- | String describing the HEAD of the deps/aig submodule at compile-time"
, "aigHash :: Maybe String"
, "aigHash = " ++ show aig_desc
, "-- | String describing the HEAD of the deps/what4 submodule at compile-time"
, "what4Hash :: Maybe String"
, "what4Hash = " ++ show w4_desc
]

unless (null $ allBuildInfo pd) $
(buildHook simpleUserHooks) pd lbi uh flags

2 changes: 2 additions & 0 deletions cabal.project
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,8 @@ packages:
deps/parameterized-utils
deps/flexdis86
deps/flexdis86/binary-symbols
deps/lmdb/lmdb
deps/lmdb/lmdb-simple
deps/macaw/base
deps/macaw/symbolic
deps/macaw/x86
Expand Down
1 change: 1 addition & 0 deletions deps/lmdb
Submodule lmdb added at d454c8
101 changes: 101 additions & 0 deletions doc/manual/manual.md
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,13 @@ SAW also uses several environment variables for configuration:
or the `PATH` environment variable is used, as SAW can use this information
to determine the location of the core Java libraries' `.jar` file.

`SAW_SOLVER_CACHE_PATH`

~ Specify a path at which to keep a cache of solver results obtained during
calls to certain tactics. A cache is not created at this path until it is
needed. See the section **Caching Solver Results** for more detail about this
feature.

On Windows, semicolon-delimited lists are used instead of colon-delimited
lists.

Expand Down Expand Up @@ -1244,6 +1251,100 @@ differences in how each library represents certain SMT queries. There are also
some experimental features that are only supported with What4 at the moment,
such as `enable_lax_loads_and_stores`.

## Caching Solver Results

SAW has the capability to cache the results of tactics which call out to
automated provers. This can save a considerable amount of time in cases such as
proof development and CI, where the same proof scripts are often run repeatedly
without changes.

This caching is available for all tactics which call out to automated provers
at runtime: `abc`, `boolector`, `cvc4`, `cvc5`, `mathsat`, `yices`, `z3`,
`rme`, and the family of `unint` tactics described in the previous section.

When solver caching is enabled and one of the tactics mentioned above is
encountered, if there is already an entry in the cache corresponding to the
call then the cached result is used, otherwise the appropriate solver is
queried, and the result saved to the cache. Entries are indexed by a SHA256
hash of the exact query to the solver (ignoring variable names), any options
passed to the solver, and the names and full version strings of all the solver
backends involved (e.g. ABC and SBV for the `abc` tactic). This ensures cached
results are only used when they would be identical to the result of actually
running the tactic.

The simplest way to enable solver caching is to set the environment variable
`SAW_SOLVER_CACHE_PATH`. With this environment variable set, `saw` and
`saw-remote-api` will automatically keep an [LMDB](http://www.lmdb.tech/doc/)
database at the given path containing the solver result cache. Setting this
environment variable globally therefore creates a global, concurrency-safe
solver result cache used by all newly created `saw` or `saw-remote-api`
processes. Note that when this environment variable is set, SAW does not create
a cache at the specified path until it is actually needed.

There are also a number of SAW commands related to solver caching.

* `set_solver_cache_path` is like setting `SAW_SOLVER_CACHE_PATH` for the
remainder of the current session, but opens an LMDB database at the specified
path immediately. If a cache is already in use in the current session
(i.e. through a prior call to `set_solver_cache_path` or through
`SAW_SOLVER_CACHE_PATH` being set and the cache being used at least once)
then all entries in the cache already in use will be copied to the new cache
being opened.

* `clean_solver_cache` will remove all entries in the solver result cache
which were created using solver backend versions which do not match the
versions in the current environment. This can be run after an update to
clear out any old, unusable entries from the solver cache.

* `print_solver_cache` prints to the console all entries in the cache whose
SHA256 hashe keys start with the given hex string. Providing an empty string
m-yac marked this conversation as resolved.
Show resolved Hide resolved
results in all entries in the cache being printed.

* `print_solver_cache_stats` prints to the console statistics including the
size of the solver cache, where on disk it is stored, and some counts of how
often it has been used during the current session.

For performing more complicated database operations on the set of cached
results, the file `solver_cache.py` is provided with the Python bindings of the
SAW Remote API. This file implements a general-purpose Python interface for
interacting with the LMDB databases kept by SAW for solver caching.

Below is an example of using solver caching with `saw -v Debug`. Only the
relevant output is shown, the rest abbreviated with "...".

~~~~
sawscript> set_solver_cache_path "example.cache"
sawscript> prove_print z3 {{ \(x:[8]) -> x+x == x*2 }}
[22:13:00.832] Caching result: d1f5a76e7a0b7c01 (SBV 9.2, Z3 4.8.7 - 64 bit)
...
sawscript> prove_print z3 {{ \(new:[8]) -> new+new == new*2 }}
[22:13:04.122] Using cached result: d1f5a76e7a0b7c01 (SBV 9.2, Z3 4.8.7 - 64 bit)
...
sawscript> prove_print (w4_unint_z3_using "qfnia" []) \
{{ \(x:[8]) -> x+x == x*2 }}
[22:13:09.484] Caching result: 4ee451f8429c2dfe (What4 v1.3-29-g6c462cd using qfnia, Z3 4.8.7 - 64 bit)
...
sawscript> print_solver_cache "d1f5a76e7a0b7c01"
[22:13:13.250] SHA: d1f5a76e7a0b7c01bdfe7d0e1be82b4f233a805ae85a287d45933ed12a54d3eb
[22:13:13.250] - Result: unsat
[22:13:13.250] - Solver: "SBV->Z3"
[22:13:13.250] - Versions: SBV 9.2, Z3 4.8.7 - 64 bit
[22:13:13.250] - Last used: 2023-07-25 22:13:04.120351 UTC

sawscript> print_solver_cache "4ee451f8429c2dfe"
[22:13:16.727] SHA: 4ee451f8429c2dfefecb6216162bd33cf053f8e66a3b41833193529449ef5752
[22:13:16.727] - Result: unsat
[22:13:16.727] - Solver: "W4 ->z3"
[22:13:16.727] - Versions: What4 v1.3-29-g6c462cd using qfnia, Z3 4.8.7 - 64 bit
[22:13:16.727] - Last used: 2023-07-25 22:13:09.484464 UTC

sawscript> print_solver_cache_stats
[22:13:20.585] == Solver result cache statistics ==
[22:13:20.585] - 2 results cached in example.cache
[22:13:20.585] - 2 insertions into the cache so far this run (0 failed attempts)
[22:13:20.585] - 1 usage of cached results so far this run (0 failed attempts)
~~~~

## Other External Provers

In addition to the built-in automated provers already discussed, SAW
Expand Down
Binary file modified doc/manual/manual.pdf
Binary file not shown.
28 changes: 28 additions & 0 deletions intTests/test_solver_cache/test.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
set -e

# Testing the basic features of the solver cache
SAW_SOLVER_CACHE_PATH="test_solver_cache.cache" $SAW test_basics.saw

# Testing the `set_solver_cache_path` command as well as re-using a cache file
$SAW test_path_and_reuse.saw

# Testing cleaning the solver cache
pip install cbor2 python-dateutil lmdb
python3 -i ../../saw-remote-api/python/saw_client/solver_cache.py << END
RyanGlScott marked this conversation as resolved.
Show resolved Hide resolved
cache = SolverCache("test_solver_cache.cache")
for k,v in cache.items():
if 'SBV' in v.solver_versions:
v.solver_versions['SBV'] = '[OLD VERSION]'
cache[k] = v

END
$SAW test_clean.saw

# Testing that the envionment variable only creates the cache file when needed
rm -rf test_solver_cache.cache
SAW_SOLVER_CACHE_PATH="test_solver_cache.cache" $SAW test_env_var.saw
if [ -d "test_solver_cache.cache" ]; then
echo "FAILURE: Cache file created from SAW_SOLVER_CACHE_PATH when not used"; exit 1
else
echo "SUCCESS: Cache file not created from SAW_SOLVER_CACHE_PATH when not used"
fi
38 changes: 38 additions & 0 deletions intTests/test_solver_cache/test_basics.saw
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
// Testing the basic features of solver result caching

// The solver cache starts out empty
test_solver_cache_stats 0 0 0 0 0;

// The cache should now have one entry and one insertion
prove_print z3 {{ \(x:[64]) -> x == x }};
test_solver_cache_stats 1 0 0 1 0;

// Testing that cached results do not depend on variable names - thus, the
// cache should now have one more usage, but not a new entry or insertion
prove_print z3 {{ \(new_name:[64]) -> new_name == new_name }};
test_solver_cache_stats 1 1 0 1 0;

// Testing that cached results depend on the backend used - thus, the cache
// should now have one more entry and one more insertion, but not a new usage
prove_print (w4_unint_z3 []) {{ \(x:[64]) -> x == x }};
test_solver_cache_stats 2 1 0 2 0;

// Testing that cached results depend on the options passed to the given
// backend - thus, the cache should now have one more entry and one more
// insertion, but not a new usage
prove_print (w4_unint_z3_using "qfnia" []) {{ \(x:[64]) -> x == x }};
test_solver_cache_stats 3 1 0 3 0;

// Same as the above but for sat results

fails (prove_print z3 {{ \(x:[64])(y:[64]) -> x == y }});
test_solver_cache_stats 4 1 0 4 0;

fails (prove_print z3 {{ \(new_name_1:[64])(new_name_2:[64]) -> new_name_1 == new_name_2 }});
test_solver_cache_stats 4 2 0 4 0;

fails (prove_print w4 {{ \(x:[64])(y:[64]) -> x == y }});
test_solver_cache_stats 5 2 0 5 0;

fails (prove_print (w4_unint_z3_using "qfnia" []) {{ \(x:[64])(y:[64]) -> x == y }});
test_solver_cache_stats 6 2 0 6 0;
16 changes: 16 additions & 0 deletions intTests/test_solver_cache/test_clean.saw
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
// Testing cleaning the solver cache

set_solver_cache_path "test_solver_cache.cache";

// The cache still has entries from prior runs
test_solver_cache_stats 6 0 0 0 0;

// After cleaning, all SBV entries should be removed (see test.sh)
clean_solver_cache;
test_solver_cache_stats 4 0 0 0 0;

// After running test_ops, we should have all the original entries back,
// as many insertions as there were SBV entries, and as many usages as there
// were in test_path_second less the number of SBV entries
include "test_ops.saw";
test_solver_cache_stats 6 6 0 2 0;
3 changes: 3 additions & 0 deletions intTests/test_solver_cache/test_env_var.saw
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
// Testing setting the solver cache path through an envionment variable

prove_print assume_valid {{ \(x:[64]) -> x == x }};
8 changes: 8 additions & 0 deletions intTests/test_solver_cache/test_ops.saw
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
prove_print z3 {{ \(x:[64]) -> x == x }};
prove_print z3 {{ \(new_name:[64]) -> new_name == new_name }};
prove_print (w4_unint_z3 []) {{ \(x:[64]) -> x == x }};
prove_print (w4_unint_z3_using "qfnia" []) {{ \(x:[64]) -> x == x }};
fails (prove_print z3 {{ \(x:[64])(y:[64]) -> x == y }});
fails (prove_print z3 {{ \(new_name_1:[64])(new_name_2:[64]) -> new_name_1 == new_name_2 }});
fails (prove_print w4 {{ \(x:[64])(y:[64]) -> x == y }});
fails (prove_print (w4_unint_z3_using "qfnia" []) {{ \(x:[64])(y:[64]) -> x == y }});
12 changes: 12 additions & 0 deletions intTests/test_solver_cache/test_path_and_reuse.saw
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
// Testing the `set_solver_cache_path` command as well as re-using a cache file

set_solver_cache_path "test_solver_cache.cache";

// The cache still has entries from the last run
test_solver_cache_stats 6 0 0 0 0;

// After running test_path_ops, we should have the same number of entries, but
// no insertions and and as many usages as there were insertions plus usages
// the first time
include "test_ops.saw";
test_solver_cache_stats 6 8 0 0 0;
12 changes: 12 additions & 0 deletions saw-core/src/Verifier/SAW/SATQuery.hs
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,12 @@ module Verifier.SAW.SATQuery
, SATResult(..)
, SATAssert(..)
, satQueryAsTerm
, satQueryAsPropTerm
) where

import Data.Map (Map)
import Data.Set (Set)
import Data.Foldable (foldrM)

import Verifier.SAW.Name
import Verifier.SAW.FiniteValue
Expand Down Expand Up @@ -87,3 +89,13 @@ satQueryAsTerm sc satq =
do x' <- scAnd sc x y
loop x' xs
loop _ (UniversalAssert{} : _) = univFail

-- | Compute the conjunction of all the assertions
-- in this SAT query as a single term of type Prop.
satQueryAsPropTerm :: SharedContext -> SATQuery -> IO Term
satQueryAsPropTerm sc satq =
scTupleType sc =<< mapM assertAsPropTerm (satAsserts satq)
where assertAsPropTerm (BoolAssert b) = scEqTrue sc b
assertAsPropTerm (UniversalAssert ecs hs g) =
scGeneralizeExts sc (map fst ecs) =<<
scEqTrue sc =<< foldrM (scImplies sc) g hs
5 changes: 5 additions & 0 deletions saw-remote-api/python/CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,10 @@
# Revision history for saw-client

## 1.0.1 -- YYYY-MM-DD

* Add `solver_cache.py` implementing an interface for interacting with SAW
solver cache databases.

## 1.0.0 -- 2023-06-26

* The v1.0.0 release is made in tandem with the SAW 1.0 release. See the
Expand Down
Loading
Loading