Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proper SMT CHC tests #524

Closed
wants to merge 9 commits into from
Closed

Proper SMT CHC tests #524

wants to merge 9 commits into from

Conversation

leonardoalt
Copy link
Member

@leonardoalt leonardoalt commented May 20, 2021

Depends on ethereum/solidity#11421

This PR adds proper SMTChecker tests. Currently, the SMTChecker tests are actually broken, many tests are ignored, and the smtlib version of the SMTCheckre encoding is not being tests at all.

The main smtCheckerTests in this code are mainly used by the Solidity t_ems_solcjs job, which copies the smtCheckerTests into the solc-js tests and runs test/smtcallback.js from this repo. It now requires soljson version >=0.7.2 because of the warning format.

If soljson < 0.8.5, it runs the static z3 inside soljson and compares that to the test expectation.
Else, it runs /\ plus a local z3 binary using Solidity's smtlib2 encoding, and compares all of them.
The comparison now is much more precise than before: it compares each output location, which represents a property.

Note that safe properties are not reported, so if one solver reports a location and another doesn't, there's an inconsistency. The exception to this is when this inconsistency happens for the BMC engine, where one solver probably solved it safe with CHC, and another solver couldn't prove it safe with CHC, and reports an unsafe false positive via BMC.

The test only fails if there's a real inconsistency, as opposed to the stronger soltest. I think that's fair since the JS/smtlib2 environment might give more nondeterministic answers compared to the embedded C++ z3.

@coveralls
Copy link

coveralls commented May 20, 2021

Coverage Status

Coverage decreased (-5.6%) to 80.252% when pulling 91068be on smt_proper_tests into 1a457d5 on master.

function expectErrors (expectations, errors, ignoreCex) {
if (errors.length !== expectations.length) {
return false;
function collectFiles (testdir) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just extracted from the main loop

return sources;
}

function createTests (sources, st) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted from the main loop

return solOutput.errors.filter(error => !error.message.includes('This is a pre-release')).map(error => error.message);
}

function expectStringErrors (a, b) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two functions have the same behavior as before and are only used by the first test.

@leonardoalt
Copy link
Member Author

I really don't understand this coverage tool.

@leonardoalt leonardoalt marked this pull request as ready for review May 21, 2021 13:48
@leonardoalt leonardoalt requested a review from axic May 21, 2021 13:54
@leonardoalt leonardoalt force-pushed the smt_proper_tests branch 2 times, most recently from bd4d84d to f2b9ac3 Compare May 21, 2021 15:10
@cameel
Copy link
Member

cameel commented Sep 3, 2021

@leonardoalt Is this still work in progress or is it just waiting to be reviewed?

@leonardoalt
Copy link
Member Author

Waiting to be reviewed

smtsolver.js Outdated

var potentialSolvers = [
{
name: 'z3',
params: '-smt2 rlimit=20000000 rewriter.pull_cheap_ite=true fp.spacer.q3.use_qgen=true fp.spacer.mbqi=false fp.spacer.ground_pobs=false'
params: '-smt2 timeout=' + timeout + ' rewriter.pull_cheap_ite=true fp.spacer.q3.use_qgen=true fp.spacer.mbqi=false fp.spacer.ground_pobs=false'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why time limit and not resource limit?

Ah this is just so that both cvc4/z3 have the same limit.

@@ -44,7 +45,9 @@ function solve (query) {
if (
!solverOutput.startsWith('sat') &&
!solverOutput.startsWith('unsat') &&
!solverOutput.startsWith('unknown')
!solverOutput.startsWith('unknown') &&
!solverOutput.startsWith('(error') &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What outputs (?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, Eldarica outputs errors in some sort of smtlib sexpr, like (error something failed lalala) which is caught here

Copy link
Member

@axic axic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs rebase.

@axic
Copy link
Member

axic commented Jan 19, 2022

I suppose #565 needs to be merged first?

@leonardoalt
Copy link
Member Author

Ah yes, this branch is all over the place. Will need to rethink it after the other one gets merged

@axic axic marked this pull request as draft January 19, 2022 21:06
@leonardoalt
Copy link
Member Author

Closing cus need to rewrite it anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: Review
Development

Successfully merging this pull request may close these issues.

4 participants