HolSmt: implement div and mod, fix proof replay, fix translation #1210

someplaceguy · 2024-03-12T13:36:14Z

Hi!

This PR implements support for DIV, MOD, $/, $%, quot and rem, i.e. all div and mod operators for num and int. It also enables proof replay for all the self-tests that contained these operators. No changes were made outside HolSmt.

In total, 78 more self-tests are now passing (with Z3 v4 proof replay). In fact, I believe all arithmetic test cases are now passing, including with proof replay, with the exception of the word tests (and one which I added and mention below).

The support for the div and mod operators were implemented using the following approach:

First, I defined two functions in HolSmtTheory called smtdiv and smtmod, which implement the semantics of SMT-LIB's integer division and modulo operators.
- These two operators have different definitions than all of HOL4's div and mod operators; however smtdiv was defined based on $/ and smtmod based on $% for simplicity.
A test case was added to selftests.sml which verifies that smtdiv and smtmod exactly match the SMT-LIB definition.
- Initially, I was planning on manually proving this theorem, but it appears to be quite laborious to do so.
- However, as always, Z3 can prove this easily (hence why this was proved in selftests.sml).
- Unfortunately, we can't reconstruct this proof (because one of the proof steps seems to rely on nonlinear arithmetic), so the proof is not really meaningful...
Theorems were added to HolSmtTheory which define HOL4's div and mod operators in terms of smtdiv and smtmod (these are easier to prove). These theorems are now added to the proving context by default, so that SMT solvers can translate HOL4's div and mod operators into the equivalent SMT-LIB div and mod operators which they natively support.
SMT-LIB translation and parser support were added for smtdiv and smtmod.
Support was added to the linear arithmetic proof handlers so that they can prove goals that have smtdiv and smtmod (by rewriting with their definitions into equivalent HOL4 $/ and $% operations).

This PR also implements the following:

Fixed the translation of let ... and ... in ..., i.e. the one with parallel definitions, into the SMT-LIB language. This translation had never actually worked before, only let ... ; ... in ... was working properly.
Fixed slowness (more like an apparent hang) due to the workaround for "intLib.ARITH_PROVE 0n = x * Num 0i" fails unexpectedly #1203. Apparently, COOPER_PROVE is much slower than ARITH_PROVE in some cases, so now we try the latter before falling back to the former.
Major improvements in the arithmetic proving procedures:
- Implemented a workaround for intLib.ARITH_PROVE raises exception NotFound #1209.
- Implemented some workarounds for intLib.{ARITH,COOPER}_PROVE can't prove certain goals #1207.
- The generalize_ite procedure was removed, as it seems to be unnecessary and was causing some arithmetic proof steps to fail.
- As mentioned above, added support for smtdiv and smtmod.
- Since both the th_lemma_arith and rewrite arithmetic proof handlers need the above fixes, these arithmetic proving procedures now use a common implementation.
Implemented a workaround for a Z3 bug where it adds an hypothesis but then forgets to discharge it.
- Fortunately, such hypotheses always seem to be of the form P = P, so it's easy to discharge them ourselves.
- However, I am planning to file a Z3 bug (or even a PR) once I get an opportunity to debug this issue.
Implemented a handler for the iff-false Z3 proof inference rule.

I'm happy with the results, although I was a bit surprised by how hard it was to fix proof replay for these tests. I was expecting Z3 to throw more curve-balls, but this time Z3 did not give me much trouble.

Instead, the major problems were the limitations in the HOL4 integer arithmetic decision procedures, especially #1207. I implemented some workarounds (i.e. I try to prove the goals using some additional tactics, including arithmetic simplifications) but I don't expect these to be reliable and in fact, if I change some theorems around a bit I do encounter some linear arithmetic steps which HOL4 cannot prove automatically even with my workarounds.

Anyway, I think this concludes my work for the near future, as the next issues to tackle are quite hard to solve.

Thanks!

cc @tjark

The SMT-LIB term translator couldn't handle parallel definitions properly. This commit fixes that.

The previous workaround for ARITH_PROVE failing to prove some terms changed the `th_lemma_arith` proof handler to use COOPER_PROVE instead. However, COOPER_PROVE sometimes is too slow and appears to hang when replaying some Z3 proofs. This commit changes the `th_lemma_arith` handler to first try ARITH_PROVE which is much faster, and only if it fails, then try COOPER_PROVE.

SMT_NUM_ADD and SMT_NUM_SUB were meant to be used in place of INT_ADD and INT_NUM_SUB for optimization reasons. However, this causes a regression in proof replay in one of the self-tests. Since this regression is not straightforward to fix, these theorems are not being used yet. NUM_OF_INT was also added to the default set of theorems since it helps solve some `num`-related goals.

Unfortunately, these operators don't seem to be documented. However, from a quick reading of the Z3 source code, it seems that they seem to be similar to `div` and `mod`, but they seem to be used to indicate that Z3 can't prove that the divisor is non-zero.

The `th_lemma_arith` was running into several issues. The following changes were made: 1. The `generalize_ite` procedure was removed, since the arithmetic decision procedures can handle `if ... then ... else ...` now. It was causing replay failures because some terms were not provable unless the decision procedures could inspect the `if` bodies. 2. The arithmetic decision procedures don't understand `smtdiv` and `smtmod`; so before proving the term, we now rewrite it with the definitions of these functions. 3. A workaround was also implemented for some instances of the following issue: HOL-Theorem-Prover#1207 Namely, the integer decision procedures cannot decide some terms containing `ABS` and `Num`, so we rewrite them when possible.

This commit contains a workaround for the following issue: HOL-Theorem-Prover#1209

In some proof certificates, Z3 emits a `hypothesis` inference rule without discharging it later with a `lemma` rule. This causes some proofs to fail at the end of the proof replay procedure, because they end up containing an extra hypothesis. Fortunately, so far all such undischarged hypothesis rules were of the form `p = p`, so for now we can easily work around this issue by creating such theorems without adding an assumption. This workaround allows us to enable proof replay for 5 more self-tests.

Since these two decision procedures need to handle the same kind of terms (including smtdiv / smtmod rewrites), this commit merges them.

This commit adds some tactics which help prove some arithmetic proof steps that intLib.{ARITH,COOPER}_TAC can't deal with. These are basically some workarounds for: HOL-Theorem-Prover#1207 This allows us to enable proof replay for the remaining arithmetic test cases that didn't already have proof replay enabled, with the exception of the very last test case (because we can't replay nonlinear arithmetic proof steps). That said, I don't expect these workarounds to be very reliable. Ideally, it would be better to improve HOL4's linear arithmetic decision procedures.

someplaceguy · 2024-03-12T14:56:34Z

Added one commit to update the HolSmt documentation regarding recent improvements.

mn200 · 2024-03-13T00:04:09Z

Can you put the Euclidean div and mod constants (perhaps call them ediv and emod) into integerScript.sml please?

someplaceguy · 2024-03-13T12:31:10Z

Can you put the Euclidean div and mod constants (perhaps call them ediv and emod) into integerScript.sml please?

Sure, but how would you like me to do that? Just add those definitions to integerScript.sml as they are (except for the renaming)? Anything else?

For reference, I took a look at how int_quot and int_rem are defined and there seems to be a lot going on there, e.g.:

int_quot is actually defined based on the natural numbers (instead of being defined based on int_div like I did). This would be easy enough to do.
int_quot (and int_div) are constant specifications instead of constant definitions (which require proving a theorem, although that seems quite doable). Should I use constant specifications as well?
Infix constants are also defined, although this seems trivial.
Some quot-related theorems are proved. I guess I could try to prove similar ones, although it would take me some time... However, INT_QUOT_UNIQUE seems incredibly daunting to prove...
rem also has quite a few theorems and a few daunting ones as well...
Furthermore, these two operators also have theorems used for rewrites. I think I understand the *_REDUCE theorems but not the INT_{QUOT,REM}_CALCULATE ones.
These operators also have syntax manipulation functions defined in intSyntax.{sig,sml} (this seems trivial to add).
For completion I assume simplification rules are added somewhere, although I can't figure out where?
And then presumably these operators have support in the omega and cooper decision procedures, although I can't even find the string "quot" in there? (And even if I did, I doubt I'd be able to figure out how to add support for ediv and emod there...).

So, which of the above should I do?

binghe · 2024-03-13T23:01:55Z

Can you put the Euclidean div and mod constants (perhaps call them ediv and emod) into integerScript.sml please?

Sure, but how would you like me to do that? Just add those definitions to integerScript.sml as they are (except for the renaming)? Anything else?

For reference, I took a look at how int_quot and int_rem are defined and there seems to be a lot going on there, e.g.:

int_quot is actually defined based on the natural numbers (instead of being defined based on int_div like I did). This would be easy enough to do.

int_quot (and int_div) are constant specifications instead of constant definitions (which require proving a theorem, although that seems quite doable). Should I use constant specifications as well?

Infix constants are also defined, although this seems trivial.

Some quot-related theorems are proved. I guess I could try to prove similar ones, although it would take me some time... However, INT_QUOT_UNIQUE seems incredibly daunting to prove...

rem also has quite a few theorems and a few daunting ones as well...

Furthermore, these two operators also have theorems used for rewrites. I think I understand the *_REDUCE theorems but not the INT_{QUOT,REM}_CALCULATE ones.

These operators also have syntax manipulation functions defined in intSyntax.{sig,sml} (this seems trivial to add).

For completion I assume simplification rules are added somewhere, although I can't figure out where?

And then presumably these operators have support in the omega and cooper decision procedures, although I can't even find the string "quot" in there? (And even if I did, I doubt I'd be able to figure out how to add support for ediv and emod there...).

So, which of the above should I do?

I think @mn200 simply meant that you can move the definitions of smtdiv and smtmod (renamed to ediv and emod) and related theorems (from SMT_NUM_ADD to SMT_INT_REM), i.e. all you added into HolSmtScript.sml, to integerScript.sml. In this way, the definitions and theorems can be reused by potential other users in the future.

mn200 · 2024-03-14T00:00:22Z

@binghe has it right; what you have already is a good starting point for future work, and it would be better to have that work happen in integerTheory.

The previously-implemented workaround might have an issue where if Z3 introduces an hypothesis of the form `P = P` but then *doesn't* forget to discharge it, we might fail to replay the `lemma` proof rule because we avoided adding an assumption. Instead, this commit fixes the workaround so that we remove the extra hypothesis at the end of the proof replay procedure, i.e. only when we are sure that Z3 *did* forget to discharge the hypothesis.

Now that integerTheory contains the definitions for Euclidean div and mod, let's use that instead of redefining these operators in HolSmtTheory.

Now that HOL-Theorem-Prover#1209 is fixed, we can remove the workaround for it in HolSmt.

HOL4's integer arithmetic decision procedures cannot handle `quot` and `rem`, so proof steps containing these symbols were failing to replay. This commit rewrites goals containing these symbols to increase the chances of a successful replay. `thm_AUTO` was similarly improved, as it had the same limitation and was failing to prove some of the self-tests.

someplaceguy · 2024-03-14T17:40:55Z

Added 7 more commits:

Fix a possible bug in a previously implemented workaround for a Z3 bug where a hypothesis is introduced but not discharged. Instead of avoiding to introduce the assumption, now HolSmt will get rid of it at the end of the proof replay procedure. This should make proof replay more robust, as it is possible that Z3 might introduce a hypothesis of the form P = P and then not forget to discharge it, in which case lemma might require the hypothesis to be there.
Moved the Euclidean div and mod definitions to integerTheory, as well as the theorems that had been added to HolSmt, as requested by @binghe and @mn200.
Changed HolSmt to use integerTheory's definition of ediv and emod, as well as the theorems that were moved there, thus removing the definitions and theorems from HolSmt.
Removed workaround for intLib.ARITH_PROVE raises exception NotFound #1209 as it is now fixed (thank you @mn200!)
Fixed replay of goals containing quot and rem due to intLib.{ARITH,COOPER}_PROVE can't prove certain goals #1207.
Add 73 new tests exercising quot and rem, all with Z3 proof replay enabled. Perplexingly, I had forgotten to add these.
Add some simps to thm_AUTO, which help make advances in proving the last self-test but do not yet allow it to succeed.

Note: I wasn't sure what to name the newly added definitions and theorems in integerTheory, nor where to put them exactly, so let me know if you'd like me to make any further changes.

Thanks!

In case `rewrite` proof rules ever come to have indices, let's ignore them instead of erroring out.

someplaceguy · 2024-03-15T12:05:54Z

Added one more commit which ignores indices in rewrite proof rules instead of failing to parse the proof certificates with an error. This way we don't fail to parse proof certificates in case e.g. a future version of Z3 adds an index to these proof rules.

someplaceguy · 2024-03-17T12:07:24Z

Added one more commit which partially reverts a previous one, which had added a change that was actually unnecessary.

I realized this was unnecessary after thinking about it (and realizing it didn't make much sense), however I don't understand why I introduced that change in the first place, since I cannot reproduce those test failures and nothing relevant was changed in the mean time...

It should not be necessary to rewrite these symbols with their definitions, because from the perspective of SMT solvers, these should just be functions like any others. Therefore, when replaying arithmetic proof steps, the proof handler should not need to understand what `quot` and `rem` are. This is a partial revert of the following commit: 6220a17 It's not clear to me why I introduced this change in the first place, since the tests do work fine without it. Also, there doesn't seem to be any performance regression.

Add support for parsing `declare-const` commands to the SMT-LIB parser.

The SMT-LIB file parser was ending up raising an end-of-stream exception unless the `(exit)` command was present in the SMT-LIB file, even though the `(exit)` command does not seem to be mandatory according to the SMT-LIB v2.6 standard.

someplaceguy · 2024-03-18T17:23:37Z

Added 2 more commits, which fix issues when parsing SMT-LIB commands.

This part of the parser is used only for importing SMT-LIB problems into HOL4, which is useful for testing HolSmt but is not the main use case of HolSmt.

This commit fixes a parsing error when parsing arithmetic `th-lemma` inference rules with an `assign-bounds` parameter, i.e.: ((_ th-lemma arith assign-bounds ...) ...) The error was due to `assign-bounds` not being in the known term dictionary.

someplaceguy · 2024-03-18T17:43:22Z

Added one more commit with a simple fix, this time fixing an error when parsing some Z3 proof certificates.

Now that all known issues were fixed wrt. ARITH_PROVE not proving goals that COOPER_PROVE could, we can remove this workaround, thus making proof replay faster in some cases.

someplaceguy · 2024-03-25T14:35:48Z

Added one commit to remove the workaround for #1203, since it's no longer necessary. Thanks @mn200!

mn200 · 2024-04-09T02:30:21Z

Thanks for all of this work!

someplaceguy added 13 commits March 12, 2024 12:40

HolSmt: fix let ... and ... SMT-LIB translation

ac82358

The SMT-LIB term translator couldn't handle parallel definitions properly. This commit fixes that.

HolSmt: add support for div and mod

1c03152

HolSmt: enable DIV tests

b6330a8

HolSmt: implement iff-false Z3 proof rule

62e89d7

HolSmt: enable the remaining div and mod tests

064d497

HolSmt: work around NotFound exn in ARITH_PROVE

85cde15

This commit contains a workaround for the following issue: HOL-Theorem-Prover#1209

HolSmt: merge z3_lemma_arith and z3_rewrite's arith

5db3618

Since these two decision procedures need to handle the same kind of terms (including smtdiv / smtmod rewrites), this commit merges them.

someplaceguy force-pushed the u/divmod branch from 84672bf to 351fe74 Compare March 12, 2024 14:03

HolSmt: update documentation

898d8f5

someplaceguy added 7 commits March 14, 2024 02:24

Add Euclidean div and mod, plus some num/int theorems

d72e1ed

HolSmt: use integerTheory's ediv and emod

064412b

Now that integerTheory contains the definitions for Euclidean div and mod, let's use that instead of redefining these operators in HolSmtTheory.

HolSmt: remove workaround for NotFound exn in ARITH_TAC

caecc7a

Now that HOL-Theorem-Prover#1209 is fixed, we can remove the workaround for it in HolSmt.

HolSmt: add full battery of quot and rem tests

e14320c

HolSmt: add let, ediv and emod simps to thm_AUTO

4047092

HolSmt: ignore indices in rewrite proof rules

9d9dfa2

In case `rewrite` proof rules ever come to have indices, let's ignore them instead of erroring out.

someplaceguy force-pushed the u/divmod branch from 9d0d002 to 60269d4 Compare March 17, 2024 12:09

someplaceguy added 2 commits March 18, 2024 17:20

HolSmt: parse declare-const commands

044e6ad

Add support for parsing `declare-const` commands to the SMT-LIB parser.

HolSmt: remove workaround for HOL-Theorem-Prover#1203

d8c49f8

Now that all known issues were fixed wrt. ARITH_PROVE not proving goals that COOPER_PROVE could, we can remove this workaround, thus making proof replay faster in some cases.

mn200 merged commit 84f2039 into HOL-Theorem-Prover:develop Apr 9, 2024
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

HolSmt: implement div and mod, fix proof replay, fix translation #1210

HolSmt: implement div and mod, fix proof replay, fix translation #1210

someplaceguy commented Mar 12, 2024 •

edited

Loading

someplaceguy commented Mar 12, 2024

mn200 commented Mar 13, 2024

someplaceguy commented Mar 13, 2024 •

edited

Loading

binghe commented Mar 13, 2024

mn200 commented Mar 14, 2024

someplaceguy commented Mar 14, 2024 •

edited

Loading

someplaceguy commented Mar 15, 2024

someplaceguy commented Mar 17, 2024

someplaceguy commented Mar 18, 2024

someplaceguy commented Mar 18, 2024

someplaceguy commented Mar 25, 2024 •

edited

Loading

mn200 commented Apr 9, 2024

HolSmt: implement div and mod, fix proof replay, fix translation #1210

HolSmt: implement div and mod, fix proof replay, fix translation #1210

Conversation

someplaceguy commented Mar 12, 2024 • edited Loading

someplaceguy commented Mar 12, 2024

mn200 commented Mar 13, 2024

someplaceguy commented Mar 13, 2024 • edited Loading

binghe commented Mar 13, 2024

mn200 commented Mar 14, 2024

someplaceguy commented Mar 14, 2024 • edited Loading

someplaceguy commented Mar 15, 2024

someplaceguy commented Mar 17, 2024

someplaceguy commented Mar 18, 2024

someplaceguy commented Mar 18, 2024

someplaceguy commented Mar 25, 2024 • edited Loading

mn200 commented Apr 9, 2024

someplaceguy commented Mar 12, 2024 •

edited

Loading

someplaceguy commented Mar 13, 2024 •

edited

Loading

someplaceguy commented Mar 14, 2024 •

edited

Loading

someplaceguy commented Mar 25, 2024 •

edited

Loading