Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected performance divergence on identical inputs between macOS and Linux #7457

Closed
rod-chapman opened this issue Nov 19, 2024 · 14 comments
Closed

Comments

@rod-chapman
Copy link

Running Z3 4.13.3 on Apple Silicon macOS, and various Linux platforms, I see divergence in behaviour and a sharp performance penalty on macOS.

Input file is the same.
Z3 is 4.13.3 on all platforms.
Parameters (particularly all random seeds) are at their default values and identical on all platforms.

This causes pain for us, since our main development is done on macOS, but our CI runs are done on Linux.

I will post the inout file and result in following comments.

@rod-chapman
Copy link
Author

Here is the input SMT2 file. (This was generated by CBMC 6.4.0, but that shouldn't matter.)
p.smt2.gz

@rod-chapman
Copy link
Author

Here is the result of running

z3 -st -v:3 p.smt2

on a x86_64 (EC2 c7i) machine, running Ubuntu 24.04:

$ z3 -st -v:3 p.smt2
(smt.searching)
(smt.stats :restarts     :decisions :clauses/bin/units           :simplify  :memory)
(smt.stats        :conflicts    :propagations            :lemmas      :deletions   )
(smt.stats    0      0      0      0 179714/85520/1289       0/0     0    0  101.08)
(smt.stats    0    101    792 327819 190693/85520/22913      98/9     2    8  108.90)
(smt.stats    1    478   2001 615122 179710/85520/22916     468/11     3   12  109.09)
(smt.stats    2    816  36603 1012441 179803/85582/22932     739/14     6   64  117.43)
(smt.stats    3   1234 146833 1993594 150901/86330/41382    1088/59    14 35987  119.68)
(smt.searching)
(smt.stats :restarts     :decisions :clauses/bin/units   :simplify  :memory)
(smt.stats        :conflicts    :propagations    :lemmas      :deletions   )
(smt.stats    0      0      0      0   334/0/0       0/0     0    0  123.19)
(smt.stats    0     12     62    478   254/0/0      11/6     1   86  123.19)
(smt.searching)
(smt.stats    0     12     62    478   249/0/0      11/6     1   86  124.28)
(smt.stats    0     13     69    548   134/0/0      10/5     2  208  124.28)
(smt.stats    3   1234 146833 1993594 150901/86330/41382    1088/59    14 35987  124.28)
(smt.working :conflicts 3498 :agility 0.0654953)
(smt.stats    4   3634 192314 2157797 154630/86459/41407    3486/65    15 35988  127.31)
(smt.working :conflicts 4634 :agility 0.0493627)
(smt.delete-inactive-clauses  :num-deleted-clauses 809)
(smt.stats    5   5337 199883 2862832 152369/87223/41442    4367/90    17 36141  128.18)
(smt.working :conflicts 6337 :agility 0.249308)
(smt.working :conflicts 7338 :agility 0.00291438)
(smt.stats :restarts :decisions    :clauses/bin/units              :simplify    :memory)
(smt.stats     :conflicts  :propagations                :lemmas       :deletions       )
(smt.stats    6   7891 220139 5561825 153169/87531/41442    6915/91    18 36148  129.06)
(smt.working :conflicts 8891 :agility 0.116243)
(smt.working :conflicts 9892 :agility 0.162354)
(smt.delete-inactive-clauses  :num-deleted-clauses 340)
(smt.working :conflicts 10893 :agility 0.0150338)
unsat
(:added-eqs               346704
 :array-ax1               6
 :array-ax2               9894
 :binary-propagations     8236913
 :bv-bit2core             537605
 :bv-diseqs               202
 :bv-dynamic-diseqs       184
 :bv-dynamic-eqs          769
 :bv->core-eq             228720
 :conflicts               11514
 :datatype-accessor-ax    6
 :datatype-constructor-ax 5
 :decisions               642733
 :del-clause              4018044
 :elim-unconstrained      7
 :final-checks            1
 :max-generation          2
 :max-memory              136.66
 :memory                  66.51
 :minimized-lits          198372
 :mk-bool-var             1849595
 :mk-clause               4087719
 :mk-clause-binary        88897
 :num-allocs              1717131536
 :num-checks              1
 :propagations            17840115
 :quant-instantiations    25869
 :restarts                7
 :rlimit-count            34130376
 :solve-eqs-elim-vars     618
 :solve-eqs-steps         1001
 :time                    5.14
 :total-time              5.22)

@rod-chapman
Copy link
Author

Here is the same on Graviton3/Amazon Linux:

$ z3 -st -v:3 p.smt2
(smt.searching)
(smt.stats :restarts     :decisions :clauses/bin/units           :simplify  :memory)
(smt.stats        :conflicts    :propagations            :lemmas      :deletions   )
(smt.stats    0      0      0      0 179668/85474/1289       0/0     0    0  101.02)
(smt.stats    0    101  50459 129140 192428/85474/23086      97/20     1    8  117.81)
(smt.stats    1    192  69861 279910 183164/86006/23087     183/21     2  379  118.52)
(smt.searching)
(smt.stats :restarts     :decisions :clauses/bin/units   :simplify  :memory)
(smt.stats        :conflicts    :propagations    :lemmas      :deletions   )
(smt.stats    0      0      0      0   307/0/0       0/0     0    0  122.00)
(smt.stats    0     11     31    325   227/0/0      11/6     1   80  122.00)
(smt.searching)
(smt.stats    0     11     31    325   228/0/0      11/6     1   80  123.09)
(smt.stats    0     14     41    463     9/0/0       0/0     2  310  123.09)
(smt.stats    1    192  69861 279910 183164/86006/23087     183/21     2  379  123.09)
(smt.stats    2    343  71517 688342 183548/86317/23318     327/34     4  379  123.29)
(smt.stats    3    569  99158 1222648 158273/86439/41546     510/78     8 36144  123.75)
(smt.stats    4    907 153478 1974034 157134/86813/42566     821/104    12 38101  125.05)
(smt.stats    5   1413 200325 2772417 145535/88127/42665    1243/115    18 38329  127.41)
(smt.stats :restarts :decisions    :clauses/bin/units               :simplify    :memory)
(smt.stats     :conflicts  :propagations                :lemmas        :deletions       )
(smt.stats    6   2171 523778 4258835 180718/88541/43086    1619/136    28 39274  131.34)
(smt.working :conflicts 3171 :agility 0.0903311)
(smt.stats    7   3307 561997 4616992 154405/88786/43095    2721/146    30 39302  132.01)
(smt.working :conflicts 4307 :agility 0.274754)
(smt.delete-inactive-clauses  :num-deleted-clauses 262)
unsat
(:added-eqs               259731
 :array-ax1               6
 :array-ax2               3497
 :binary-propagations     4686113
 :bv-bit2core             518002
 :bv-diseqs               204
 :bv-dynamic-diseqs       90
 :bv-dynamic-eqs          517
 :bv->core-eq             164286
 :conflicts               5368
 :datatype-accessor-ax    6
 :datatype-constructor-ax 5
 :decisions               569888
 :del-clause              2619454
 :elim-unconstrained      7
 :final-checks            1
 :max-generation          2
 :max-memory              132.77
 :memory                  66.39
 :minimized-lits          77222
 :mk-bool-var             1247433
 :mk-clause               2681159
 :mk-clause-binary        88906
 :num-allocs              1331720387
 :num-checks              1
 :propagations            10013527
 :quant-instantiations    17188
 :restarts                8
 :rlimit-count            25178340
 :solve-eqs-elim-vars     618
 :solve-eqs-steps         1001
 :time                    5.92
 :total-time              6.05)

Note similar times and counts, but not identical.

@rod-chapman
Copy link
Author

Here is the result of macOS:

% z3 -st -v:3 p.smt2              
(smt.searching)
(smt.stats :restarts     :decisions :clauses/bin/units           :simplify  :memory)
(smt.stats        :conflicts    :propagations            :lemmas      :deletions   )
(smt.stats    0      0      0      0 179667/85473/1289       0/0     0    0  102.04)
(smt.stats    0    101    111  86322 189475/85473/22910     101/3     1    8  109.47)
(smt.working :conflicts 3080 :agility 0.794001)
(smt.working :conflicts 4216 :agility 0.771673)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 5919 :agility 0.786474)
(smt.working :conflicts 6920 :agility 0.786899)
(smt.working :conflicts 8473 :agility 0.785214)
(smt.working :conflicts 9474 :agility 0.831008)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 10475 :agility 0.734887)
(smt.working :conflicts 12303 :agility 0.732449)
(smt.working :conflicts 13304 :agility 0.61909)
(smt.working :conflicts 14305 :agility 0.712033)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 15306 :agility 0.79262)
(smt.working :conflicts 16307 :agility 0.786681)
(smt.working :conflicts 18047 :agility 0.70489)
(smt.working :conflicts 19048 :agility 0.710255)
(smt.delete-inactive-clauses  :num-deleted-clauses 13981)
(smt.working :conflicts 20049 :agility 0.716172)
(smt.working :conflicts 21050 :agility 0.750881)
(smt.working :conflicts 22051 :agility 0.722805)
(smt.working :conflicts 23052 :agility 0.731096)
(smt.working :conflicts 24053 :agility 0.749196)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 25054 :agility 0.753066)
(smt.working :conflicts 26662 :agility 0.69029)
(smt.working :conflicts 27663 :agility 0.726753)
(smt.working :conflicts 28664 :agility 0.725181)
(smt.working :conflicts 29665 :agility 0.734521)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 30666 :agility 0.669789)
(smt.working :conflicts 31667 :agility 0.775067)
(smt.working :conflicts 32668 :agility 0.744907)
(smt.working :conflicts 33669 :agility 0.810971)
(smt.working :conflicts 34670 :agility 0.672849)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 35671 :agility 0.689312)
(smt.working :conflicts 36672 :agility 0.698483)
(smt.working :conflicts 37673 :agility 0.720017)
(smt.working :conflicts 39584 :agility 0.804783)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 40585 :agility 0.686093)
(smt.working :conflicts 41586 :agility 0.736974)
(smt.working :conflicts 42587 :agility 0.693841)
(smt.working :conflicts 43588 :agility 0.786795)
(smt.working :conflicts 44589 :agility 0.694781)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 45590 :agility 0.801162)
(smt.working :conflicts 46591 :agility 0.611343)
(smt.working :conflicts 47592 :agility 0.722808)
(smt.working :conflicts 48593 :agility 0.725507)
(smt.working :conflicts 49594 :agility 0.727637)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 50595 :agility 0.503988)
(smt.working :conflicts 51596 :agility 0.53262)
(smt.working :conflicts 52597 :agility 0.577048)
(smt.working :conflicts 53598 :agility 0.668867)
(smt.working :conflicts 54599 :agility 0.557091)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 55600 :agility 0.58278)
(smt.working :conflicts 56601 :agility 0.674257)
(smt.working :conflicts 57602 :agility 0.607007)
(smt.working :conflicts 58966 :agility 0.667062)
(smt.working :conflicts 59967 :agility 0.779397)
(smt.delete-inactive-clauses  :num-deleted-clauses 38545)
(smt.working :conflicts 60968 :agility 0.69406)
(smt.working :conflicts 61969 :agility 0.799816)
(smt.working :conflicts 62970 :agility 0.54312)
(smt.working :conflicts 63971 :agility 0.636905)
(smt.working :conflicts 64972 :agility 0.70691)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 65973 :agility 0.635898)
(smt.working :conflicts 66974 :agility 0.713508)
(smt.working :conflicts 67975 :agility 0.747895)
(smt.working :conflicts 68976 :agility 0.642133)
(smt.working :conflicts 69977 :agility 0.800571)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 70978 :agility 0.697154)
(smt.working :conflicts 71979 :agility 0.770341)
(smt.working :conflicts 72980 :agility 0.711087)
(smt.working :conflicts 73981 :agility 0.766209)
(smt.working :conflicts 74982 :agility 0.703345)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 75983 :agility 0.676925)
(smt.working :conflicts 76984 :agility 0.796962)
(smt.working :conflicts 77985 :agility 0.830494)
(smt.working :conflicts 78986 :agility 0.7152)
(smt.working :conflicts 79987 :agility 0.806585)
(smt.delete-inactive-clauses  :num-deleted-clauses 3)
(smt.working :conflicts 80988 :agility 0.712342)
(smt.working :conflicts 81989 :agility 0.772377)
(smt.working :conflicts 82990 :agility 0.739637)
(smt.working :conflicts 83991 :agility 0.626813)
(smt.working :conflicts 84992 :agility 0.731371)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 85993 :agility 0.757868)
(smt.working :conflicts 86994 :agility 0.755909)
(smt.working :conflicts 88038 :agility 0.761622)
(smt.working :conflicts 89039 :agility 0.837419)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 90040 :agility 0.814532)
(smt.working :conflicts 91041 :agility 0.831237)
(smt.working :conflicts 92042 :agility 0.724282)
(smt.working :conflicts 93043 :agility 0.840824)
(smt.working :conflicts 94044 :agility 0.854452)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 95045 :agility 0.91069)
(smt.working :conflicts 96046 :agility 0.731721)
(smt.working :conflicts 97047 :agility 0.905852)
(smt.working :conflicts 98048 :agility 0.852772)
(smt.working :conflicts 99049 :agility 0.900472)
(smt.delete-inactive-clauses  :num-deleted-clauses 34910)
(smt.working :conflicts 100050 :agility 0.514238)
(smt.working :conflicts 101051 :agility 0.546675)
(smt.working :conflicts 102052 :agility 0.543412)
(smt.working :conflicts 103053 :agility 0.597256)
(smt.working :conflicts 104054 :agility 0.623697)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 105055 :agility 0.62609)
(smt.working :conflicts 106056 :agility 0.66263)
(smt.working :conflicts 107057 :agility 0.644847)
(smt.working :conflicts 108058 :agility 0.819234)
(smt.working :conflicts 109059 :agility 0.866376)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 110060 :agility 0.767201)
(smt.working :conflicts 111061 :agility 0.804975)
(smt.working :conflicts 112062 :agility 0.671622)
(smt.working :conflicts 113063 :agility 0.824579)
(smt.working :conflicts 114064 :agility 0.729278)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 115065 :agility 0.807862)
(smt.working :conflicts 116066 :agility 0.723416)
(smt.working :conflicts 117067 :agility 0.668074)
(smt.working :conflicts 118068 :agility 0.627707)
(smt.working :conflicts 119069 :agility 0.7009)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 120070 :agility 0.670041)
(smt.working :conflicts 121071 :agility 0.848055)
(smt.working :conflicts 122072 :agility 0.789138)
(smt.working :conflicts 123073 :agility 0.857482)
(smt.working :conflicts 124074 :agility 0.550441)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 125075 :agility 0.537444)
(smt.working :conflicts 126076 :agility 0.758856)
(smt.working :conflicts 127077 :agility 0.780121)
(smt.working :conflicts 128078 :agility 0.726748)
(smt.working :conflicts 129079 :agility 0.756206)
(smt.delete-inactive-clauses  :num-deleted-clauses 0)
(smt.working :conflicts 130080 :agility 0.663708)
(smt.working :conflicts 131645 :agility 0.808037)
(smt.working :conflicts 132646 :agility 0.0535712)
(smt.working :conflicts 133647 :agility 0.13076)
(smt.working :conflicts 134648 :agility 0.124053)
(smt.delete-inactive-clauses  :num-deleted-clauses 44359)
(smt.working :conflicts 135649 :agility 0.0980247)
(smt.stats    1 135664 349337 4139955 148416/86143/41458    3828/59     3 36218  668.58)
(smt.searching)
(smt.stats :restarts     :decisions :clauses/bin/units   :simplify  :memory)
(smt.stats        :conflicts    :propagations    :lemmas      :deletions   )
(smt.stats    0      0      0      0   328/0/0       0/0     0    0  672.16)
(smt.stats    0     13     33    453   259/0/0      11/6     1   84  672.16)
(smt.searching)
(smt.stats    0     13     33    453   247/0/0      11/6     1   84  673.30)
(smt.stats    0     13     39    509   247/0/0      11/6     1   84  673.30)
(smt.stats    1 135664 349337 4139955 148416/86143/41458    3828/59     3 36218  673.30)
(smt.working :conflicts 136664 :agility 0.14857)
(smt.working :conflicts 137665 :agility 0.0379178)
unsat
(:added-eqs               1785617
 :array-ax1               6
 :array-ax2               5295
 :binary-propagations     6502231
 :bv-bit2core             572203
 :bv-diseqs               118
 :bv-dynamic-diseqs       206
 :bv-dynamic-eqs          4912
 :bv->core-eq             1291822
 :conflicts               138474
 :datatype-accessor-ax    6
 :datatype-constructor-ax 5
 :decisions               786758
 :del-clause              3207447
 :elim-unconstrained      7
 :final-checks            1
 :max-generation          2
 :max-memory              858.08
 :memory                  76.91
 :minimized-lits          118315
 :mk-bool-var             2095606
 :mk-clause               3273224
 :mk-clause-binary        88206
 :num-allocs              33405221062.00
 :num-checks              1
 :propagations            14669129
 :quant-instantiations    21170
 :restarts                2
 :rlimit-count            36520592
 :solve-eqs-elim-vars     618
 :solve-eqs-steps         1001
 :time                    24.90
 :total-time              24.92)

Clearly something diverging significantly here...

@NikolajBjorner
Copy link
Contributor

In a nutshell it is due to C/C++ semantics.
With kids home for holidays I got access to a Mac.
When adjusting arguments to a function call here: 85d3041
the Mac version behaves the same as the other platforms.

I have yet to figure out why the old version diverged. Worth forcing.
But on another note. You should probably configure smt.relevancy=0 for quantifiers+bitvectors+arrays.

@NikolajBjorner
Copy link
Contributor

Debugging notes:

  1. Use z3 p.smt2 -v:10 trace=true, break after search starts.

  2. Save z3.log into z3.log.platform, where platform = mac or windows or linux etc.

  3. Run a simple comparison, such as:

with open("z3.log.windows", 'r') as in1:
   with open("z3.log.mac", 'r') as in2:
      lines1 = in1.readlines()
      lines2 = in2.readlines()
      i = 0
      diff = 0
      while True:
         line1 = lines1[i]
         line2 = lines2[i]
         if line1 != line2:
            print(i, line1)
            print(i, line2)
            diff += 1
            if diff > 5000:
               break
         i += 1

  1. Instrument z3 to break at the first difference.

@rod-chapman
Copy link
Author

Many thanks for the continued investigation.

What does smt.relevancy actually do? The "-pd" output doesn't shed much light on it. How would I know when to turn that on?

@NikolajBjorner
Copy link
Contributor

@NikolajBjorner
Copy link
Contributor

@rod-chapman
Copy link
Author

OK... as a "user" I do not expect to read any documentation that has the word "internals" in the title. Could you translate that back into English that a user could understand, and then be able to decide what switches to use, based solely on the properties of the source language that I'm dealing with (in this case, SPARK Ada, but it could equally be C) ?

@NikolajBjorner
Copy link
Contributor

I expect that once you start using bit-vectors with "lame" quantifier usage, then set smt.relevancy=0.
Lame quantifiers means that you don't expect quantifiers to be instantiated differently depending on some branch condition from a verification conditions. A certain boogie dancing user has a way to generate formulas where quantifier instantiations are sensitive to encoding of a CFG and we found that limiting quantifier instantiations to just the branch being explored by the solver was a benefit. This means that z3 should behave as a semantic tableau solver where it brings in just the amount of necessary literals from a prime implicant to check against quantifiers.

When you start using bit-vectors, however, which wasn't in boogie's vocabulary, the emphasis shifts towards being able to instantiate lemmas early and eagerly.

@rod-chapman
Copy link
Author

You're still talking a foreign language to me... there are no "bit vectors" in my source code. There are integers (signed, but all have range constraints), modular integers (although the modulus is 3329, which is prime, not some convenient power of 2), arrays thereof, and lots of quantification over those arrays.

I tried smt.relevancy=0 on all 621 SMT files that arise from LibMLKEM. The "unsat" count goes up from 591 to 594, so +3 better, with no regressions.

@rod-chapman
Copy link
Author

FWIW, the VCG generating these SMT files is Why3, not Boogie...

@NikolajBjorner
Copy link
Contributor

still close to boogie: why+ecma ~ ymca

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants