Improvement in check_axioms method of the string solver #1241

romainbrenguier · 2017-08-15T16:23:00Z

This improves the check_axioms method of the string solver by padding the result obtained from the solver, in the same way the concretize method does.
In practice this means that the solver can more quickly find strings that are correct solutions once concretized.

smowton · 2017-08-16T09:37:03Z

src/solvers/refinement/string_refinement.h

Since you're using std::size_t at the callsite, might as well do the same here.

smowton · 2017-08-16T09:40:45Z

src/solvers/refinement/string_refinement.h

Given the line below I guess this must be *(j-1) not (*j)-1. Add parens to indicate which is intended, and note that this is a bad deref on the first iteration if it's *(j-1)
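To illustrate the precedence point, here is a minimal sketch. It assumes `j` is a random-access iterator (the thread does not show its type), and both helper names are hypothetical. Unary `*` binds tighter than binary `-`, so `*j-1` reads the current element and subtracts one, while `*(j-1)` reads the previous element and is only valid when `j` is not `begin()`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: `*j-1` parses as `(*j)-1`, i.e. the current value
// minus one. The explicit parentheses make that reading visible.
inline unsigned value_minus_one(std::vector<unsigned>::const_iterator j)
{
  return (*j)-1;
}

// Hypothetical helper: `*(j-1)` reads the element before `j`. There is no
// element before begin(), so that case is rejected up front.
inline unsigned previous_value(
  const std::vector<unsigned> &v,
  std::vector<unsigned>::const_iterator j)
{
  assert(j!=v.begin()); // dereferencing begin()-1 would be undefined
  return *(j-1);
}
```

With `v={10, 20, 30}` and `j` pointing at `20`, the first helper yields 19 and the second yields 10.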

smowton · 2017-08-16T09:46:14Z

src/solvers/refinement/string_refinement.h

Add an invariant to check the concrete_array size.

smowton · 2017-08-16T09:55:55Z

src/solvers/refinement/string_refinement.cpp

Rename this, vector looks too much like the type.

smowton · 2017-08-16T10:18:22Z

src/solvers/refinement/string_refinement.cpp

As I understand, this will turn e.g. { 1, x, 2, x, 3 } into { 1, 2, 2, 3, 3 } ? If so then that's not really padding (padding would make it longer). How about fill_in_vector?

Also is it right that the vector continues to have the extra default entry in the return value?

smowton · 2017-08-16T10:21:22Z

src/solvers/refinement/string_refinement.cpp

Rename this after what it does, e.g. rewrite_array_expressions.

what about concretize_array_expression?

smowton · 2017-08-16T10:23:00Z

src/solvers/refinement/string_refinement.cpp

sp literally

smowton · 2017-08-16T10:23:37Z

src/solvers/refinement/string_refinement.cpp

Give a concrete example of the sort of change this produces (e.g. array_of(0) with 1 := 2 -> { 0, 2, 0} ?)

smowton · 2017-08-16T10:24:50Z

src/solvers/refinement/string_refinement.cpp

I don't follow the end-of-line comment? Please elaborate

smowton · 2017-08-16T10:25:45Z

src/solvers/refinement/string_refinement.cpp

At least comment on the kind of difference that can currently occur

Actually the statement was wrong: it should have been index.type()==with_expr.where().type(). I fixed this and now it works.

allredj · 2017-08-15T16:36:09Z

src/solvers/refinement/string_refinement.cpp

not constraint -> no constraint

allredj · 2017-08-16T10:08:48Z

src/solvers/refinement/string_refinement.h

This introduces n · log(n) complexity but it should not be a problem if the strings are limited in length. What do you think?

Actually I implemented a new version that works with map and should not have this problem. I changed the code to use the new version. Only one usage remains in concretize_strings but we should normally not have to use concretize_strings since it's not called when the model checked is correct.

allredj · 2017-08-16T10:10:32Z

src/solvers/refinement/string_refinement.cpp

interprete_solver_result -> interpret_solver_result?

changed to concretize_array_expression

allredj · 2017-08-16T10:12:37Z

src/solvers/refinement/string_refinement.cpp

Maybe not necessary to declare a new variable.

allredj · 2017-08-16T10:17:43Z

src/solvers/refinement/string_refinement.cpp

@LAJW Does this loop look alright? Or is there a better way of calling the function on the operands?

allredj · 2017-08-16T12:24:20Z

src/solvers/refinement/string_refinement.cpp

Could you document what we do in this loop?

allredj · 2017-08-16T12:25:57Z

src/solvers/refinement/string_refinement.cpp

!error is not necessary here because of the precondition above.

allredj · 2017-08-16T12:28:25Z

src/solvers/refinement/string_refinement.cpp

+2 is not in keeping with the comment above.

Why are we using long here?

allredj · 2017-08-16T13:48:41Z

src/solvers/refinement/string_refinement.cpp

This might produce a huge nesting of expressions. See if we can replace it with a fixed-size array in the case where we have a non-empty index set (which should include the last element).

allredj

Looks good. Some minor comments.

allredj · 2017-08-17T11:28:05Z

src/solvers/refinement/string_refinement.cpp

Remove added line.

allredj · 2017-08-17T11:29:55Z

src/solvers/refinement/string_refinement.cpp

The else case is not clear to me. Can we put asserts on the kind of object we get from the solver here?

allredj · 2017-08-17T11:30:41Z

src/solvers/refinement/string_refinement.cpp

I think you need to break after the first (.

allredj · 2017-08-17T11:34:17Z

src/solvers/refinement/string_refinement.cpp

Also say here that we return a with expression.

It now returns an array expression instead of a with expression.

allredj · 2017-08-17T11:35:33Z

src/solvers/refinement/string_refinement.cpp

If the returned array is a with_exprt, it should be stated as such here, instead of something of the form { 24, 24, 24, 42, 42 } which implies the function returns an ID_array.

This should indeed be an ID_array expression and with_exprt is expected at the input so I think the documentation correctly reflects the intention of the function

Oh yes right!

allredj · 2017-08-17T11:48:42Z

src/solvers/refinement/string_refinement.cpp

Same comment as above. Make it clear that the nature of the array doesn't change.

it does change now

allredj · 2017-08-17T11:51:36Z

src/solvers/refinement/string_refinement.cpp

Can we add more assertions on the nature of the input array?

Actually the input can be any expression; we will just replace the arrays inside it that need to be replaced.
I can update the documentation to make that clearer

Oh OK, then the function could maybe be renamed. It looks like it takes array expressions.

yes I suggest concretize_arrays_in_expression

allredj · 2017-08-17T11:54:42Z

src/solvers/refinement/string_refinement.cpp

Does it make sense to have unit tests for that function?

It certainly does

So are we adding one? :-)

I've added one for concretize_string which should cover this function https://github.com/diffblue/cbmc/pull/1241/files#diff-da82b969a8fb1d04473386fa029dcc4fR17

allredj · 2017-08-17T12:44:39Z

src/solvers/refinement/string_refinement.h

argument should be const

I think a unit test would be useful here.

I added a unit test for concretize_array_expression which should cover this function.

allredj · 2017-08-17T12:51:13Z

src/solvers/refinement/string_refinement.h

leftmost index_to pad -> leftmost_index_to_pad or leftmost index to pad

smowton

I think there are still problems with the pad functions

smowton · 2017-08-18T09:01:53Z

src/solvers/refinement/string_refinement.cpp

    {
      exprt index=arr_val.operands()[i*2];
      unsigned idx;
      if(!to_unsigned_integer(to_constant_expr(index), idx))


Braces around multi-line if

smowton · 2017-08-18T09:08:28Z

src/solvers/refinement/string_refinement.cpp

+    const with_exprt with_expr=to_with_expr(it);
+    const exprt &then_expr=with_expr.new_value();
+    mp_integer index;
+    PRECONDITION(to_with_expr(it).where().id()==ID_constant);


This is redundant (to_constant_expr already asserts this)

smowton · 2017-08-18T09:13:49Z

src/solvers/refinement/string_refinement.cpp

+/// \param expr: expression to interpret
+/// \param string_max_length: maximum size of arrays to consider
+/// \return the interpreted expression
+exprt concretize_arrays_in_expression(


This function is probably a good place to use @LAJW's new tree-walking copy-minimising expression iterator: https://github.com/diffblue/cbmc/blob/test-gen-support/src/util/expr_iterator.h

Basically it walks over the tree using const iterators, then when you find an ID_index node with an ID_with child you call .mutate() to get a writable iterator for that particular subexpression, cf. using non-const operands() below which will break all sharing even when there are no changes necessary.

I've done that, it required implementing a new operation in expression iterator, because we want to skip to the next sibling once we changed an expression as it does not make sense to visit children of the old expression.
We also added unit tests for this next_sibling_or_parent operation.
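The "skip the children of a replaced node" behaviour can be sketched on a toy tree. This is only an illustration under stated assumptions: `nodet`, `concretize` and `rewrite_with_nodes` are hypothetical names, the tree is not `exprt`, and plain recursion stands in for the depth iterator with `mutate()` and `next_sibling_or_parent()`. It does not model the copy-minimising sharing that the real iterator preserves.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy expression node; a stand-in for exprt, not the CBMC type.
struct nodet
{
  std::string id;
  std::vector<nodet> operands;
};

// Hypothetical replacement: turn a "with" node into a leaf "array" node.
inline nodet concretize(const nodet &)
{
  return nodet{"array", {}};
}

// Depth-first rewrite. Once a node has been replaced we return immediately
// instead of descending, because the operands of the old node are gone.
// Returns the number of nodes that were replaced.
inline std::size_t rewrite_with_nodes(nodet &node)
{
  if(node.id=="with")
  {
    node=concretize(node);
    return 1; // move on to the next sibling (or back to the parent)
  }
  std::size_t replaced=0;
  for(nodet &op : node.operands)
    replaced+=rewrite_with_nodes(op);
  return replaced;
}
```

A `with` node nested inside another `with` node is therefore never visited: only the outer one is counted and replaced.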

smowton · 2017-08-18T09:22:45Z

src/solvers/refinement/string_refinement.h

-  T1 last_concretized) const
+/// \param initialized: set containing the indices of already concrete values
+template <typename T>
+void fill_in_vector(


Instead of repeated .find calls here, use an explicit iterator and *std::prev(it) to get the previous initialized index.

I changed that now. I was thinking the set was unordered, but it is in fact ordered.
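A minimal sketch of the iterator-based approach, assuming the indices live in an ordered `std::set<std::size_t>`. The name `fill_in_vector_sketch` and its exact signature are hypothetical and do not reproduce the function in the PR. Each gap is filled with the value at the next initialized index to its right, and `*std::prev(it)` gives the previous initialized index without a second lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

// Fill every slot that lies between two initialized indices (or before the
// first one) with the value found at the initialized index to its right.
// Slots after the last initialized index are left untouched.
template <typename T>
void fill_in_vector_sketch(
  std::vector<T> &concrete_array,
  const std::set<std::size_t> &initialized)
{
  for(auto it=initialized.begin(); it!=initialized.end(); ++it)
  {
    assert(*it<concrete_array.size());
    // std::set iterates in increasing order, so the previous element is
    // the closest initialized index on the left.
    const std::size_t first_to_fill=
      it==initialized.begin()?0:*std::prev(it)+1;
    for(std::size_t j=first_to_fill; j<*it; ++j)
      concrete_array[j]=concrete_array[*it];
  }
}
```

On the example from the earlier comment, `{ 1, x, 2, x, 3 }` with initialized indices `{0, 2, 4}` becomes `{ 1, 2, 2, 3, 3 }`.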

smowton · 2017-08-18T09:33:41Z

src/solvers/refinement/string_refinement.h

+    const std::size_t leftmost_index_to_pad=
+      pair!=initial_map.rend()?pair->first:0;
+    // pad down to the leftmost index to pad
+    for(std::size_t j=i; j+1!=leftmost_index_to_pad; j--)


? This pads down to leftmost_index_to_pad-1, doesn't it?

If "down to" is inclusive then it should be correct, as the last value of j for which the inequality holds is leftmost_index_to_pad.

Ah yes, sorry. Nonetheless, is inclusive correct? This means if we have adjacent map entries (e.g. 0 -> 10, 1 -> 20, 2 -> 30) then we'll write 30 -> {1, 2} then 20 -> {0, 1} then 10 -> {0}, which gives the correct result but wastefully, and in a way that seems error-prone.

Yes, you are right: the value of leftmost_index_to_pad should be the next index +1. I've corrected that.
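For reference, a sketch of a map-based fill that writes every slot exactly once, so adjacent entries such as 0 -> 10, 1 -> 20, 2 -> 30 cause no overlapping writes. This is an assumption-laden illustration: `fill_in_map_as_vector_sketch` is a hypothetical name, it walks the map in increasing order rather than in reverse as the code under review does, and it requires `T` to be default-constructible.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Build a vector whose length is (largest key + 1). Every index up to and
// including a key receives that key's value unless a smaller key already
// claimed it, which amounts to propagating each value to the left.
template <typename T>
std::vector<T> fill_in_map_as_vector_sketch(
  const std::map<std::size_t, T> &initial_map)
{
  std::vector<T> result;
  if(initial_map.empty())
    return result;
  result.resize(initial_map.rbegin()->first+1);
  std::size_t next_to_fill=0;
  for(const auto &pair : initial_map)
  {
    // Slots in [next_to_fill, pair.first] have not been written yet.
    for(std::size_t j=next_to_fill; j<=pair.first; ++j)
      result[j]=pair.second;
    next_to_fill=pair.first+1;
  }
  return result;
}
```

With the entries 1 -> 5 and 4 -> 7 this yields `{5, 5, 7, 7, 7}`.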

smowton · 2017-08-18T11:12:32Z

src/solvers/refinement/string_refinement.cpp

+/// \return the interpreted expression
+exprt concretize_arrays_in_expression(exprt expr, std::size_t string_max_length)
+{
+  for(auto op=expr.depth_begin(); op!=expr.depth_end(); ++op)


++op must only happen if next_sibling_or_parent was not called. Remove it from here and add else ++op below.

smowton

Looks good, one comment to add

smowton · 2017-08-19T11:35:09Z

src/solvers/refinement/string_refinement.h

Comment something like: Condition would be j>=leftmost_index_to_pad, but it's an unsigned type and leftmost_index_to_pad might be 0.
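The reason for the `j+1!=leftmost_index_to_pad` form can be shown in isolation. A minimal sketch follows; `pad_down` is a hypothetical helper, not the function from the PR. With an unsigned `j`, the condition `j>=0` is always true, so a downward loop that must reach index 0 compares `j+1` against the lower bound and relies on well-defined unsigned wrap-around.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assign `value` to v[leftmost..i], both ends inclusive, iterating downward.
template <typename T>
void pad_down(
  std::vector<T> &v,
  std::size_t i,
  std::size_t leftmost,
  const T &value)
{
  assert(leftmost<=i && i<v.size());
  // The condition would be j>=leftmost, but j is unsigned and leftmost may
  // be 0, in which case j>=leftmost never becomes false. After j reaches 0
  // the decrement wraps around, and j+1 wraps back to 0, ending the loop.
  for(std::size_t j=i; j+1!=leftmost; --j)
    v[j]=value;
}
```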

romainbrenguier · 2017-08-20T20:37:06Z

@LAJW can you check my changes to next_sibling_or_parent ?

LAJW · 2017-08-20T20:40:41Z

src/util/expr_iterator.h

Decent.
Replace (*this)++ with ++(*this) to avoid unnecessary copies.

I knew I should ignore the standard and just forbid the post-increment operator.
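The copy that post-increment implies can be made visible with a toy iterator. This is a sketch only: `counting_iteratort` is hypothetical and unrelated to the class in expr_iterator.h. Post-increment has to return the old value, so it copies the iterator at least once; pre-increment returns a reference and copies nothing.

```cpp
#include <cassert>

// Toy iterator that counts how often it is copied.
struct counting_iteratort
{
  static int copies;
  int position=0;

  counting_iteratort()=default;
  counting_iteratort(const counting_iteratort &other):
    position(other.position)
  {
    ++copies;
  }

  // Pre-increment: advance in place, no copy.
  counting_iteratort &operator++()
  {
    ++position;
    return *this;
  }

  // Post-increment: must keep the old state around to return it.
  counting_iteratort operator++(int)
  {
    counting_iteratort old(*this);
    ++(*this);
    return old;
  }
};

int counting_iteratort::copies=0;
```

For an iterator that carries a stack of parent positions, as a depth-first expression iterator plausibly does, that extra copy is not free.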

allredj

Still looks good! Just a couple of suggestions.

allredj · 2017-08-21T11:40:57Z

src/solvers/refinement/string_refinement.cpp

  if(arr_val.id()=="array-list")
  {
-    std::set<unsigned> initialized;
+    std::map<std::size_t, exprt> initial_map;


Would it make sense to add: DATA_INVARIANT(arr_val.operands().size()%2==0)?

allredj · 2017-08-21T12:43:30Z

src/solvers/refinement/string_refinement.cpp

So are we adding one? :-)

allredj · 2017-08-21T16:07:46Z

src/solvers/refinement/string_refinement.cpp

+    if(op->id()==ID_with && op->type().id()==ID_array)
+    {
+      op.mutate()=fill_in_array_with_expr(*op, string_max_length);
+      op.next_sibling_or_parent();


I was quite suspicious about that thing but I now believe it's OK on the basis that we can safely interrupt the depth-first search when we find an array, as we should not find nested arrays after the solver run (even though we don't really check that).

allredj · 2017-08-21T16:16:26Z

src/solvers/refinement/string_refinement.h

-  std::vector<T1> &concrete_array,
-  std::set<T2> &initialized,
-  T1 last_concretized) const
+/// \param initialized: set containing the indices of already concrete values


Could we just modify this to:

/// \param initialized: set containing the indices of concrete_array at which we already have a concrete value

This makes it clear that concrete_array[*it] (below) should indeed be initialised.

Actually I think pad_array is no longer used so it can be removed. The function fill_in_map_as_array should be used instead.

romainbrenguier · 2017-08-22T09:54:53Z

unit/solvers/refinement/string_refinement/concretize_array.cpp

should be concretize

romainbrenguier · 2017-08-22T09:57:16Z

src/solvers/refinement/string_refinement.cpp

we should document what this function does

Already documented the template itself - T expr_cast template (just above that snippet).

romainbrenguier · 2017-08-22T13:37:10Z

src/solvers/refinement/string_refinement.h

I don't think you should make this change

peterschrammel · 2017-08-22T21:51:29Z

src/solvers/refinement/string_refinement.cpp

This should go into util/expr_utils

Yes, indeed, but I'm unsure whether that should go with this PR or with another (we've got a refactor incoming, and I have to stop adding changes at some point).

peterschrammel · 2017-08-22T21:51:55Z

src/solvers/refinement/string_refinement.cpp

This should use DATA_INVARIANT... (but see below).

This should throw. INVARIANT can be put in the catch section.

peterschrammel · 2017-08-22T21:53:47Z

src/solvers/refinement/string_refinement.cpp

This looks like a workaround to hide a bug in the solver.

That's exactly how it worked in the original - failure was used as a filter.

We cannot (for now) prevent the underlying solver from giving values to arrays at indexes that have no meaning for us (<0 and >max_string_length). These values should be ignored, but there is no bug here.
The clean solution would be to convert the expression to an mp_integer and then ignore it if it is negative or too big.

What would it take to implement that clean solution you're talking about? Could that be put in a refactoring task? Or can it perhaps be part of this one?

We'd need to cherry pick some commits.
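The "clean solution" described above could look roughly like the following sketch. All names are hypothetical; `long long` stands in for mp_integer, `char` for the element type, and the upper bound follows the wording of the comment (an index is dropped only when it is strictly greater than the maximum length).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Keep only the (index, value) entries whose index is meaningful for a
// string of bounded length. Entries with a negative index or an index
// above string_max_length are ignored rather than treated as errors.
inline std::map<std::size_t, char> keep_meaningful_indices(
  const std::vector<std::pair<long long, char>> &solver_entries,
  std::size_t string_max_length)
{
  std::map<std::size_t, char> result;
  for(const auto &entry : solver_entries)
  {
    if(entry.first<0)
      continue; // negative index: no meaning for us
    const auto index=static_cast<unsigned long long>(entry.first);
    if(index>string_max_length)
      continue; // too big: no meaning for us
    result[static_cast<std::size_t>(index)]=entry.second;
  }
  return result;
}
```

Filtering on a wide signed value first means no failed conversion or exception is needed to reject an entry.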

peterschrammel · 2017-08-22T21:59:36Z

src/solvers/refinement/string_refinement.cpp

Even if this is a genuine check (that cannot be avoided by fixing an alleged bug in the solver -- see below) I'd rather refactor this function to check_index (or similar) and make it return a boolean so that exceptions are not used as part of normal operation. Exceptions should be used for exceptional behaviour only.

We've got two problems: concealing errors because they're returned as error codes and lack of unified way of handling those. Returning error code without any guards is extremely error prone - it's very easy not to check for the error (there is a solution for that, but that should be another task). Also it's not always obvious whether false means success or true. I'm just trying to be consistent with std::dynamic_cast<ref> and boost::lexical_cast - they do throw on failure.

In this case throwing terminates the application, as exception is not handled. That call could be wrapped as invariant but overall result is the same. What I'd like to avoid is having 10 different versions of the "cast exprt to int" each forgetting to check different invariants and having its own set of bugs and each taking 5-10 lines of code and polluting the codebase.

I'd like to swap those true/false returns and exceptions with boost/std::optional and/or expected, but I'd prefer not to include them here.
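As a sketch of the optional-based alternative mentioned above (it requires C++17 for `std::optional`): the hypothetical `to_size_t` below parses a decimal string as a stand-in for "cast exprt to int". It is not the expr_cast template from the PR. The caller gets neither an exception nor an ambiguous boolean, and has to unwrap the result before using it.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <optional>
#include <string>

// Convert a string of decimal digits to std::size_t. Returns std::nullopt
// for empty input, for any non-digit character (including a sign), and
// when the value would not fit in std::size_t.
inline std::optional<std::size_t> to_size_t(const std::string &text)
{
  if(text.empty())
    return std::nullopt;
  const std::size_t max=std::numeric_limits<std::size_t>::max();
  std::size_t result=0;
  for(const char c : text)
  {
    if(c<'0' || c>'9')
      return std::nullopt;
    const std::size_t digit=static_cast<std::size_t>(c-'0');
    if(result>(max-digit)/10)
      return std::nullopt; // result*10+digit would overflow
    result=result*10+digit;
  }
  return result;
}
```

A call site then reads `if(const auto idx=to_size_t(s)) { /* use *idx */ }`, which keeps the failure case explicit.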

peterschrammel · 2017-08-22T22:00:57Z

src/solvers/refinement/string_refinement.cpp

add a comment that ++op; is intentionally omitted

peterschrammel · 2017-08-22T22:01:25Z

src/solvers/refinement/string_refinement.h

peterschrammel · 2017-08-22T22:01:52Z

src/solvers/refinement/string_refinement.h

break after = and put the whole thing on the next line

peterschrammel · 2017-08-23T11:23:52Z

@LAJW, the cosmetic changes do not touch any code created outside this PR. Please squash these changes into the respective commits in this PR that introduced the code that is being fixed.

romainbrenguier requested review from allredj and smowton August 15, 2017 16:23

romainbrenguier force-pushed the bugfix/check-axioms#874 branch from dbd7bad to f5a69a7 Compare August 16, 2017 06:07

smowton reviewed Aug 16, 2017

allredj suggested changes Aug 16, 2017

romainbrenguier requested a review from LAJW August 17, 2017 11:24

allredj suggested changes Aug 17, 2017

romainbrenguier requested review from allredj and smowton August 17, 2017 21:04

romainbrenguier force-pushed the bugfix/check-axioms#874 branch from 232de69 to d44a061 Compare August 18, 2017 08:03

romainbrenguier mentioned this pull request Aug 18, 2017

String max input length option #1235

Merged

romainbrenguier force-pushed the bugfix/check-axioms#874 branch from d44a061 to 884c1ae Compare August 18, 2017 08:58

smowton suggested changes Aug 18, 2017

smowton reviewed Aug 18, 2017

romainbrenguier requested a review from smowton August 18, 2017 12:54

smowton approved these changes Aug 19, 2017

tautschnig assigned romainbrenguier Aug 20, 2017

romainbrenguier force-pushed the bugfix/check-axioms#874 branch from 062f445 to 6b4675b Compare August 20, 2017 18:16

romainbrenguier force-pushed the bugfix/check-axioms#874 branch from 6b4675b to e58d4da Compare August 20, 2017 20:38

LAJW suggested changes Aug 20, 2017

romainbrenguier force-pushed the bugfix/check-axioms#874 branch 2 times, most recently from d7f4fff to 0161397 Compare August 21, 2017 10:39

LAJW force-pushed the bugfix/check-axioms#874 branch from bc21958 to a697c62 Compare August 21, 2017 16:32

allredj approved these changes Aug 21, 2017

romainbrenguier commented Aug 22, 2017

romainbrenguier commented Aug 22, 2017

romainbrenguier assigned peterschrammel Aug 22, 2017

tautschnig changed the base branch from test-gen-support to develop August 22, 2017 12:15

LAJW approved these changes Aug 22, 2017

romainbrenguier commented Aug 22, 2017

LAJW force-pushed the bugfix/check-axioms#874 branch from bb0f902 to b4fd43f Compare August 22, 2017 14:43

peterschrammel requested changes Aug 22, 2017

LAJW force-pushed the bugfix/check-axioms#874 branch 4 times, most recently from 7610c58 to 1576d2a Compare August 24, 2017 09:24

romainbrenguier force-pushed the bugfix/check-axioms#874 branch from f4cd015 to 90e7dbb Compare August 29, 2017 09:15

romainbrenguier added 12 commits August 29, 2017 10:28

Setting string-max-length for several tests

9a8c063

CBMC can potentially run out of memory if no maximum string length is set. This happens more often with the new version of check_axioms because a concretization step is made to be more precise in the check.

Adding a next_sibling_or_parent iterator

f255abb

This allows skipping the children of a node during iteration. It is useful when substituting an expression with another one over which we do not want to iterate.

Adding unit tests for next_sibling_or_parent of an iterator

5d3771a

Better debug info

df61ceb

Function for conversion from expression to size_t

de03d9c

Replace pad_vector by fill_in_map_as_vector

f042ff7

This function takes a map as argument which is more appropriate for the operation to perform. This also refactors `concretize_string` and `get_array` to use fill_in_map_as_vector

Adding return check in substitute_array_with_expr

1fd5bb2

Add a function that fill an array represented by a with expression

88c664c

This interprets the result by propagating the values that are set towards the left.

Remove useless line break

0b9811b

Add a function interpreting arrays inside an expression

e18a6bd

This interprets them in a way that makes sense to the string solver, by propagating values to the left.

Concretize arrays in axioms when string solver checks axioms

2c7f23c

This makes the string solver more likely to come up with a solution to the formulas that are passed to it.

Adding unit test for concretize_arrays_in_expression

8f51b01

romainbrenguier force-pushed the bugfix/check-axioms#874 branch from 90e7dbb to 8f51b01 Compare August 29, 2017 10:11

peterschrammel approved these changes Aug 29, 2017

kroening merged commit df16159 into diffblue:develop Aug 29, 2017

tautschnig mentioned this pull request Nov 8, 2017

Move expr_cast from refinement to util/convert_expr.h #1572

Merged
