Skip to content

Conversation

@romainbrenguier
Copy link
Contributor

This improves the method check_axioms of the string solver by padding the result gotten from the solver in a similar way to what is used in the concretize method.
In practice this means that the solver can find more quickly strings that are correct solutions once concretized.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're using std::size_t at the callsite, might as well do the same here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the line below I guess this must be *(j-1) not (*j)-1. Add parens to indicate which is intended, and note that this is a bad deref on the first iteration if it's *(j-1)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an invariant to check the concrete_array size.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename this, vector looks too much like the type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand, this will turn e.g. { 1, x, 2, x, 3 } into { 1, 2, 2, 3, 3 } ? If so then that's not really padding (padding would make it longer). How about fill_in_vector?

Also is it right that the vector continues to have the extra default entry in the return value?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename this after what it does, e.g. rewrite_array_expressions)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about concretize_array_expression?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sp literally

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Give a concrete example of the sort of change this produces (e.g. array_of(0) with 1 := 2 -> { 0, 2, 0} ?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow the end-of-line comment? Please elaborate

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least comment on the kind of difference that can currently occur

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually the statement was wrong, it should have been index.type()==with_expr.where().type. I fixed this and now it works.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not constraint -> no constraint

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This introduces n · log(n) complexity but it should not be a problem if the strings are limited in length. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I implemented a new version that works with map and should not have this problem. I changed the code to use the new version. Only one usage remains in concretize_strings but we should normally not have to use concretize_strings since it's not called when the model checked is correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interprete_solver_result -> interpret_solver_result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to concretize_array_expression

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not necessary to declare a new variable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LAJW Does this loop look alright? Or is there a better way of calling the function on the operands?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you document what we do in this loop?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!error is not necessary here because of the precondition above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+2 is not in keeping with the comment above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we using long here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might produce a huge nesting of expressions. See if we can replace it with a fixed-size array in the case where we have a non-empty index set (which should include the last element).

@romainbrenguier romainbrenguier requested a review from LAJW August 17, 2017 11:24
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Some minor comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove added line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The else case is not clear to me. Can put asserts on the kind of object we get from the solver here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to break after the first (.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also say here that we return a with expression.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It now returns a array expression instead of with

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the returned array is a with_exprt, it should be stated as such here, instead of something of the form { 24, 24, 24, 42, 42 } which implies the function returns an ID_array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should indeed be an ID_array expression and with_exprt is expected at the input so I think the documentation correctly reflects the intention of the function

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes right!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above. Make it clear that the nature of the array doesn't change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it does change now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add more assertions on the nature of the input array?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the input expression should be any expression, we will just replace inside it the arrays that needs to be.
I can update the documentation to make that clearer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh OK, then the function could maybe be renamed. It looks like it takes array expressions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I suggest concretize_arrays_in_expression

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to have unit tests for that function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It certainly does

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So are we adding one? :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added one for concretize_string which should cover this function https://github.com/diffblue/cbmc/pull/1241/files#diff-da82b969a8fb1d04473386fa029dcc4fR17

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argument should be const

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a unit test would be useful here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added unit test for concretize_array_expression with should cover this function

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftmost index_to pad -> leftmost_index_to_pad or leftmost index to pad

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are still problems with the pad functions

{
exprt index=arr_val.operands()[i*2];
unsigned idx;
if(!to_unsigned_integer(to_constant_expr(index), idx))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Braces around multi-line if

const with_exprt with_expr=to_with_expr(it);
const exprt &then_expr=with_expr.new_value();
mp_integer index;
PRECONDITION(to_with_expr(it).where().id()==ID_constant);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is redundant (to_constant_expr already asserts this)

/// \param expr: expression to interpret
/// \param string_max_length: maximum size of arrays to consider
/// \return the interpreted expression
exprt concretize_arrays_in_expression(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is probably a good place to use @LAJW's new tree-walking copy-minimising expression iterator: https://github.com/diffblue/cbmc/blob/test-gen-support/src/util/expr_iterator.h

Basically it walks over the tree using const iterators, then when you find an ID_index node with an ID_with child you call .mutate() to get a writable iterator for that particular subexpression, cf. using non-const operands() below which will break all sharing even when there are no changes necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done that, it required implementing a new operation in expression iterator, because we want to skip to the next sibling once we changed an expression as it does not make sense to visit children of the old expression.
We also added unit tests for this next_sibling_or_parent operation.

T1 last_concretized) const
/// \param initialized: set containing the indices of already concrete values
template <typename T>
void fill_in_vector(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of repeated .find calls here, use an explicit iterator and *std::prev(it) to get the previous initialized index.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed that now, I was thinking the set was unordered but it is in fact

const std::size_t leftmost_index_to_pad=
pair!=initial_map.rend()?pair->first:0;
// pad down to the leftmost index to pad
for(std::size_t j=i; j+1!=leftmost_index_to_pad; j--)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

? This pads down to leftmost_index_to_pad-1, doesn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If "down to" is inclusive then it should be correct as the last where the inequality holds is leftmost_index_to_pad

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes sorry. Nontheless is inclusive correct? This means if we have adjacent map entries (e.g. 0 -> 10, 1 -> 20, 2 -> 30) then we'll write 30 -> {1, 2} then 20 -> {0, 1} then 10 -> {0}, which gives the correct result but wastefully, and in a way that seems error-prone

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are right the value of left_most_index_to_pad should be the next index +1, I've corrected that

/// \return the interpreted expression
exprt concretize_arrays_in_expression(exprt expr, std::size_t string_max_length)
{
for(auto op=expr.depth_begin(); op!=expr.depth_end(); ++op)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++op must only happen if next_sibling_or_child was not called. Remove from here and below add else ++op

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, one comment to add

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment something like Conditon would be j>=leftmost_index_to_pad, but it's an unsigned type and leftmost_index_to_pad might be 0.

@romainbrenguier
Copy link
Contributor Author

@LAJW can you check my changes to next_sibling_or_parent ?

@romainbrenguier romainbrenguier force-pushed the bugfix/check-axioms#874 branch from 6b4675b to e58d4da Compare August 20, 2017 20:38
Copy link
Contributor

@LAJW LAJW Aug 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decent.
Replace (*this)++ with ++(*this) to avoid unnecessary copies.

I knew I should ignore the standard and just forbid the post-increment operator.

@romainbrenguier romainbrenguier force-pushed the bugfix/check-axioms#874 branch 2 times, most recently from d7f4fff to 0161397 Compare August 21, 2017 10:39
@LAJW LAJW force-pushed the bugfix/check-axioms#874 branch from bc21958 to a697c62 Compare August 21, 2017 16:32
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still looks good! Just a couple of suggestions.

if(arr_val.id()=="array-list")
{
std::set<unsigned> initialized;
std::map<std::size_t, exprt> initial_map;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to add: DATA_INVARIANT(arr_val.operands().size%2==0)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So are we adding one? :-)

if(op->id()==ID_with && op->type().id()==ID_array)
{
op.mutate()=fill_in_array_with_expr(*op, string_max_length);
op.next_sibling_or_parent();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was quite suspicious about that thing but I now believe it's OK on the basis that we can safely interrupt the depth-first search when we find an array, as we should not find nested arrays after the solver run (even though we don't really check that).

std::vector<T1> &concrete_array,
std::set<T2> &initialized,
T1 last_concretized) const
/// \param initialized: set containing the indices of already concrete values
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we just be modify:

/// \param initialized: set containing the indices of concrete_array at which we already have a concrete value

This makes it clear that concrete_array[*it] (below) should indeed be initialised.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think pad_array is no longer used so it can be removed. The function fill_in_map_as_array should be used instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be concretize

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should document what this function does

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already documented the template itself - T expr_cast template (just above that snippet).

@tautschnig tautschnig changed the base branch from test-gen-support to develop August 22, 2017 12:15
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you should make this change

@LAJW LAJW force-pushed the bugfix/check-axioms#874 branch from bb0f902 to b4fd43f Compare August 22, 2017 14:43
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should go into util/expr_utils

Copy link
Contributor

@LAJW LAJW Aug 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, indeed, but I'm unsure whether that should go with this PR or with another (we've got refactor incoming, I have to stop adding changes at some point).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use DATA_INVARIANT... (but see below).

Copy link
Contributor

@LAJW LAJW Aug 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should throw. INVARIANT can be put in the catch section.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a workaround to hide a bug in the solver.

Copy link
Contributor

@LAJW LAJW Aug 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's exactly how it worked in the original - failure was used as a filter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot prevent (for now) the underlying solver to give value to arrays at indexes that have no meaning for us (<0 and >max_string_length). These values should be ignored but there is no bug here.
The clean solution would be to convert the expression to a mp_integer and then ignore it if it is negative or too big.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would it take to implement that clean solution you're talking about? Could that be put in a refactoring task? Or can it perhaps be part of this one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd need to cherry pick some commits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if this is a genuine check (that cannot be avoided by fixing an alleged bug in the solver -- see below) I'd rather refactor this function to check_index (or similar) and make it return a boolean so that exceptions are not used as part of normal operation. Exceptions should be used for exceptional behaviour only.

Copy link
Contributor

@LAJW LAJW Aug 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've got two problems: concealing errors because they're returned as error codes and lack of unified way of handling those. Returning error code without any guards is extremely error prone - it's very easy not to check for the error (there is a solution for that, but that should be another task). Also it's not always obvious whether false means success or true. I'm just trying to be consistent with std::dynamic_cast<ref> and boost::lexical_cast - they do throw on failure.

In this case throwing terminates the application, as exception is not handled. That call could be wrapped as invariant but overall result is the same. What I'd like to avoid is having 10 different versions of the "cast exprt to int" each forgetting to check different invariants and having its own set of bugs and each taking 5-10 lines of code and polluting the codebase.

I'd like to swap those true/false returns and exceptions with boost/std::optional and/or expected, but I'd prefer not to include them here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a comment that ++op; is intentionally omitted

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

length

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

break after = and put the whole thing on the next line

@peterschrammel
Copy link
Member

@LAJW, the cosmetic changes do not touch any code created outside this PR. Please squash these changes into the respective commits in this PR that introduced the code that is being fixed.

@LAJW LAJW force-pushed the bugfix/check-axioms#874 branch 4 times, most recently from 7610c58 to 1576d2a Compare August 24, 2017 09:24
@romainbrenguier romainbrenguier force-pushed the bugfix/check-axioms#874 branch from f4cd015 to 90e7dbb Compare August 29, 2017 09:15
Cbmc can potentialy run out of memory if no maximum string length is
set. This happens more often with the new version of check axioms
because a concretization step is made to be more precise in the check.
This allows to skip childrens of a node in iteration.
Useful when substituting an expression with another on which we do not
want to iterate.
This function takes a map as argument which is more appropriate for
the operation to perform.

This also refactors `concretize_string` and `get_array` to use fill_in_map_as_vector
This interpret the result by propagating values that are set, to the
left.
This interpret them in a way that makes sense to the string solver, by
propagating values to the left.
This makes the string solver more likely to come up with a solution to
the formulas that are passed to it.
@romainbrenguier romainbrenguier force-pushed the bugfix/check-axioms#874 branch from 90e7dbb to 8f51b01 Compare August 29, 2017 10:11
@kroening kroening merged commit df16159 into diffblue:develop Aug 29, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants