- 
                Notifications
    
You must be signed in to change notification settings  - Fork 286
 
Description
It appears the logic in value_set_dereferencet::build_reference_to can produce incorrect pointer_offset expressions, leading to wrong offsets in array accesses.
Example:
struct java_style_array {
  int length;
  void** data;
}data will sometimes be treated generically (as void**) and sometimes with a specific type (e.g. struct java.lang.String**.
Array accesses are currently generated using expressions like:
(java.lang.String*)*(array->data + 2) (pretty-printed as array->data[2])
Suppose the symex phase identifies the actual allocation stored in data as dynamic_1_array and the struct as a whole dynamic_object_2. Then in response to this access it will notice (a) the actual-allocation and reference types don't match, so it must use the byte-extract path at value_set_dereference.cpp:585, and (b) the actual referent is dynamic_1_array.
So far so good: we'd like to see it emit something like (java.lang.String*)dynamic_1_array[2] or less optimistically byte_extract(java.lang.String*, 2 * sizeof(java.lang.String*), dynamic_1_array).
Instead of that offset however we get pointer_offset(dynamic_object_2.data + 2). This is incorrect, as it evaluates to pointer_offset(dynamic_object_2.data) + (2 * sizeof(void*)) = 24, producing an off-by-one error indexing the array.
In its most general terms, I think the problem is that the value-set analysis can walk across object boundaries, thus mapping array.data[2] to the underlying allocation for data not array, but the pointer_offset expression built at value_set_dereference.cpp:590 does not take this into account. In this case it needed to follow the recursive walk over the expression performed by value_sett::get_value_set_rec et al and note that since we had:
array -> dynamic_object_2
array->data -> dynamic_1_array
array->data[2] -> dynamic_1_array
...then we need to output dynamic_1_array + 2 rather than what amounts to dynamic_1_array + (&array->data - array) + 2.
My test cases for this are based on my work in progress on Java arrays, so can't be run against master, but I will follow up with a C test-case if I can reproduce the same behaviour there.