Effective type for dynamically allocated memory #732

sim642 · 2022-05-10T07:40:55Z

Problem

Infamously, our varinfos for dynamically allocated memory are created with void type, because at the point of malloc/etc. we don't yet know, which type the values assigned to it will have. This is only determined at the point of assignment.

This creates the need for a notorious pile of hacks, including but not limited to the following:

Address casts copying the varinfo with updated type (problem for nice implementation of Handle renaming of local variables in incremental analysis (AST) #731):

analyzer/src/cdomains/valueDomain.ml

Lines 332 to 333 in 35f5b9e

    
           | Addr ({ vtype = TVoid _; _} as v, offs) when not (Cilfacade.isCharType t) -> (* we had no information about the type (e.g. malloc), so we add it; ignore for casts to char* since they're special conversions (N1570 6.3.2.3.7) *) 
        
             Addr ({ v with vtype = t }, offs) (* HACK: equal varinfo with different type, causes inconsistencies down the line, when we again assume vtype being "right", but joining etc gives no consideration to which type version to keep *)

Calculating types of offsets of alloc variables fails (because you cannot have offsets on void values), so we have fallbacks like this:

analyzer/src/analyses/base.ml

Lines 1141 to 1152 in 35f5b9e

    
           if a.f (Q.IsHeapVar x) then 
        
             (* the vtype of heap vars will be TVoid, so we need to trust the pointer we got to this to be of the right type *) 
        
             (* i.e. use the static type of the pointer here *) 
        
             lval_type 
        
           else 
        
             try 
        
               Cilfacade.typeOfLval (Var x, cil_offset) 
        
             with Cilfacade.TypeOfError _ -> 
        
               (* If we cannot determine the correct type here, we go with the one of the LVal *) 
        
               (* This will usually lead to a type mismatch in the ValueDomain (and hence supertop) *) 
        
               M.warn "Cilfacade.typeOfLval failed Could not obtain the type of %a" d_lval (Var x, cil_offset); 
        
               lval_type

There are other places with TypeOfError as well.

Calculating types of address domain elements has a particularly weird fallback:

analyzer/src/cdomains/lval.ml

Lines 257 to 263 in 35f5b9e

    
           let get_type_addr (v,o) = try type_offset v.vtype o with Type_offset (t,_) -> t 
        
           let get_type = function 
        
             | Addr (x, o) -> get_type_addr (x, o) 
        
             | StrPtr _ -> charPtrType (* TODO Cil.charConstPtrType? *) 
        
             | NullPtr  -> voidType 
        
             | UnknownPtr -> voidPtrType

get_type_addr ((alloc@...), Field(x, NoOffset)) gives void, which gets into Type_offset from v.vtype instead of making some kind of assumption and using the field's type instead. That is, the type of an address with offsets might be the type of the variable (without the offsets). This can only cause further mismatches between types and abstract values, yielding to other type lookups also failing.

It's somewhat of a miracle how all these hacks manage to wipe the problem under multiple layers of carpet and have anything reasonable come out for blobs.

Solution

The standard defines (N1570 6.5.6):

The effective type of an object for an access to its stored value is the declared type of the object, if any.¹ If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

Given that the C standard has the notion of effective type exactly to describe how dynamically allocated memory gets its type from a subsequent assignment, we should adopt the same notion into the analyzer to avoid all the weird hacks above. Blobs could simply have an extra field for the effective type and any offset type lookups should take the effective type into count on Blobs, instead of erroring and falling back to whatnot.

Allocated objects have no declared type. ↩

The text was updated successfully, but these errors were encountered:

sim642 added the cleanup Refactoring, clean-up label May 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Effective type for dynamically allocated memory #732

Effective type for dynamically allocated memory #732

sim642 commented May 10, 2022

Effective type for dynamically allocated memory #732

Effective type for dynamically allocated memory #732

Comments

sim642 commented May 10, 2022

Problem

Solution

Footnotes