Effective type for dynamically allocated memory #732

Open
sim642 opened this issue May 10, 2022 · 0 comments
Labels
cleanup Refactoring, clean-up


Problem

Infamously, our varinfos for dynamically allocated memory are created with void type, because at the point of malloc/etc. we don't yet know which type the values assigned to it will have. That is only determined at the point of assignment.

This creates the need for a notorious pile of hacks, including but not limited to the following:

  1. Address casts copy the varinfo with an updated type (a problem for a clean implementation of #731, "Handle renaming of local variables in incremental analysis (AST)"):
    | Addr ({ vtype = TVoid _; _} as v, offs) when not (Cilfacade.isCharType t) ->
      (* we had no information about the type (e.g. malloc), so we add it;
         ignore for casts to char* since they're special conversions (N1570 6.3.2.3.7) *)
      Addr ({ v with vtype = t }, offs)
      (* HACK: equal varinfo with different type, causes inconsistencies down the line,
         when we again assume vtype being "right", but joining etc gives no consideration
         to which type version to keep *)
  2. Calculating types of offsets of alloc variables fails (because you cannot have offsets on void values), so we have fallbacks like this:

    analyzer/src/analyses/base.ml

    Lines 1141 to 1152 in 35f5b9e

    if a.f (Q.IsHeapVar x) then
      (* the vtype of heap vars will be TVoid, so we need to trust the pointer we got to this to be of the right type *)
      (* i.e. use the static type of the pointer here *)
      lval_type
    else
      try
        Cilfacade.typeOfLval (Var x, cil_offset)
      with Cilfacade.TypeOfError _ ->
        (* If we cannot determine the correct type here, we go with the one of the LVal *)
        (* This will usually lead to a type mismatch in the ValueDomain (and hence supertop) *)
        M.warn "Cilfacade.typeOfLval failed Could not obtain the type of %a" d_lval (Var x, cil_offset);
        lval_type

    There are other places with TypeOfError as well.
  3. Calculating types of address domain elements has a particularly weird fallback:
    let get_type_addr (v,o) = try type_offset v.vtype o with Type_offset (t,_) -> t
    let get_type = function
      | Addr (x, o) -> get_type_addr (x, o)
      | StrPtr _ -> charPtrType (* TODO Cil.charConstPtrType? *)
      | NullPtr -> voidType
      | UnknownPtr -> voidPtrType

    get_type_addr ((alloc@...), Field(x, NoOffset)) gives void: the void from v.vtype ends up in the Type_offset exception, and the handler returns it instead of making some kind of assumption and using the field's type. That is, the type reported for an address with offsets might be the type of the variable without the offsets. This can only cause further mismatches between types and abstract values, which in turn makes other type lookups fail as well.
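
The third fallback can be reproduced in isolation. The following is a minimal sketch with toy stand-ins for the CIL types: `typ`, `offset` and `node` are hypothetical simplifications (fields are plain strings here, and `get_type_addr` takes the variable's type directly rather than a varinfo). Only the shape of the `try ... with Type_offset (t, _) -> t` handler mirrors the excerpt above.

```ocaml
(* Toy stand-ins for CIL types and offsets; NOT Goblint's real definitions. *)
type typ =
  | TVoid
  | TInt
  | TPtr of typ
  | TStruct of (string * typ) list  (* (field name, field type) pairs *)

type offset =
  | NoOffset
  | Field of string * offset

exception Type_offset of typ * string

(* Walk an offset through a type; raise if the type cannot carry the offset. *)
let rec type_offset (t : typ) (o : offset) : typ =
  match t, o with
  | _, NoOffset -> t
  | TStruct fields, Field (f, o') ->
    (match List.assoc_opt f fields with
     | Some ft -> type_offset ft o'
     | None -> raise (Type_offset (t, "no such field")))
  | _, Field _ -> raise (Type_offset (t, "field offset on non-struct type"))

(* Same shape as the quoted fallback: on failure, return whatever type the
   exception carries, i.e. the type at which the walk got stuck. *)
let get_type_addr (vtype, o) =
  try type_offset vtype o with Type_offset (t, _) -> t

let node = TStruct [ ("value", TInt); ("next", TPtr TVoid) ]

(* Declared variable of struct type: the field's type is found. *)
let declared = get_type_addr (node, Field ("value", NoOffset))

(* Allocated variable typed void: the offset is silently dropped. *)
let allocated = get_type_addr (TVoid, Field ("value", NoOffset))
```

In this toy model `declared` is `TInt`, while `allocated` is `TVoid` even though the address points at a field, which is exactly the type/value mismatch described above.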

It's somewhat of a miracle that all these hacks manage to sweep the problem under multiple layers of carpet and still have anything reasonable come out for blobs.

Solution

The standard defines (N1570 6.5, paragraph 6):

The effective type of an object for an access to its stored value is the declared type of the object, if any.[1] If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

Given that the C standard has the notion of effective type precisely to describe how dynamically allocated memory gets its type from a subsequent assignment, we should adopt the same notion in the analyzer to avoid all the weird hacks above. Blobs could simply have an extra field for the effective type, and any offset type lookup on a Blob should take the effective type into account instead of erroring and falling back to something arbitrary.
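
To make the proposal concrete, here is a rough sketch of what a blob with an effective type could look like. Everything in it is an assumption for illustration: the `blob` record, `fresh`, `store` and `blob_type` are hypothetical names, not the analyzer's existing Blob domain, and the store rule is a simplification of the quoted paragraph (memcpy/memmove and character-array copies are not modelled).

```ocaml
(* Hypothetical toy model; none of these names are the analyzer's actual API. *)
type typ = TVoid | TChar | TInt | TStruct of (string * typ) list

type offset = NoOffset | Field of string * offset

(* A blob for allocated memory: an abstract value plus an optional effective type. *)
type 'v blob = {
  value : 'v;
  effective : typ option;  (* None until the first non-character store *)
}

(* Freshly allocated memory has no declared type and no effective type yet. *)
let fresh (v : 'v) : 'v blob = { value = v; effective = None }

(* Store through an lvalue of type [lval_type]: a non-character lvalue type
   becomes the effective type; a character-typed store leaves it unchanged
   (a simplification of the standard's wording). *)
let store (b : 'v blob) (lval_type : typ) (v : 'v) : 'v blob =
  match lval_type with
  | TChar -> { b with value = v }
  | t -> { value = v; effective = Some t }

(* Offset lookup that returns None instead of raising. *)
let rec type_offset (t : typ) (o : offset) : typ option =
  match t, o with
  | _, NoOffset -> Some t
  | TStruct fields, Field (f, o') ->
    Option.bind (List.assoc_opt f fields) (fun ft -> type_offset ft o')
  | _, Field _ -> None

(* Type lookup on a blob: consult the effective type first; if none is
   recorded (or the offset does not fit it), use the type of the accessing
   lvalue, as in the "all other accesses" case. *)
let blob_type (b : 'v blob) (o : offset) ~(access_type : typ) : typ =
  match b.effective with
  | Some t ->
    (match type_offset t o with
     | Some t' -> t'
     | None -> access_type)
  | None -> access_type

(* Usage: after a store through a struct-typed lvalue, field lookups work. *)
let point = TStruct [ ("x", TInt); ("y", TInt) ]
let b = store (fresh 0) point 1
let ty = blob_type b (Field ("x", NoOffset)) ~access_type:TVoid
```

With this shape, the lookup for field `x` yields `TInt` from the recorded effective type rather than from a void varinfo, and the varinfo itself never needs to be copied with a different vtype.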

Footnotes

  1. Allocated objects have no declared type.

@sim642 sim642 added the cleanup Refactoring, clean-up label May 10, 2022