Strongly update-able `calloc` blobs for more precise zstd analysis #702

sim642 · 2022-04-26T13:16:32Z

Problem

Our analysis of zstd ends up with unexpectedly many unknown pointers due to the following sequence of coincidences.

Thread pool and its queue

In zstd's POOL_create_advanced, the thread pool and its queue are allocated on the heap as follows (simplified):

ctx = (POOL_ctx*)ZSTD_customCalloc(sizeof(POOL_ctx), customMem);
ctx->queueSize = queueSize + 1;
ctx->queue = (POOL_job*)ZSTD_customMalloc(ctx->queueSize * sizeof(POOL_job), customMem);
ctx->queueHead = 0;
// ...

Here ctx and both alloc variables are in the local state. Thanks to #686 the alloc variables escape into global state correctly.

When jobs are added to the queue, their function pointers are nicely joined into the queue's alloc variable, which precisely contains known function pointers and no NULL or unknown pointer.

However later in the thread pool threads, they retrieve jobs from the queue using ctx->queue[ctx->queueHead], i.e. *(ctx->queue + ctx->queueHead) after CIL normalization. Confusingly, the function pointer in the returned job struct contains an additional unknown pointer, despite the queue's alloc variable not containing one.

Extensive invalidation

Turns out this is because ctx's alloc variable also has a NULL pointer possible for the queue field. When pointer arithmetic is applied (and queueHead is no longer constant either), the known alloc pointer is nicely kept, but the NULL pointer is replaced by an unknown pointer using this code:

analyzer/src/analyses/base.ml

Lines 208 to 214 in 3137936

    
           let default = function 
        
             | Addr.NullPtr when GobOption.exists (BI.equal BI.zero) (ID.to_int n) -> Addr.NullPtr 
        
             | _ -> Addr.UnknownPtr 
        
           in 
        
           match Addr.to_var_offset addr with 
        
           | Some (x, o) -> Addr.from_var_offset (x, addToOffset n (Some x.vtype) o) 
        
           | None -> default addr

Analyzing zstd, the unknown function pointer is absolutely disastrous effect:

An unknown function pointer call is analyzed by special to invalidate its arguments.
In zstd, one of the thread pool jobs is passed an argument, which inside it contains a pointer to the same thread pool ctx.
The resulting invalidation is extensive and invalidates:
- ctx alloc variable (so an unknown pointer is added to queue's address set directly as well),
- queue alloc variable (so NULL and unknown pointers are added to the job's function pointer and argument sets),
- all the argument structs for all the jobs (holding all of the relevant data) are totally invalidated.

Culprit

The culprit of all of this is the NULL pointer in the queue field. If we didn't have that there, none of this invalidation would have happened!
Tracking this down was a two-day job since the cyclic pointing structure means that everything ends up being invalidated and it's impossible to tell from the final results, where this imprecision originated from. Moreover, it's practically impossible to trace Goblint when the problematic analysis takes over 40 minutes and produces way too much tracing output (and not to mention being slowed down by all the I/O).

Solution

Ideally, Goblint should know that queue cannot be a NULL pointer, avoiding all of the above. The source of the NULL pointer in the first place is that ctx is allocated using calloc. But clearly that NULL pointer is overwritten before ctx is seen by any other code, so it should be avoidable.

Although simple on the surface, to be able to strongly update the queue field requires many big pieces to work together:

Uniqueness analysis for Juliet suite std_thread wrappers #186 is needed to know that we have a uniquely allocated blob, otherwise strong updates wouldn't be sound.

ValueDomain.update_offset must use that uniqueness information to replace the weak (joining) update with a strong (replacing) one here:

analyzer/src/cdomains/valueDomain.ml

Lines 900 to 904 in 3137936

    
           | `Blob (x,s,orig), _ -> 
        
             begin 
        
               let l', o' = shift_one_over l o in 
        
               let x = zero_init_calloced_memory orig x t in 
        
               mu (`Blob (join x (do_update_offset ask x offs value exp l' o' v t), s, orig))

Even that would not be enough, because calloc creates an abstract array of blobs:

analyzer/src/analyses/base.ml

Line 2261 in 3137936

    
           set_many ~ctx (Analyses.ask_of_ctx ctx) gs st [(add_null (AD.from_var heap_var), TVoid [], `Array (CArrays.make (IdxDom.of_int (Cilfacade.ptrdiff_ikind ()) BI.one) (`Blob (VD.bot (), eval_int (Analyses.ask_of_ctx ctx) gs st size, false))));

Therefore that array update also needs to be strong, not weak. In zstd's case (and very commonly), calloc is used with a count of 1, so we get an array of length 1. Hence the array domain(s) should be specialized for the length 1 case to just do strong updates instead.
(After Fixes/calloc #145 there's probably a bug in that calloc code because it always creates an array of length 1, instead of using the count from the calloc argument.)

Even with all of the above, I'm not sure if we'd be precise enough, because in zstd thread pools may be created when Goblint is in multithreaded mode, so it might even be necessary to incorporate Primitive malloc freshness analysis #690 into the uniqueness and privatization to always have a local strongly update-able copy of the blob.

The text was updated successfully, but these errors were encountered:

michael-schwarz · 2022-04-26T14:11:59Z

and 2. are actually things that are all tackled within Uniqueness analysis for Juliet suite std_thread wrappers #186 by our Master's Practical student who will hopefully make a PR any day now. His final presentation is scheduled for next week. I think on top of 1 and 2, one needs to ensure that the entire blob is written at the same time (by checking the size of the pointer that is used to write) to actually perform a string update.
Otherwise, one might just be updating one part of the blob and not all of it.

(After #145 there's probably a bug in that calloc code because it always creates an array of length 1, instead of using the count from the calloc argument.)

I don't think so, since the allocated blob already has the product of both calloc arguments as its length. The problem is that nothing enforces that what is allocated via calloc is actually used an array.

sim642 · 2022-04-26T14:37:28Z

and 2. are actually things that are all tackled within Uniqueness analysis for Juliet suite std_thread wrappers #186 by our Master's Practical student who will hopefully make a PR any day now.

I guess the ultimate test of the student's work will be whether it's able to fix this whole zstd imprecision.

I think on top of 1 and 2, one needs to ensure that the entire blob is written at the same time (by checking the size of the pointer that is used to write) to actually perform a string update.
Otherwise, one might just be updating one part of the blob and not all of it.

Something along the lines might be necessary indeed. Although this is even more crucial for general things like #696.

I don't think so, since the allocated blob already has the product of both calloc arguments as its length.

This is actually exactly what #145 changed: it used to be a product, but it was changed to be kept separate (which is more general), but it looks like the Calloc comment in base analysis was forgotten to be removed.

michael-schwarz · 2022-07-27T12:09:52Z

We should investigate how much of this is resolved by #722.

sim642 added feature precision benchmarking labels Apr 26, 2022

sim642 mentioned this issue May 4, 2022

Malloc uniqueness #722

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Strongly update-able `calloc` blobs for more precise zstd analysis #702

Strongly update-able `calloc` blobs for more precise zstd analysis #702

sim642 commented Apr 26, 2022

michael-schwarz commented Apr 26, 2022

sim642 commented Apr 26, 2022

michael-schwarz commented Jul 27, 2022

Strongly update-able calloc blobs for more precise zstd analysis #702

Strongly update-able calloc blobs for more precise zstd analysis #702

Comments

sim642 commented Apr 26, 2022

Problem

Thread pool and its queue

Extensive invalidation

Culprit

Solution

michael-schwarz commented Apr 26, 2022

sim642 commented Apr 26, 2022

michael-schwarz commented Jul 27, 2022

Strongly update-able `calloc` blobs for more precise zstd analysis #702

Strongly update-able `calloc` blobs for more precise zstd analysis #702