You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Here ctx and both alloc variables are in the local state. Thanks to #686 the alloc variables escape into global state correctly.
When jobs are added to the queue, their function pointers are nicely joined into the queue's alloc variable, which precisely contains known function pointers and no NULL or unknown pointer.
However later in the thread pool threads, they retrieve jobs from the queue using ctx->queue[ctx->queueHead], i.e. *(ctx->queue + ctx->queueHead) after CIL normalization. Confusingly, the function pointer in the returned job struct contains an additional unknown pointer, despite the queue's alloc variable not containing one.
Extensive invalidation
Turns out this is because ctx's alloc variable also has a NULL pointer possible for the queue field. When pointer arithmetic is applied (and queueHead is no longer constant either), the known alloc pointer is nicely kept, but the NULL pointer is replaced by an unknown pointer using this code:
Analyzing zstd, the unknown function pointer is absolutely disastrous effect:
An unknown function pointer call is analyzed by special to invalidate its arguments.
In zstd, one of the thread pool jobs is passed an argument, which inside it contains a pointer to the same thread pool ctx.
The resulting invalidation is extensive and invalidates:
ctx alloc variable (so an unknown pointer is added to queue's address set directly as well),
queue alloc variable (so NULL and unknown pointers are added to the job's function pointer and argument sets),
all the argument structs for all the jobs (holding all of the relevant data) are totally invalidated.
Culprit
The culprit of all of this is the NULL pointer in the queue field. If we didn't have that there, none of this invalidation would have happened!
Tracking this down was a two-day job since the cyclic pointing structure means that everything ends up being invalidated and it's impossible to tell from the final results, where this imprecision originated from. Moreover, it's practically impossible to trace Goblint when the problematic analysis takes over 40 minutes and produces way too much tracing output (and not to mention being slowed down by all the I/O).
Solution
Ideally, Goblint should know that queue cannot be a NULL pointer, avoiding all of the above. The source of the NULL pointer in the first place is that ctx is allocated using calloc. But clearly that NULL pointer is overwritten before ctx is seen by any other code, so it should be avoidable.
Although simple on the surface, to be able to strongly update the queue field requires many big pieces to work together:
Therefore that array update also needs to be strong, not weak. In zstd's case (and very commonly), calloc is used with a count of 1, so we get an array of length 1. Hence the array domain(s) should be specialized for the length 1 case to just do strong updates instead.
(After Fixes/calloc #145 there's probably a bug in that calloc code because it always creates an array of length 1, instead of using the count from the calloc argument.)
Even with all of the above, I'm not sure if we'd be precise enough, because in zstd thread pools may be created when Goblint is in multithreaded mode, so it might even be necessary to incorporate Primitive malloc freshness analysis #690 into the uniqueness and privatization to always have a local strongly update-able copy of the blob.
The text was updated successfully, but these errors were encountered:
and 2. are actually things that are all tackled within Uniqueness analysis for Juliet suite std_thread wrappers #186 by our Master's Practical student who will hopefully make a PR any day now. His final presentation is scheduled for next week. I think on top of 1 and 2, one needs to ensure that the entire blob is written at the same time (by checking the size of the pointer that is used to write) to actually perform a string update.
Otherwise, one might just be updating one part of the blob and not all of it.
(After #145 there's probably a bug in that calloc code because it always creates an array of length 1, instead of using the count from the calloc argument.)
I don't think so, since the allocated blob already has the product of both calloc arguments as its length. The problem is that nothing enforces that what is allocated via calloc is actually used an array.
I guess the ultimate test of the student's work will be whether it's able to fix this whole zstd imprecision.
I think on top of 1 and 2, one needs to ensure that the entire blob is written at the same time (by checking the size of the pointer that is used to write) to actually perform a string update.
Otherwise, one might just be updating one part of the blob and not all of it.
Something along the lines might be necessary indeed. Although this is even more crucial for general things like #696.
I don't think so, since the allocated blob already has the product of both calloc arguments as its length.
This is actually exactly what #145 changed: it used to be a product, but it was changed to be kept separate (which is more general), but it looks like the Calloc comment in base analysis was forgotten to be removed.
Problem
Our analysis of zstd ends up with unexpectedly many unknown pointers due to the following sequence of coincidences.
Thread pool and its queue
In zstd's
POOL_create_advanced
, the thread pool and its queue are allocated on the heap as follows (simplified):Here
ctx
and both alloc variables are in the local state. Thanks to #686 the alloc variables escape into global state correctly.When jobs are added to the queue, their function pointers are nicely joined into the queue's alloc variable, which precisely contains known function pointers and no
NULL
or unknown pointer.However later in the thread pool threads, they retrieve jobs from the queue using
ctx->queue[ctx->queueHead]
, i.e.*(ctx->queue + ctx->queueHead)
after CIL normalization. Confusingly, the function pointer in the returned job struct contains an additional unknown pointer, despite the queue's alloc variable not containing one.Extensive invalidation
Turns out this is because ctx's alloc variable also has a
NULL
pointer possible for thequeue
field. When pointer arithmetic is applied (andqueueHead
is no longer constant either), the known alloc pointer is nicely kept, but theNULL
pointer is replaced by an unknown pointer using this code:analyzer/src/analyses/base.ml
Lines 208 to 214 in 3137936
Analyzing zstd, the unknown function pointer is absolutely disastrous effect:
special
to invalidate its arguments.ctx
.ctx
alloc variable (so an unknown pointer is added toqueue
's address set directly as well),queue
alloc variable (soNULL
and unknown pointers are added to the job's function pointer and argument sets),Culprit
The culprit of all of this is the
NULL
pointer in thequeue
field. If we didn't have that there, none of this invalidation would have happened!Tracking this down was a two-day job since the cyclic pointing structure means that everything ends up being invalidated and it's impossible to tell from the final results, where this imprecision originated from. Moreover, it's practically impossible to trace Goblint when the problematic analysis takes over 40 minutes and produces way too much tracing output (and not to mention being slowed down by all the I/O).
Solution
Ideally, Goblint should know that
queue
cannot be aNULL
pointer, avoiding all of the above. The source of theNULL
pointer in the first place is thatctx
is allocated usingcalloc
. But clearly thatNULL
pointer is overwritten beforectx
is seen by any other code, so it should be avoidable.Although simple on the surface, to be able to strongly update the
queue
field requires many big pieces to work together:ValueDomain.update_offset
must use that uniqueness information to replace the weak (joining) update with a strong (replacing) one here:analyzer/src/cdomains/valueDomain.ml
Lines 900 to 904 in 3137936
calloc
creates an abstract array of blobs:analyzer/src/analyses/base.ml
Line 2261 in 3137936
Therefore that array update also needs to be strong, not weak. In zstd's case (and very commonly),
calloc
is used with a count of 1, so we get an array of length 1. Hence the array domain(s) should be specialized for the length 1 case to just do strong updates instead.(After Fixes/calloc #145 there's probably a bug in that
calloc
code because it always creates an array of length 1, instead of using the count from thecalloc
argument.)The text was updated successfully, but these errors were encountered: