-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Interactive: improvements for chrony story #724
Conversation
Previously "changing" a constant could change anything.
This makes it match default chrony analysis.
src/analyses/varEq.ml
Outdated
| BinOp ((PlusPI | IndexPI), e1, e2, _) -> | ||
eq_set_clos e1 s (* TODO: what about e2? add to some Index offset to all? *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you maybe comment on the impact that this has now? Are we tracking equalities involving pointer arithmetic? If yes, how confident are we that we do useful things?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These additions aren't changing the equalities that are tracked, but just allows them to be used with array indexing via pointer addition (which CIL sometimes inserts instead of actual Index
offset that the analysis already handles).
It's needed for the following pattern from one of the added tests:
pthread_mutex_lock(&entry->refs_mutex);
entry[0].refs++; // NORACE
pthread_mutex_unlock(&entry->refs_mutex);
The symb_locks analysis uses the EqualSet
query to look up equalities for entry[0].refs
, but the old behavior meant that an empty set was returned, so we couldn't infer that it has the same prefix as the locking expression.
A less trivial case of it comes from the minimized chrony test, where the locking actually happens outside the function where the access is, so there's already an equality ip_addrs = inst->addresses
known to the analysis and there's a symbolic lock inst->mutex
, but then the access ip_addrs[0]
isn't found to be protected because to find that out, the equality needs to be used inside the index expression.
Note that the index itself is completely irrelevant because all that's needed is the common inst
prefix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are a bit too eager to declare access as non-racy here, consider the example I added:
void cache_entry_addref(struct cache_entry *entry) {
pthread_mutex_lock(&entry->refs_mutex);
entry->refs++; // NORACE
(*entry).refs++; // NORACE
entry[1].refs++; // RACE
pthread_mutex_unlock(&entry->refs_mutex);
}
These accesses at entry[1]
are racy, as the protecting mutex for that element is not held (I fixed the loop to only run until 9, so it's not UB). Also passing a t*
and then using array access on it is a very common idiom in C afaik.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's quite an evil example, but I'm rather skeptical that we can do much about this. The function in isolation, as presented here, just gets a pointer and then accesses entry[1]
. For an arbitrary pointer that may point to any memory and therefore a write to it may invalidate any memory, including any other program variable (any global, any local of any stack frame, any dynamically allocated memory). Of course in this particular case it doesn't, but in Goblint we have no way of knowing that, because for addresses (like the pointer here) we don't keep track of their context (whether they're pointing into an array or not).
I think our much broader assumption is that pointer arithmetic stays "within the direct object" (for the lack of a better definition) regardless of its context, because we simply don't know about the broader memory layout (even if in this case it seems obvious). To make weaker assumptions than that, we would either need some crazy possible memory layout analysis or, lacking that, assume the worst and invalidate everything (not just globals!).
Also passing a
t*
and then using array access on it is a very common idiom in C afaik.
Is it though? If you want to use it as an array, you might as well pass it as an array (which is a pointer under the hood anyway), to make it clear that it's an array (usually with another length argument), not a pointer to an isolated object.
Now thinking about bad things, if the argument were an array and we copy the array domain into the function and mutated it, do those writes even propagate back to the caller's array domain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the assumption we should be making is that
-
- pointer arithmetic never leaves the corresponding varinfo (or blob for malloced things),
-
- and never can cross struct boundaries once one has descended (because the compiler may have inserted padding), so no adding magic offsets to get from struct member
a
toa
in the next element of the array unlessa
is the first struct element.
That seems not too strong but also not as radical as setting everything to top. (And should be possible as our addresses maintain offsets and have abstract values at the granularity of variables anyway).
- and never can cross struct boundaries once one has descended (because the compiler may have inserted padding), so no adding magic offsets to get from struct member
I think if we assume anything more, we cannot seriously claim we are sound anymore. I personally would be extremely unhappy with Goblint accepting this program as race-free! (How do @vesalvojdani @stilscher @jerhard feel here?)
Should we put this on the agenda at the Gobcon?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now thinking about bad things, if the argument were an array and we copy the array domain into the function and mutated it, do those writes even propagate back to the caller's array domain?
I'd hope so. We pass a pointer to the array, hence the entire(!) array will go to the callee, we update things there and should then copy the value over in combine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe this is somehow also related to #434? Because the thing is, this change simply allows the entry[0]
or entry[1]
expression to preserve its prefix (previously we gave up on index access of pointer). Index access on array type has been previously supported for a long time and this shouldn't be different: symbolic locking happens on the basis of common prefix between the mutex and the access (and we have at least one paper on this!). So I'm quite surprised how this can suddenly be going wrong.
cc: @vesalvojdani
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd hope so. We pass a pointer to the array, hence the entire(!) array will go to the callee, we update things there and should then copy the value over in combine?
I haven't dug into this now, but this is exactly where I'm worried. Sure, if you take a pointer of the array and pass that to a function, it gets passed along as reachable and combine
d back. But what if you pass the array directly (without making a pointer to it first). In the caller, the local domain contains an array abstraction directly under the array variable name, there isn't an address domain containing an address to the array contents. This is analogous to how an integer variable contains the integer abstraction directly. And when they are passed as arguments, they are copied — there's a copy of the integer's abstraction directly under the callee's argument and copy of the array's abstraction directly under the callee's argument. On combine
, the internal integer abstraction is simply removed and so would the callee's (modified) array one. Or do we have some special logic somewhere there specifically for arrays? I don't recall at least.
I personally would be extremely unhappy with Goblint accepting this program as race-free!
Indeed in the non-UB test you added, we should not. It's just the symbolic locks that then might be assuming something stronger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized there's a granularity difference at play here as well:
- The
cache_entry
tests in our suite use an array of structs, where each struct has a mutex protecting its data. - The chrony
DNS_Async_Instance
struct, for which this is needed, is a standalone struct (not one in an array), where the single mutex protects an entire array inside the struct.
Because they are structurally different like that, I'd hope we can actually be sound for the tests you added and the story in chrony. I'll have to dig into how exactly these different granularity locking schemes, and per-field vs per-index symbolic locks are supposed to work, because these cases involve both at the same time (but in different order!).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The case for chrony is easier than the more general one, where the unsoundness problem came in. I now fixed that using very ad-hoc handling that's still sufficient for chrony.
This fixes the soundness, but still works for chrony story.
…6-symbeq/38-chrony-name2ipaddress
Changes
IndexPI
expressions.__goblint_assume_join
special Goblint builtin to threadJoins analysis to annotate thread being joined even if Goblint doesn't consider it unique. This can be used as an annotation after apthread_join
loop.realloc
argument read and free accesses non-transitive (we really need Redesign ofLibraryFunctions
specification #719).sem.unknown_function.read.args
for also ignoring read accesses from unknown function arguments (sem.unknown_function.invalidate.args
from Add option sem.unknown_function.invalidate.args #707 only concerns invalidation, i.e. writing).