-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor and fix race access distribution #1084
Conversation
Currently the order of messages may depend on varinfo IDs in hashtables or whatnot. Since these differ on Linux and OSX, so does the message order.
module Memo = | ||
struct | ||
include Printable.StdLeaf | ||
type t = [`Var of CilType.Varinfo.t | `Type of CilType.Typ.t] * Offset.Unit.t [@@deriving eq, ord, hash] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a polymorphic variant type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be changed, but I don't want to do it here right now since #1089 and more follow-up already build on this, so it's best to avoid massive conflicts there.
Once all the big usability improvements to the race analysis are done, we'll see which types remain at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely looks a lot more reasonable now. I am a bit unhappy with the number of TODO comments we're adding here though! I don't think we'll ever be in a better situation to address these than now.
None of the added TODOs should be new problems, they're existing (possible) problems, so it's better to mark them than completely ignore them and I cannot fix all of them in a single PR. |
Co-authored-by: Michael Schwarz <michael.schwarz93@gmail.com>
CHANGES: * Add `setjmp`/`longjmp` analysis (goblint/analyzer#887, goblint/analyzer#970, goblint/analyzer#1015, goblint/analyzer#1019). * Refactor race analysis to lazy distribution (goblint/analyzer#1084, goblint/analyzer#1089, goblint/analyzer#1136, goblint/analyzer#1016). * Add thread-unsafe library function call analysis (goblint/analyzer#723, goblint/analyzer#1082). * Add mutex type analysis and mutex API analysis (goblint/analyzer#800, goblint/analyzer#839, goblint/analyzer#1073). * Add interval set domain and string literals domain (goblint/analyzer#901, goblint/analyzer#966, goblint/analyzer#994, goblint/analyzer#1048). * Add affine equalities analysis (goblint/analyzer#592). * Add use-after-free analysis (goblint/analyzer#1050, goblint/analyzer#1114). * Add dead code elimination transformation (goblint/analyzer#850, goblint/analyzer#979). * Add taint analysis for partial contexts (goblint/analyzer#553, goblint/analyzer#952). * Add YAML witness validation via unassume (goblint/analyzer#796, goblint/analyzer#977, goblint/analyzer#1044, goblint/analyzer#1045, goblint/analyzer#1124). * Add incremental analysis rename detection (goblint/analyzer#774, goblint/analyzer#777). * Fix address sets unsoundness (goblint/analyzer#822, goblint/analyzer#967, goblint/analyzer#564, goblint/analyzer#1032, goblint/analyzer#998, goblint/analyzer#1031). * Fix thread escape analysis unsoundness (goblint/analyzer#939, goblint/analyzer#984, goblint/analyzer#1074, goblint/analyzer#1078). * Fix many incremental analysis issues (goblint/analyzer#627, goblint/analyzer#836, goblint/analyzer#835, goblint/analyzer#841, goblint/analyzer#932, goblint/analyzer#678, goblint/analyzer#942, goblint/analyzer#949, goblint/analyzer#950, goblint/analyzer#957, goblint/analyzer#955, goblint/analyzer#954, goblint/analyzer#960, goblint/analyzer#959, goblint/analyzer#1004, goblint/analyzer#558, goblint/analyzer#1010, goblint/analyzer#1091). * Fix server mode for abstract debugging (goblint/analyzer#983, goblint/analyzer#990, goblint/analyzer#997, goblint/analyzer#1000, goblint/analyzer#1001, goblint/analyzer#1013, goblint/analyzer#1018, goblint/analyzer#1017, goblint/analyzer#1026, goblint/analyzer#1027). * Add documentation for configuration JSON schema and OCaml API (goblint/analyzer#999, goblint/analyzer#1054, goblint/analyzer#1055, goblint/analyzer#1053). * Add many library function specifications (goblint/analyzer#962, goblint/analyzer#996, goblint/analyzer#1028, goblint/analyzer#1079, goblint/analyzer#1121, goblint/analyzer#1135, goblint/analyzer#1138). * Add OCaml 5.0 support (goblint/analyzer#1003, goblint/analyzer#945, goblint/analyzer#1162).
This is the result of reverse-engineering the existing race access distribution code. That code is cleaned up, simplified and commented.
This does not yet do anything clever to avoid distributing accesses to many memory locations, but makes the rules for doing so a lot clearer, hopefully simplifying the redesign.
However, the number of accesses, memory locations and races may change as the result of this PR, in particular due to some included soundness fixes.
Changes
Define a new
Access.Memo
type for precisely representing memory locations as used by the race analysis.The previous representation used two
options
, of which not all four combinations were possible, and both of them could contain offsets, which were duplicated.The new representation has a
varinfo
ortyp
as root, followed by unit-indexed offsets. This also allows reusing features from Refactor offsets, lvals and addresses #1067.Completely rewrite
Access.add_propagate
asadd_distribute_outer
.Its previous logic was very confusing and completely unexplained. In some ways it had an overly specific special case, while the general case was both too precise or too imprecise, depending on the situation. The new logic generalizes this to two general principles according to which type-based accesses are distributed:
The new logic fixes both soundness and precision issues that existed previously.
Fix unsoundness regarding distribution to struct fields with array type.
Previously accesses via
int*
didn't account for the possibility that it might point to and thus race with an element of a struct field with typeint[2]
. Now arbitrarily deep multi-arrays are unwrapped.Rename and flip
ana.mutex.disjoint_types
toana.race.direct-arithmetic
.This should somewhat clarify the meaning of the option and avoid double negation (
not !unsound
). Disjoint types is rather nondescriptive since it really just controls the behavior of direct (i.e. non-field) accesses to arithmetic-type–based memory locations.Moves the handling of
ana.race.direct-arithmetic
to just one place.Previously two partial checks were in two different places, obfuscating the logic.
Exclude accesses to some anonymous pthread type internals, which show up as
__anonstruct
and__anonunion
.Existing exclusions are based on
TNamed
, so simplytypedef
names. But anonymous structs are precisely the ones that don't have such names.Remove polymorphic
Hashtbl
usage fromAccess
.Although it didn't cause issues as the hashtables were with
typsig
keys, it's still nicer to avoid the possibility explicitly.typsig
s are still used to make sure that all type-based accesses are collected regardless of anytypedef
s.Add
access
tracing.Add option
warn.deterministic
.This makes the order of printed messages deterministic, by delaying the printing and later printing them in sorted (by text, etc) order. This is useful for writing cram tests such that the test doesn't spuriously break (e.g. on OSX) due to different
varinfo
IDs being in different order in some hashtables that get iterated.TODO