Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option for ignoring races from free #695

Merged
merged 6 commits into from
May 3, 2022
Merged

Add option for ignoring races from free #695

merged 6 commits into from
May 3, 2022

Conversation

sim642
Copy link
Member

@sim642 sim642 commented Apr 25, 2022

Currently free is defined to access its argument as a write in LibraryFunctions. The reason being that after freeing the memory, it may be allocated and written by something else. However, this is quite annoying for race detection because freeing memory causes write accesses to everything reachable. Although there may be "data races" like that, it's very unlikely to be what the user is interested in. Really it calls for a separate use-after-free analysis, rather than being specific to races.

Therefore this PR adds an option to ignore such free accesses from being considered writes. Hopefully this avoids inflating our race numbers in real-world programs with all these spurious races. Currently any cleanup code will cause such "races" because we cannot do anything to rule out use-after-free in multithreaded code.

Changes

  1. Add option ana.race.free (true by default to preserve current behavior as default) to choose whether free accesses are considered as racing.
  2. Add Free invalidation action to LibraryFunctions.
  3. Extract LibraryFunctions.action to AccessKind.
  4. Replace write boolean with AccessKind.t throughout access and race detection code.
  5. Fix type-related unsoundness for direct alloc variable races? Since alloc variables have void type and all accesses are partitioned by type, some accesses in 04-mutex/64-free_direct_rc had type int while others had void, and no race was found.

TODO

  • The fix to type-related direct alloc variable races essentially strips all types from access memory locations that are known (instead of being abstractly related to just a struct type). This causes sudden imprecision in other tests, because apparently with unknown offsets, the types have previously been used to separate accesses to fields of different type even when the specific field is unknown. I'm not sure of there's any way to keep that though.

    No fix, trade precision for soundness.

@sim642
Copy link
Member Author

sim642 commented Apr 25, 2022

Trying this on zstd, a massive number of accesses turn into frees instead of writes, but very disappointingly it makes almost no difference to detected races:

 Summary for all memory locations:
-	safe:         1990
-	vulnerable:    192
+	safe:         2001
+	vulnerable:    181
 	unsafe:        863
 	-------------------
 	total:        3045

So among memory locations with free accesses, there are still other non-free ones which keep racing.

@sim642 sim642 mentioned this pull request Apr 25, 2022
3 tasks
@michael-schwarz michael-schwarz self-requested a review April 26, 2022 14:12
sim642 added 2 commits April 27, 2022 09:28
…access partitioning

That partitioning was unsound for alloc variables (with void type in some cases), so had to be disabled to trade precision for soundness.
@vesalvojdani
Copy link
Member

This is a very good change. We should probably merge this and then think about whether one should distinguish these direct versus indirect races. The indirect ones are potential use-after-free (CWE-416) and it might be good to flag them as such.

@sim642
Copy link
Member Author

sim642 commented May 2, 2022

We should probably merge this and then think about whether one should distinguish these direct versus indirect races.

Actually in #696 I had a possible TODO point that I skipped for now, which relates to this: when freeing memory, only the directly pointed memory is freed (and thus accessed in this manner). If a freed struct transitively contains pointers to other data, free does not recursively access all of them. The correct coding pattern would be to free all the internals before the outer structure.

So actually free could also be (hard-coded for the time being) made to only emit direct (reach = false) accesses. But if we're going to turn this PR's option off for interactive benchmarking anyway, then all the accesses of free kind are irrelevant (only the number of source locations per safe memory location would change).

@sim642 sim642 merged commit 9ee1de1 into master May 3, 2022
@sim642 sim642 deleted the access-free branch May 3, 2022 14:48
@sim642 sim642 added this to the v2.0.0 milestone Aug 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants