-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backtracking optimizations #3
base: main
Are you sure you want to change the base?
Conversation
src/tracedAtomic.ml
Outdated
|
||
let make v = if !tracing then perform (Make v) else | ||
begin | ||
let i = !atomics_counter in | ||
atomics_counter := !atomics_counter + 1; | ||
(Atomic.make v, i) | ||
atomics_counter := !atomics_counter - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not added in this PR, but I think we should change atomics_counter
to atomic - otherwise might race outside tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes absolutely, I'll switch the buggy ref to atomic!
(For context, I went with negative identifiers there to detect that a trace
is interacting with an atomic created outside the tested function, but didn't follow up yet on that line of thoughts)
src/tracedAtomic.ml
Outdated
| Read_write, Some ptr -> add ptr last_read, add ptr last_write | ||
| _ -> assert false | ||
in | ||
let new_clock = IdMap.add j new_time clock in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clock
, new_clock
look to have no meaningful use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes you are right! I'll remove them :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed!
Thanks for the PR, very cool stuff! I'm still reviewing it - feel free handle the comments as they flow in or all at once later on. Fwiw, I've run a lot of test traces already and it's holding up perfectly well. |
src/tracedAtomic.ml
Outdated
if List.for_all causal replay_steps | ||
then if IdSet.mem proc_id pre_s.enabled | ||
then Some replay_steps | ||
else let is_parent k s = k > lower && k < time - upper && s.run.op = Spawn && s.run.obj_ptr = Some proc_id in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the k > lower && k < time - upper
condition needed? That is, isn't a Spawn
of some proc_id
unique within a single execution (state)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the check on the interval is a defensive measure, we can't always backtrack to before our spawn... This whole function is really hard to read though :/
src/tracedAtomic.ml
Outdated
while IdMap.(cardinal (map_diff_set s.backtrack !dones)) > 0 do | ||
let j, new_steps = IdMap.min_binding (map_diff_set s.backtrack !dones) in | ||
let new_explored = | ||
if !is_backtracking || state_planned <> [] then !dones else IdSet.empty in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will assign empty set to new_explored
if not backtracking && state_planned is empty
but I think !dones
has to be an empty set in such a case anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree!
dones := explored ; | ||
s.backtrack <- IdMap.singleton proc_id state_planned | ||
end ; | ||
let is_backtracking = ref false in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand this correctly, s.backtrack
is "overloaded" to also handle the initial case, which is not backtracking but just continuing existing execution. Then loop relies on this always being the first element in s.backtrack
, hence above boolean for special treatment. Do you reckon it'd be worth trying to split up these separate cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code kept evolving in the "round-robin" branch where we don't have to remember if we are backtracking or not (... the code ain't perfect there either though!) Do you think it's worth fixing in this PR?
src/tracedAtomic.ml
Outdated
| Some lst -> List.length lst > List.length replay_steps | ||
then pre_s.backtrack <- IdMap.add j replay_steps pre_s.backtrack | ||
|
||
let map_diff_set map set = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps map_subtract_set
is more accurate?
Rebased and fixed some issues, thanks again for the review! :) However, I then stumbled upon a new test where |
I have a test that (correctly) fails with
With
With this PR:
However, I'm probably just confused about how this is supposed to be used. I opened #13 with some questions about that. |
OK, here's a simpler test-case that detects the bug with module Atomic = Dscheck.TracedAtomic
let test () =
let cancelled = Atomic.make false in
let max_requests = Atomic.make 0 in
Atomic.spawn (fun () ->
Atomic.set cancelled true;
Atomic.decr max_requests;
);
Atomic.spawn (fun () ->
ignore (Atomic.get max_requests);
assert (Atomic.get cancelled) (* This bug should be detected *)
)
let () = Atomic.trace test With
With this PR:
|
Thanks for the two counter examples! I ended up rewriting the backstepping algorithm to be a lot more straightforward (... it's still too tricky for my liking, but the structure makes it harder for it to go wrong by missing a branch) It's of course slower than before because it doesn't wrongly skip as many traces, but it can still complete the |
(Follow-up on #2 )
Sorry for the big dirty PR! I was inspired to optimize the backtracking and it now terminates in a few seconds on the examples that I care about:
Atomic.get
(we only want to backtrack when the atomic usages could yield a different outcome, so a "read" with a "write" etc.)(A3)
with(B3)
then it would bubble theB
operations one at a time:It now does a "big" backstep by scheduling all the operations to permute in the backtracking:
This wasn't so simple though, as now the atomics and spawns can be created in a different order, changing their globally unique
int
identifiers. The fix to give them stable ids was to track the creations locally in each thread rather than globally (so rather than "I'm the Xth atomic ever created" we do "I'm the Xth atomic created by thread Yth... which was itself created by thread Zth from the root" with X,Y,Z local counters to each parent threads).Furthermore, we sometimes want to backstep a thread operation to a point where the thread didn't exist yet... and so we need to also backstep its parent operations that led to our thread creation.
Anyway, I tested it intensively and hopefully I didn't mess up too badly but this was quite involved so there might be some bugs lurking... Let me know if I can do anything to ease the review :)