
Implement a garbage collector for tags #2479

Merged: 2 commits, Sep 13, 2022
Conversation

@saethlin (Member) commented Aug 11, 2022

The general approach here is to scan TLS, all locals, and the main memory map for all provenance, accumulating a HashSet of all pointer tags that are stored anywhere (we also have a special case for panic payloads). Then we iterate over every borrow stack and remove tags that are not in said HashSet, unless they are needed to terminate an SRW (SharedReadWrite) block.
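The two phases described above can be sketched as a tiny standalone model. This is a hedged sketch under assumed names: `Tag`, `BorrowStack`, `mark`, and `sweep` are illustrative, not Miri's actual types, and a flat slice of roots stands in for scanning TLS, locals, and the memory map.

```rust
use std::collections::HashSet;

// `Tag` stands in for Miri's SbTag.
type Tag = u64;

struct BorrowStack {
    borrows: Vec<Tag>,
}

// Mark: accumulate every tag that is still stored somewhere.
fn mark(roots: &[Tag]) -> HashSet<Tag> {
    roots.iter().copied().collect()
}

// Sweep: walk every borrow stack and drop items whose tag is dead.
fn sweep(stacks: &mut [BorrowStack], live: &HashSet<Tag>) {
    for stack in stacks {
        stack.borrows.retain(|tag| live.contains(tag));
    }
}

fn main() {
    let live = mark(&[1, 3]);
    let mut stacks = vec![BorrowStack { borrows: vec![1, 2, 3, 4] }];
    sweep(&mut stacks, &live);
    assert_eq!(stacks[0].borrows, vec![1, 3]);
}
```

The real pass additionally keeps some dead tags (the SRW-terminating case), which this sketch omits.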

Runtime of benchmarks decreases by between 17% and 81%.

GC off:

Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/backtraces/Cargo.toml
  Time (mean ± σ):      7.080 s ±  0.249 s    [User: 6.870 s, System: 0.202 s]
  Range (min … max):    6.933 s …  7.521 s    5 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/mse/Cargo.toml
  Time (mean ± σ):      1.875 s ±  0.031 s    [User: 1.630 s, System: 0.245 s]
  Range (min … max):    1.825 s …  1.910 s    5 runs
 
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/serde1/Cargo.toml
  Time (mean ± σ):      2.785 s ±  0.075 s    [User: 2.536 s, System: 0.168 s]
  Range (min … max):    2.698 s …  2.851 s    5 runs
 
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/serde2/Cargo.toml
  Time (mean ± σ):      6.267 s ±  0.066 s    [User: 6.072 s, System: 0.190 s]
  Range (min … max):    6.152 s …  6.314 s    5 runs
 
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/slice-get-unchecked/Cargo.toml
  Time (mean ± σ):      4.733 s ±  0.080 s    [User: 4.177 s, System: 0.513 s]
  Range (min … max):    4.681 s …  4.874 s    5 runs
 
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/unicode/Cargo.toml
  Time (mean ± σ):      3.770 s ±  0.034 s    [User: 3.549 s, System: 0.211 s]
  Range (min … max):    3.724 s …  3.819 s    5 runs

GC on:

Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/backtraces/Cargo.toml
  Time (mean ± σ):      5.886 s ±  0.054 s    [User: 5.696 s, System: 0.182 s]
  Range (min … max):    5.799 s …  5.937 s    5 runs
 
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/mse/Cargo.toml
  Time (mean ± σ):     936.4 ms ±   7.0 ms    [User: 815.4 ms, System: 119.6 ms]
  Range (min … max):   925.7 ms … 945.0 ms    5 runs
 
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/serde1/Cargo.toml
  Time (mean ± σ):      2.126 s ±  0.022 s    [User: 1.979 s, System: 0.146 s]
  Range (min … max):    2.089 s …  2.143 s    5 runs
 
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/serde2/Cargo.toml
  Time (mean ± σ):      4.242 s ±  0.066 s    [User: 4.051 s, System: 0.160 s]
  Range (min … max):    4.196 s …  4.357 s    5 runs
 
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/slice-get-unchecked/Cargo.toml
  Time (mean ± σ):     907.4 ms ±   2.4 ms    [User: 788.6 ms, System: 118.2 ms]
  Range (min … max):   903.5 ms … 909.4 ms    5 runs
 
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/unicode/Cargo.toml
  Time (mean ± σ):      1.821 s ±  0.011 s    [User: 1.687 s, System: 0.133 s]
  Range (min … max):    1.802 s …  1.831 s    5 runs

But much more importantly for me, this drops the peak memory usage of the first minute of running regex's tests from 103 GB to 1.7 GB.

Thanks to @oli-obk for suggesting a while ago that this was possible and @Darksonn for reminding me that we can just search through memory to find Provenance to locate pointers.

Fixes #1367

@saethlin saethlin marked this pull request as draft August 11, 2022 01:18
@saethlin saethlin force-pushed the tag-gc branch 2 times, most recently from d29e1a6 to 65f79f1 Compare August 16, 2022 12:59
@RalfJung (Member) commented:
Wow, those are some very nice wins!

The ideas I have for the next-gen aliasing model pretty much require a GC, so I am very happy to see that this is feasible. :)

@bors (Contributor) commented Aug 21, 2022

☔ The latest upstream changes (presumably #2500) made this pull request unmergeable. Please resolve the merge conflicts.

@RalfJung RalfJung added the S-draft Status: still a draft, not yet ready for review label Aug 22, 2022
@saethlin saethlin force-pushed the tag-gc branch 2 times, most recently from f8f4ca9 to d71ddb4 Compare August 22, 2022 21:47
@bors (Contributor) commented Aug 26, 2022

☔ The latest upstream changes (presumably #2363) made this pull request unmergeable. Please resolve the merge conflicts.

Review thread on src/machine.rs (outdated, resolved)
@saethlin saethlin changed the title WIP: -Zmiri-tag-gc, a garbage collector for tags Implement a garbage collector for tags Aug 27, 2022
@saethlin saethlin marked this pull request as ready for review August 27, 2022 01:13
@saethlin saethlin added S-waiting-on-review Status: Waiting for a review to complete and removed S-draft Status: still a draft, not yet ready for review labels Aug 27, 2022
@saethlin saethlin force-pushed the tag-gc branch 2 times, most recently from 903d0ba to 0dde6f3 Compare September 5, 2022 18:47
@oli-obk oli-obk self-assigned this Sep 5, 2022
@oli-obk (Contributor) commented Sep 5, 2022

the PR itself lgtm (at least I remember reviewing it before and marking everything as read XD, the latest changes didn't change that opinion)

@saethlin (Member, Author) commented Sep 5, 2022

Yeah, I'm just keeping the branch rebased so that it continues to work.

@saethlin (Member, Author) commented Sep 5, 2022

Also, pre-PR, Ralf had some feedback on the maintainability of the code that finds provenance stored in the interpreter runtime, so I'll at least wait until he's back to merge this.

@saethlin (Member, Author) commented Sep 6, 2022

Well that certainly made CI times shorter. I'll see if I can put in something that can reassure me that the GC is actually running...

@@ -25,12 +25,15 @@ jobs:
      - build: linux64
        os: ubuntu-latest
        host_target: x86_64-unknown-linux-gnu
        env: MIRIFLAGS=-Zmiri-tag-gc=1
A Contributor commented:
it's weird that this made the linux runner faster, too. It looks like it is actually better to run the GC after every basic block instead of after every 10k blocks.

saethlin (Member, Author) replied:
It's not, that was a combination of me doing things wrong and CI timing noise.

This is a mark-and-sweep GC for a resource which sometimes disposes of itself, so we benefit doubly from not running the GC too often. There is a fixed cost to the mark stage, but the amount we sweep away grows if we wait. Plus, if an allocation is allocated then deallocated in the time between GC runs, we never have to interact with it at all.

I did consider trying to hack up some kind of generation system, but between just ignoring small stacks and stacks not modified since the last GC run, the overhead of the sweep part is quite small at the default GC interval.
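The two sweep-skipping heuristics mentioned above can be sketched as follows. This is a hedged model under assumed names (`BorrowStack`, `maybe_sweep`, `SMALL_STACK_LEN` are illustrative), though the threshold of 64 and the modified-since-last-GC flag do appear in the PR's code.

```rust
use std::collections::HashSet;

struct BorrowStack {
    borrows: Vec<u64>,
    modified_since_last_gc: bool,
}

// Threshold guesstimated by benchmarking, per the discussion in this PR.
const SMALL_STACK_LEN: usize = 64;

// Returns true if the stack was actually swept.
fn maybe_sweep(stack: &mut BorrowStack, live: &HashSet<u64>) -> bool {
    // Skipping small stacks and stacks untouched since the last run acts
    // as a very crude generational collector: old, quiet stacks are
    // rarely re-examined, keeping sweep overhead low.
    if stack.borrows.len() <= SMALL_STACK_LEN || !stack.modified_since_last_gc {
        return false;
    }
    stack.borrows.retain(|tag| live.contains(tag));
    stack.modified_since_last_gc = false;
    true
}

fn main() {
    let live: HashSet<u64> = (0..10).collect();
    let mut big = BorrowStack { borrows: (0..100).collect(), modified_since_last_gc: true };
    assert!(maybe_sweep(&mut big, &live));
    assert_eq!(big.borrows.len(), 10);

    // A tiny stack is cheap to keep around, so it is skipped entirely.
    let mut small = BorrowStack { borrows: vec![999], modified_since_last_gc: true };
    assert!(!maybe_sweep(&mut small, &live));
}
```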

@saethlin saethlin added S-waiting-on-author Status: Waiting for the PR author to address review comments and removed S-waiting-on-review Status: Waiting for a review to complete labels Sep 10, 2022
@saethlin saethlin removed the S-waiting-on-author Status: Waiting for the PR author to address review comments label Sep 11, 2022
@saethlin saethlin added the S-waiting-on-review Status: Waiting for a review to complete label Sep 11, 2022
@saethlin (Member, Author) commented:
I'm reasonably convinced that I have (finally) managed to only adjust the GC interval to run more often in CI for Linux.

@oli-obk (Contributor) commented Sep 13, 2022

@bors r+

@bors (Contributor) commented Sep 13, 2022

📌 Commit f59605c has been approved by oli-obk

It is now in the queue for this repository.

@bors (Contributor) commented Sep 13, 2022

⌛ Testing commit f59605c with merge a00fa96...

@@ -34,6 +34,10 @@ jobs:
    steps:
      - uses: actions/checkout@v3

      - name: Set the tag GC interval to 1 on linux
        if: runner.os == 'macOS'
A Contributor commented:
the comment refers to Linux, but the condition to macOS; slightly confusing

saethlin (Member, Author) replied:
🤦 perfectly fine content for a second PR

@bors (Contributor) commented Sep 13, 2022

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing a00fa96 to master...

@bors bors merged commit a00fa96 into rust-lang:master Sep 13, 2022
@RalfJung (Member) commented Sep 13, 2022 via email

pub fn remove_unreachable_tags(&mut self, live_tags: &FxHashSet<SbTag>) {
    if self.modified_since_last_gc {
        for stack in self.stacks.iter_mut_all() {
            if stack.len() > 64 {
A Member commented:
Why 64?

saethlin (Member, Author) replied:
Haha this should definitely be explained in a comment. This is a magic number guesstimated by benchmarking, like the stack-cache length. Skipping over small stacks is a very crude generational garbage collector. I'll definitely make a PR that addresses this later in the day.

let should_keep = match this.perm() {
    // SharedReadWrite is the simplest case, if it's unreachable we can just remove it.
    Permission::SharedReadWrite => tags.contains(&this.tag()),
    // Only retain a Disabled tag if it is terminating a SharedReadWrite block.
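The retention rule quoted above can be modeled standalone. This is a hedged reconstruction, not Miri's exact code: liveness decides for SharedReadWrite items, while a Disabled item is kept only when it caps (terminates) an SRW block directly below it, regardless of whether its tag is live.

```rust
use std::collections::HashSet;

#[derive(PartialEq)]
enum Permission { Unique, SharedReadWrite, Disabled }

struct Item { tag: u64, perm: Permission }

// `below` is the stack item directly beneath `item`, if any.
fn should_keep(item: &Item, below: Option<&Item>, live: &HashSet<u64>) -> bool {
    match item.perm {
        // Unreachable SharedReadWrite items can simply be removed.
        Permission::SharedReadWrite => live.contains(&item.tag),
        // A Disabled item survives only when it terminates an SRW block,
        // so that removing it cannot merge two adjacent SRW blocks.
        // Note liveness is not consulted: live Disabled tags go too.
        Permission::Disabled => {
            matches!(below, Some(b) if b.perm == Permission::SharedReadWrite)
        }
        // Conservative in this sketch: keep every other permission.
        _ => true,
    }
}

fn main() {
    let live = HashSet::new();
    let srw = Item { tag: 1, perm: Permission::SharedReadWrite };
    let dis = Item { tag: 2, perm: Permission::Disabled };
    // A dead Disabled tag survives because it caps an SRW block...
    assert!(should_keep(&dis, Some(&srw), &live));
    // ...but not when the item below it is Unique.
    let uniq = Item { tag: 3, perm: Permission::Unique };
    assert!(!should_keep(&dis, Some(&uniq), &live));
}
```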
A Member commented:
So this is special, we delete them even if they are live! (That's correct of course but I had to read this twice to realize that this match mixes 2 concerns: checking liveness and preventing SRW block merging)

self.borrows.truncate(write_idx);

#[cfg(not(feature = "stack-cache"))]
drop(first_removed); // This is only needed for the stack-cache
A Member commented:
first_removed has a Copy type so I don't think this does anything?

saethlin (Member, Author) replied:
I'm fighting with the fact that this is unused if the stack-cache is off here. The drop just makes dead code detection be quiet. Is there a better approach?

saethlin (Member, Author) commented Sep 21, 2022:

I should mention that we don't have to repair the stack cache. Just tossing the whole thing is a valid approach (it warms up quickly and GC runs are intentionally rare), but I feel like justifying it is awkward. But maybe this issue tips the scales?

A Member replied:

The drop just makes dead code detection be quiet. Is there a better approach?

Ah I see. I would use let _unused = ... for that.
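A minimal illustration of the `let _unused` suggestion: binding a conditionally-needed `Copy` value states the intent more directly than a no-op `drop`. The feature name mirrors the snippet under discussion; the function names are hypothetical.

```rust
// With the cache enabled, `first_removed` would actually be consumed
// here (illustrative stub, only compiled when the feature is on).
#[cfg(feature = "stack-cache")]
fn repair_stack_cache(_first_removed: usize) {}

fn after_truncate(first_removed: usize) {
    #[cfg(feature = "stack-cache")]
    repair_stack_cache(first_removed);

    // Without the feature, the value would otherwise trip the
    // unused-variable lint; `let _unused` silences it without `drop`.
    #[cfg(not(feature = "stack-cache"))]
    let _unused = first_removed;
}

fn main() {
    after_truncate(3);
}
```

`drop` on a `Copy` type is a no-op (the original value remains usable), which is why the lint-silencing binding reads more honestly here.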

Comment on lines +15 to +27
for thread in this.machine.threads.iter() {
    if let Some(Scalar::Ptr(
        Pointer { provenance: Provenance::Concrete { sb, .. }, .. },
        _,
    )) = thread.panic_payload
    {
        tags.insert(sb);
    }
}

self.find_tags_in_tls(&mut tags);
self.find_tags_in_memory(&mut tags);
self.find_tags_in_locals(&mut tags)?;
A Member commented:
Can this be factored to use a more general "visit all the values that exist in the machine" kind of operation?

saethlin (Member, Author) replied:

That sounds like a good idea but I don't know how to implement it.
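One possible shape for such a "visit all values in the machine" operation, sketched under assumed names. This is purely illustrative, not Miri's actual API: a single trait implemented by every piece of machine state would let the mark phase be one uniform walk instead of separate find_tags_* methods.

```rust
use std::collections::HashSet;

type Tag = u64;

// Each piece of machine state reports the tags it holds; composite
// state forwards to its children.
trait VisitTags {
    fn visit_tags(&self, visit: &mut dyn FnMut(Tag));
}

// Toy machine with two flat components standing in for TLS and locals.
struct Machine {
    tls: Vec<Tag>,
    locals: Vec<Tag>,
}

impl VisitTags for Machine {
    fn visit_tags(&self, visit: &mut dyn FnMut(Tag)) {
        for &tag in self.tls.iter().chain(self.locals.iter()) {
            visit(tag);
        }
    }
}

fn main() {
    let machine = Machine { tls: vec![1, 2], locals: vec![2, 3] };
    let mut live = HashSet::new();
    machine.visit_tags(&mut |tag| {
        live.insert(tag);
    });
    assert_eq!(live.len(), 3);
}
```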

@saethlin saethlin deleted the tag-gc branch January 15, 2023 22:03
Labels: S-waiting-on-review Status: Waiting for a review to complete
Successfully merging this pull request may close: Stacked borrows analysis is super-linear (in time and space) (#1367)
4 participants