Use PackedFingerprint in DepNode to reduce memory consumption #78646

tgnottingham · 2020-11-01T23:02:42Z

No description provided.

rust-highfive · 2020-11-01T23:02:45Z

(rust_highfive has picked a reviewer for you, use r? to override)

tgnottingham · 2020-11-01T23:17:27Z

This is a second attempt. The first was #78516, and was abandoned due to the performance implications of using a [u8; 16] representation, which leads to aliasing pessimizations, at least.

This uses #[repr(packed)] instead, which I assumed would be fraught with UB and performance perils versus using [u8; 16], but turns out to be much simpler and much more performant, based on local tests.

jyn514 · 2020-11-01T23:37:11Z

@bors try @rust-timer queue

rust-timer · 2020-11-01T23:37:13Z

Awaiting bors try build completion

bors · 2020-11-01T23:37:21Z

⌛ Trying commit e3e158bfff558b96c9ba1719249153b12e999705 with merge cef34ea6b56cd5bc7d0ee0e3c916be77ef292b35...

bors · 2020-11-02T00:20:24Z

☀️ Try build successful - checks-actions
Build commit: cef34ea6b56cd5bc7d0ee0e3c916be77ef292b35 (cef34ea6b56cd5bc7d0ee0e3c916be77ef292b35)

rust-timer · 2020-11-02T00:20:26Z

Queued cef34ea6b56cd5bc7d0ee0e3c916be77ef292b35 with parent b202532, future comparison URL.

rust-timer · 2020-11-02T02:14:59Z

Finished benchmarking try commit (cef34ea6b56cd5bc7d0ee0e3c916be77ef292b35): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot modify labels: +S-waiting-on-review -S-waiting-on-perf

jyn514 · 2020-11-02T02:16:19Z

Wow, that's a big difference! Up to 10% memory gains, almost no increased instruction count :D

tgnottingham · 2020-11-02T06:13:08Z

Wow, that's a big difference! Up to 10% memory gains, almost no increased instruction count :D

Yeah! :) And I just added a relatively simple change to the dep graph code that should net a couple more percent.

This new change splits up dep node data to pack it more tightly (which is made easier by the fingerprint having lower alignment,
which is why I thought the change seemed related enough to justify including in this PR). There was prior work in this area (#57061), but going in the opposite direction (combining dep node data into one struct).

The trade off is memory usage versus spatial locality. My hope is that the locality doesn't matter that much. In normal circumstances, I think we only access a node's data twice in this structure during compilation -- once when inserting, and once when serializing. So it seems unlikely to matter.

Xanewok · 2020-11-02T09:08:17Z

Wow, that's a big difference! Up to 10% memory gains, almost no increased instruction count :D

If I understand correctly, changing this to packed will result in "unaligned" loads (it's just a byte stream but the load is not on a word boundary aiui), which themselves won't bump the instruction count but rather incur latency penalty. If we look at cycles we can see a 1-3% regression across the board.

jyn514 · 2020-11-02T22:52:42Z

This new change splits up dep node data to pack it more tightly (which is made easier by the fingerprint having lower alignment,
which is why I thought the change seemed related enough to justify including in this PR).

@bors try @rust-timer queue

rust-timer · 2020-11-02T22:52:44Z

Awaiting bors try build completion

bors · 2020-11-02T22:52:57Z

⌛ Trying commit d97d5de6606256727454669c077febc5b241bf45 with merge e6e801fdf71de4055423874e81a84b833f85e59c...

bors · 2020-11-02T23:37:47Z

☀️ Try build successful - checks-actions
Build commit: e6e801fdf71de4055423874e81a84b833f85e59c (e6e801fdf71de4055423874e81a84b833f85e59c)

rust-timer · 2020-11-02T23:37:48Z

Queued e6e801fdf71de4055423874e81a84b833f85e59c with parent 338f939, future comparison URL.

tgnottingham · 2020-11-02T23:57:34Z

If I understand correctly, changing this to packed will result in "unaligned" loads (it's just a byte stream but the load is not on a word boundary aiui), which themselves won't bump the instruction count but rather incur latency penalty. If we look at cycles we can see a 1-3% regression across the board.

And x86 is probably the best case scenario. Still, if we can mitigate the cycle increase, or if the pros outweigh the cons, it may be worth packing Fingerprints on x86 specifically.

rust-timer · 2020-11-03T01:15:13Z

Finished benchmarking try commit (e6e801fdf71de4055423874e81a84b833f85e59c): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot modify labels: +S-waiting-on-review -S-waiting-on-perf

tgnottingham · 2020-11-03T07:54:08Z

The additional memory savings from the second change get lost in the variance in the benchmarks. Profiling locally verifies the modified structures to be smaller. I don't think the second change is worth the added complexity, at least while there are bigger improvements to be had elsewhere.

The max-rss measurement for keccak has very high variance in my experience, so you might take it with a grain of salt.

The cycles look a bit better, but I'm guessing that's also attributable to variance. May have gotten unlucky the first time, or lucky the second time. Nevertheless, I'll experiment with some other approaches that may perform better.

nnethercote · 2020-11-18T04:27:12Z

Sorry for the slow response, I am on leave and didn't see this until just now.

The perf results look good, but I am worried about the UB mentioned here. Any thoughts on that?

tgnottingham · 2020-11-18T05:26:02Z

Yes, that warning concerned me as well. I know that the unaligned_references and safe_packed_borrows lints detect when you take references to packed fields (including inside #[derive]'d trait implementations), and I believe that is the only UB concern. Those lints aren't firing, and they were before I made the required changes, so that's some assurance. That said, perhaps they're not complete. I'll ask around for some additional assurance.

nnethercote · 2020-11-18T08:45:00Z

Ok, r=me once you've finished asking around.

tgnottingham · 2020-11-18T22:44:49Z

I've made the Fingerprint field of the PackedFingerprint private to further assuage fears that a client could take a reference to the packed field without the lints catching it. The PackedFingerprint implementation itself should be safe, as care is taken to avoid taking such references, and as I understand it, the #[derive]'d trait implementations do the same when the type is Copy.

Outside of this PR, I'm working on some changes that could make this one more or less impactful, so I'll revisit whether packing continues to be worthwhile if those changes get integrated. I know packing isn't a thing to be taken lightly, but for now, it looks like a pretty clear win.

nnethercote · 2020-11-19T22:28:37Z

Outside of this PR, I'm working on some changes that could make this one more or less impactful, so I'll revisit whether packing continues to be worthwhile if those changes get integrated.

Do you still want to land this now?

tgnottingham · 2020-11-20T00:53:28Z

Yes--it may be a while before I can finish those other changes.

@RalfJung, I know you've been involved with work and discussions around references to packed fields and UB. Do you have any concerns about landing this PR?

RalfJung · 2020-11-20T08:24:19Z

I recommend if you use packed fields you should set the lint added by #72270 to deny in the relevant crates:

#![deny(unaligned_references)]

RalfJung · 2020-11-20T08:26:15Z

Ah, looks like you already knew that.^^ But the lint does not seem to be enabled in rustc.

To my knowledge, if the packed fields are private and the crate defining the struct has that lint enabled, that should guarantee soundness.

To detect misuse of private packed field in `PackedFingerprint`.

tgnottingham · 2020-11-20T09:18:46Z

Thanks! Added the #[deny].

nnethercote · 2020-11-20T09:30:43Z

@bors r+

bors · 2020-11-20T09:30:45Z

📌 Commit 142932a has been approved by nnethercote

bors · 2020-11-20T11:11:24Z

⌛ Testing commit 142932a with merge ae6aa22...

bors · 2020-11-20T13:42:38Z

☀️ Test successful - checks-actions
Approved by: nnethercote
Pushing ae6aa22 to master...

tgnottingham · 2021-01-21T20:29:56Z

Outside of this PR, I'm working on some changes that could make this one more or less impactful, so I'll revisit whether packing continues to be worthwhile if those changes get integrated.

Those changes have landed in #79589 and #80957, and I've revisted the perf effect of reverting this change in #81230. I think it continues to carry it's weight.

rust-highfive assigned eddyb Nov 1, 2020

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 1, 2020

jyn514 added I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 1, 2020

tgnottingham mentioned this pull request Nov 1, 2020

Access query (DepKind) metadata through fields #78452

Merged

camelid added I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. and removed I-compiletime Issue: Problems and improvements with respect to compile times. labels Nov 2, 2020

tgnottingham force-pushed the packed_fingerprints branch from 9188c33 to d97d5de Compare November 2, 2020 06:26

tgnottingham force-pushed the packed_fingerprints branch from d97d5de to 5e5137d Compare November 4, 2020 06:36

rust-highfive assigned nnethercote Nov 4, 2020

Use PackedFingerprint in DepNode to reduce memory consumption

f09d474

tgnottingham force-pushed the packed_fingerprints branch from 601c942 to 08c891a Compare November 18, 2020 22:34

Make PackedFingerprint's Fingerprint private

05dde13

tgnottingham force-pushed the packed_fingerprints branch 2 times, most recently from 08c891a to a0eaf27 Compare November 18, 2020 23:23

Set unaligned_references lint to deny in rustc_data_structures

142932a

To detect misuse of private packed field in `PackedFingerprint`.

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 20, 2020

bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 20, 2020

bors merged commit ae6aa22 into rust-lang:master Nov 20, 2020

rustbot added this to the 1.50.0 milestone Nov 20, 2020

tgnottingham deleted the packed_fingerprints branch January 20, 2021 18:32

tgnottingham mentioned this pull request Jan 20, 2021

[do not merge] Remove PackedFingerprint #81230

Closed

Use PackedFingerprint in DepNode to reduce memory consumption #78646

Use PackedFingerprint in DepNode to reduce memory consumption #78646

Uh oh!

Conversation

tgnottingham commented Nov 1, 2020

Uh oh!

rust-highfive commented Nov 1, 2020

Uh oh!

tgnottingham commented Nov 1, 2020

Uh oh!

jyn514 commented Nov 1, 2020

Uh oh!

rust-timer commented Nov 1, 2020

Uh oh!

bors commented Nov 1, 2020

Uh oh!

bors commented Nov 2, 2020

Uh oh!

rust-timer commented Nov 2, 2020

Uh oh!

rust-timer commented Nov 2, 2020

Uh oh!

jyn514 commented Nov 2, 2020

Uh oh!

tgnottingham commented Nov 2, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Xanewok commented Nov 2, 2020

Uh oh!

jyn514 commented Nov 2, 2020

Uh oh!

rust-timer commented Nov 2, 2020

Uh oh!

bors commented Nov 2, 2020

Uh oh!

bors commented Nov 2, 2020

Uh oh!

rust-timer commented Nov 2, 2020

Uh oh!

tgnottingham commented Nov 2, 2020

Uh oh!

rust-timer commented Nov 3, 2020

Uh oh!

tgnottingham commented Nov 3, 2020

Uh oh!

nnethercote commented Nov 18, 2020

Uh oh!

tgnottingham commented Nov 18, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nnethercote commented Nov 18, 2020

Uh oh!

tgnottingham commented Nov 18, 2020

Uh oh!

nnethercote commented Nov 19, 2020

Uh oh!

tgnottingham commented Nov 20, 2020

Uh oh!

RalfJung commented Nov 20, 2020

Uh oh!

RalfJung commented Nov 20, 2020

Uh oh!

tgnottingham commented Nov 20, 2020

Uh oh!

nnethercote commented Nov 20, 2020

Uh oh!

bors commented Nov 20, 2020

Uh oh!

bors commented Nov 20, 2020

Uh oh!

bors commented Nov 20, 2020

Uh oh!

tgnottingham commented Jan 21, 2021

Uh oh!

Uh oh!

tgnottingham commented Nov 2, 2020 •

edited

Loading

tgnottingham commented Nov 18, 2020 •

edited

Loading