Compress most of spans to 32 bits #44646

petrochenkov · 2017-09-17T04:18:03Z

As described in https://internals.rust-lang.org/t/rfc-compiler-refactoring-spans/1357/28

petrochenkov · 2017-09-17T04:21:35Z

TODO: Measure difference in actual memory consumption, especially on large crates.
82.68% of spans are "inlined" in the compiler and standard library crates, but that's an indirect measure and it needs to be verified on something else.

~~TODO2: Remove accidental RLS update~~

petrochenkov · 2017-09-17T04:24:37Z

src/librustc/util/common.rs

@@ -61,7 +60,7 @@ pub enum ProfileQueriesMsg {
    /// end a task
    TaskEnd,
    /// begin a new query
-    QueryBegin(Span, QueryMsg),
+    QueryBegin(QueryMsg),


Span is not Send/Sync if it uses a thread-local interner, and profile queries are sent to other threads, so I had to temporarily remove spans from them.
I'll restore this later; queries will have to use SpanData instead of Span (I hoped to keep it private, but it looks like there's no better choice :( ).
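
A minimal sketch of the idea above, not the code from this PR: a compact span that is only an index into a thread-local table cannot safely cross threads, so it is decoded into plain data before being sent. The `TABLE` static, the `ProfileMsg` enum, the `send_query_begin` helper, and the `PhantomData<*const ()>` marker are all assumptions made for illustration.

```rust
use std::cell::RefCell;
use std::marker::PhantomData;
use std::sync::mpsc;
use std::thread;

/// Plain decoded span. It holds only integers, so it is Send + Sync.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SpanData {
    pub lo: u32,
    pub hi: u32,
    pub ctxt: u32,
}

thread_local! {
    // Hypothetical per-thread table standing in for a span interner.
    static TABLE: RefCell<Vec<SpanData>> = RefCell::new(Vec::new());
}

/// Index into the current thread's table. The raw-pointer marker makes the
/// type !Send and !Sync, because the index means nothing on another thread.
#[derive(Clone, Copy)]
pub struct Span {
    index: u32,
    _not_send: PhantomData<*const ()>,
}

impl Span {
    pub fn new(lo: u32, hi: u32, ctxt: u32) -> Span {
        let index = TABLE.with(|t| {
            let mut t = t.borrow_mut();
            t.push(SpanData { lo, hi, ctxt });
            (t.len() - 1) as u32
        });
        Span { index, _not_send: PhantomData }
    }

    /// Decode into plain data using the current thread's table.
    pub fn data(self) -> SpanData {
        TABLE.with(|t| t.borrow()[self.index as usize])
    }
}

/// Messages that cross threads carry SpanData, never Span.
pub enum ProfileMsg {
    QueryBegin(SpanData, &'static str),
    TaskEnd,
}

/// Sends a span to another thread and returns what that thread received.
pub fn send_query_begin(span: Span) -> SpanData {
    let (tx, rx) = mpsc::channel();
    // Decode on the sending thread, where the table is valid.
    tx.send(ProfileMsg::QueryBegin(span.data(), "typeck")).unwrap();
    tx.send(ProfileMsg::TaskEnd).unwrap();
    let receiver = thread::spawn(move || match rx.recv().unwrap() {
        ProfileMsg::QueryBegin(data, _) => data,
        ProfileMsg::TaskEnd => panic!("expected QueryBegin first"),
    });
    receiver.join().unwrap()
}
```

Trying to put `Span` itself into `ProfileMsg` would fail to compile at `thread::spawn`, which is the kind of error that forces the switch to `SpanData` here.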

petrochenkov · 2017-09-17T04:27:16Z

src/libsyntax_pos/span_encoding.rs

+        }
+        _ => unreachable!()
+    };
+    SpanData { lo: BytePos(base), hi: BytePos(base + len), ctxt: SyntaxContext(ctxt) }


If encoding/decoding with a 2-bit tag looks too error-prone/unmaintainable (or turns out to be too slow), then I can redo this with a 1-bit tag instead; it will reduce the percentage of inlined spans from 82.68% to 80.01% (on rustc/libstd data).
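
For readers who have not seen the scheme, here is a rough sketch of what a 1-bit-tag encoding could look like. The bit widths (24 bits of base, 7 bits of length), the `Interner` struct, and the requirement that inline spans have `ctxt == 0` are assumptions for illustration, not a copy of `span_encoding.rs`.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SpanData {
    pub lo: u32,
    pub hi: u32,
    pub ctxt: u32,
}

const TAG_INLINE: u32 = 0;
const TAG_INTERNED: u32 = 1;
const TAG_MASK: u32 = 1;

// Assumed inline layout: | base: 24 bits | len: 7 bits | tag: 1 bit |
const BASE_BITS: u32 = 24;
const BASE_OFFSET: u32 = 8;
const LEN_BITS: u32 = 7;
const LEN_OFFSET: u32 = 1;

/// Hypothetical interner for spans that do not fit the inline layout.
#[derive(Default)]
pub struct Interner {
    indices: HashMap<SpanData, u32>,
    spans: Vec<SpanData>,
}

impl Interner {
    fn intern(&mut self, data: SpanData) -> u32 {
        if let Some(&index) = self.indices.get(&data) {
            return index;
        }
        let index = self.spans.len() as u32;
        self.spans.push(data);
        self.indices.insert(data, index);
        index
    }
}

pub fn encode(data: SpanData, interner: &mut Interner) -> u32 {
    let len = data.hi.wrapping_sub(data.lo);
    let fits_inline = data.ctxt == 0
        && data.lo <= data.hi
        && data.lo < (1 << BASE_BITS)
        && len < (1 << LEN_BITS);
    if fits_inline {
        // Small span: everything is stored directly in the 32 bits.
        (data.lo << BASE_OFFSET) | (len << LEN_OFFSET) | TAG_INLINE
    } else {
        // Large span: store only an index into the interner.
        let index = interner.intern(data);
        assert!(index < (1 << 31), "interner index does not fit in 31 bits");
        (index << 1) | TAG_INTERNED
    }
}

pub fn decode(raw: u32, interner: &Interner) -> SpanData {
    if raw & TAG_MASK == TAG_INLINE {
        let base = raw >> BASE_OFFSET;
        let len = (raw >> LEN_OFFSET) & ((1 << LEN_BITS) - 1);
        SpanData { lo: base, hi: base + len, ctxt: 0 }
    } else {
        interner.spans[(raw >> 1) as usize]
    }
}
```

With a single tag bit there is only one inline format to get right, which is the maintainability argument above; the cost is that fewer spans fit inline.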

petrochenkov · 2017-09-17T04:30:07Z

src/libsyntax_pos/lib.rs

 }

-#[allow(deprecated)]
-pub const DUMMY_SP: Span = Span { lo: BytePos(0), hi: BytePos(0), ctxt: NO_EXPANSION };
+#[derive(Clone, Copy, PartialEq, Eq, Hash)]


@michaelwoerister
I'm not sure if Span can derive Hash or not (for incremental, etc).
Is hashing the 32-bit "index" enough for interned spans, or actual data must be hashed?

Incremental uses HashStable anyway.

i.e. this impl:

rust/src/librustc/ich/hcx.rs

Line 234 in ef227f5

impl<'a, 'gcx, 'tcx> HashStable<StableHashingContext<'a, 'gcx, 'tcx>> for Span {

Yes, as @arielb1 says. We keep hashing for incr. comp. separate because it's often a lot more expensive than what we need for a regular hash table, and it often requires additional contextual information. So you don't have to worry about it.

You might want to provide a specialized implementation that hashes just one 32-bit value instead of four bytes, though (unless we do that anyway). But it's probably not worth the trouble.

arielb1 · 2017-09-17T10:35:42Z

Why are you using a [u8; 4]? Is that to make it dealigned and allow slipping through smaller holes? Can't you use a #[repr(packed)] u32 and gain safety?

petrochenkov · 2017-09-17T11:56:39Z

@arielb1

Why are you using a [u8; 4]? Is that to make it dealigned and allow slipping through smaller holes?

Yes.

Can't you use a #[repr(packed)] u32 and gain safety?

Are reads from packed structs guaranteed to perform unaligned loads correctly?
Does #27060 happen only if a reference is taken, passed somewhere and then dereferenced?
I wasn't sure, so I avoided packed. If packed works, I'd certainly prefer it to the union trick.

Mark-Simulacrum · 2017-09-17T13:37:01Z

We can run this through perf.rlo's benchmarks when ready, just run a try build and ping me on completion.

arielb1 · 2017-09-17T14:03:18Z

Are reads from packed structs guaranteed to perform unaligned loads correctly?
Does #27060 happen only if a reference is taken, passed somewhere and then dereferenced?

That's right - otherwise packed structs would be quite useless.

And #27060 (which I'm working on fixing right now) only means that some code that should be unsafe is allowed, so if you break it in any way that should be easy to fix.
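
To make the two options concrete, here is a small sketch comparing a byte-array wrapper with a `#[repr(packed)]` wrapper. The type names and the `Node` structs are hypothetical, and the byte-array version uses `to_ne_bytes`/`from_ne_bytes` rather than the union trick mentioned above.

```rust
use std::mem::{align_of, size_of};

/// Byte-array representation: alignment 1, but the value has to be
/// reassembled from bytes on every read.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub struct SpanBytes([u8; 4]);

impl SpanBytes {
    pub fn new(raw: u32) -> Self {
        SpanBytes(raw.to_ne_bytes())
    }
    pub fn raw(self) -> u32 {
        u32::from_ne_bytes(self.0)
    }
}

/// Packed representation: same size and alignment, no manual byte handling.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
#[repr(packed)]
pub struct SpanPacked(u32);

impl SpanPacked {
    pub fn new(raw: u32) -> Self {
        SpanPacked(raw)
    }
    /// Copying the field out by value is a safe unaligned load;
    /// only creating a reference to the field is problematic.
    pub fn raw(self) -> u32 {
        self.0
    }
}

/// With alignment 1 the span fits between two bytes without padding.
#[repr(C)]
pub struct NodePacked {
    kind: u8,
    span: SpanPacked,
    flag: u8,
}

/// With a naturally aligned u32 the same fields need padding.
#[repr(C)]
pub struct NodeAligned {
    kind: u8,
    span: u32,
    flag: u8,
}

pub fn layout_report() -> [(usize, usize); 4] {
    [
        (size_of::<SpanBytes>(), align_of::<SpanBytes>()),
        (size_of::<SpanPacked>(), align_of::<SpanPacked>()),
        (size_of::<NodePacked>(), align_of::<NodePacked>()),
        (size_of::<NodeAligned>(), align_of::<NodeAligned>()),
    ]
}
```

`#[repr(C)]` is used on the two node structs only so that the field order, and therefore the padding, is deterministic for the comparison.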

bors · 2017-09-17T16:32:03Z

☔ The latest upstream changes (presumably #44654) made this pull request unmergeable. Please resolve the merge conflicts.

michaelwoerister · 2017-09-18T16:10:14Z

This looks awesome, @petrochenkov!
Like @arielb1, I'd also be interested in the performance impact.

petrochenkov · 2017-09-18T23:55:56Z

Updated.
@bors try (for performance testing)

bors · 2017-09-18T23:56:05Z

⌛ Trying commit f069c88 with merge 1125280...

@michaelwoerister

Compress most of spans to 32 bits As described in https://internals.rust-lang.org/t/rfc-compiler-refactoring-spans/1357/28 Closes #15594 r? @michaelwoerister

bors · 2017-09-19T02:16:08Z

☀️ Test successful - status-travis
State: approved= try=True

michaelwoerister · 2017-09-19T09:08:19Z

@arielb1, I tried typing the merge commit hashes into perf.rlo compare but that didn't seem to work for me. Did you have a different method for performance comparison in mind?

petrochenkov · 2017-09-19T10:02:37Z

We can run this through perf.rlo's benchmarks when ready, just run a try build and ping me on completion.

ping @Mark-Simulacrum

Mark-Simulacrum · 2017-09-19T12:20:15Z

Results: http://perf.rust-lang.org/compare.html?commit_a=0701b37d97d08da7074ece7a7dcb4449498f4bfa&commit_b=11252805f7b1f43e0ae673ab1ed1757d801222ef&stat=instructions%3Au.

llogiq · 2017-09-19T17:19:51Z

I notice that the interner uses a default HashMap. Perhaps we can further speed it up using an optimized hash, given that the SpanData to hash is really small?

petrochenkov · 2017-09-20T00:13:50Z

I looked through the performance data a bit.

What we try to optimize is memory, i.e. max-rss.
The situation looks pretty good, tests use less memory, mostly.
Three regressions are regex-0.1.80@020-incr-from-scr..., regex-0.1.80@080-SparseSet, syntex-0.42.2@000-base. I'm not sure how to interpret this: on one hand, regex and syntex are large crates, so they can hit the limit for lo and start actively using the interner; on the other hand, many other tests using regex and syntex don't show regressions.

On the other hand, there are more regressions in speed. I don't know exactly what the tests measure (e.g. how cpu-clock and cycles-u differ), but the speed problem can be mitigated by 1) using a 1-bit tag, which makes encoding/decoding much simpler, 2) using a faster hash function in the interner, as @llogiq suggested, 3) using span.data() + span_data.lo + span_data.hi more actively when you need e.g. span.lo() and span.hi() in the same place.
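
Point 3 can be illustrated with a toy span type. The encoding below (24 bits of `lo`, 8 bits of length) and the two `len_*` helpers are hypothetical; the point is only that each accessor decodes the whole span, so calling `data()` once does the work once.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SpanData {
    pub lo: u32,
    pub hi: u32,
}

/// Hypothetical compact span: `lo` in the upper 24 bits, length in the lower 8.
#[derive(Clone, Copy)]
pub struct Span(u32);

impl Span {
    pub fn new(lo: u32, hi: u32) -> Span {
        assert!(lo <= hi && lo < (1 << 24) && hi - lo < (1 << 8));
        Span((lo << 8) | (hi - lo))
    }

    /// Full decode. In a scheme with an interner this may involve a lookup.
    #[inline]
    pub fn data(self) -> SpanData {
        let lo = self.0 >> 8;
        SpanData { lo, hi: lo + (self.0 & 0xff) }
    }

    /// Convenience accessors: each one performs a full decode.
    #[inline]
    pub fn lo(self) -> u32 {
        self.data().lo
    }

    #[inline]
    pub fn hi(self) -> u32 {
        self.data().hi
    }
}

/// Decodes the span twice, once per accessor.
pub fn len_decoding_twice(span: Span) -> u32 {
    span.hi() - span.lo()
}

/// Decodes the span once and reads both fields from the result.
pub fn len_decoding_once(span: Span) -> u32 {
    let data = span.data();
    data.hi - data.lo
}
```

For a purely inline encoding like this toy one the optimizer may well merge the two decodes; the difference should matter more when decoding goes through an interner lookup.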

michaelwoerister · 2017-09-20T09:06:56Z

Many of the tests showing regressions are ones with incremental compilation activated. Incremental compilation has to expand all spans to file:line:col during change detection, so it will do more decoding than other cases. That might explain why these tests show up. That being said, in the future we will probably change how exactly spans are treated by change detection, so I don't think we should give too much weight to the incr. comp. tests right now.

Great find, @llogiq! This should definitely use rustc_data_structures::fx::FxHashMap.
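
Outside the compiler, `rustc_data_structures::fx::FxHashMap` is not directly available, so the following is only a sketch of the general idea: swap the default hasher of `HashMap` for a cheap multiply-rotate hasher when keys are a few small integers. `SimpleFxHasher`, `FastMap`, and `Interner` are made-up names, and the hasher is a simplified stand-in, not the rustc implementation.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SpanData {
    pub lo: u32,
    pub hi: u32,
    pub ctxt: u32,
}

/// Simplified multiply-rotate hasher (an assumption for illustration).
#[derive(Default)]
pub struct SimpleFxHasher {
    hash: u64,
}

const SEED: u64 = 0x517c_c1b7_2722_0a95;

impl SimpleFxHasher {
    fn add(&mut self, word: u64) {
        self.hash = (self.hash.rotate_left(5) ^ word).wrapping_mul(SEED);
    }
}

impl Hasher for SimpleFxHasher {
    fn write(&mut self, bytes: &[u8]) {
        // Fallback path: feed one byte at a time.
        for &byte in bytes {
            self.add(byte as u64);
        }
    }

    fn write_u32(&mut self, value: u32) {
        // Fast path used by the derived Hash impl for u32 fields.
        self.add(value as u64);
    }

    fn finish(&self) -> u64 {
        self.hash
    }
}

pub type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<SimpleFxHasher>>;

/// Hypothetical span interner backed by the faster map.
#[derive(Default)]
pub struct Interner {
    indices: FastMap<SpanData, u32>,
    spans: Vec<SpanData>,
}

impl Interner {
    pub fn intern(&mut self, data: SpanData) -> u32 {
        if let Some(&index) = self.indices.get(&data) {
            return index;
        }
        let index = self.spans.len() as u32;
        self.spans.push(data);
        self.indices.insert(data, index);
        index
    }

    pub fn get(&self, index: u32) -> SpanData {
        self.spans[index as usize]
    }
}
```

The trade-off is the usual one: a hasher like this is not resistant to collision attacks, which is acceptable for compiler-internal keys but not for untrusted input.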

petrochenkov · 2017-09-20T14:12:35Z

@bors try
Let's try 1-bit tag before doing other optimizations.

bors · 2017-09-20T14:12:39Z

🔒 Merge conflict

petrochenkov · 2017-09-20T14:30:25Z

@bors try

bors · 2017-09-20T14:30:35Z

⌛ Trying commit cb1158f with merge 39d2aa2...

@michaelwoerister

Compress most of spans to 32 bits As described in https://internals.rust-lang.org/t/rfc-compiler-refactoring-spans/1357/28 Closes #15594 r? @michaelwoerister

Mark-Simulacrum · 2017-09-22T01:05:55Z

http://perf.rust-lang.org/compare.html?commit_a=17f56c549c35bb2cb316e5abff116e65277c7bb1&commit_b=360ae5f72680a37f487da2cde21947ec7f16d99b&stat=instructions%3Au

petrochenkov · 2017-09-22T09:32:09Z

@michaelwoerister
Updated table

| setup | max-rss (average) | cpu-clock (average) |
|---|---|---|
| 32-bit span, 2-bit tag | -0.73% | +0.16% |
| 32-bit span, 1-bit tag | -0.70% | -1.05% |
| 64-bit span, 1-bit tag | -0.66% | -0.77% |

"32-bit span, 1-bit tag" still seems to be better

michaelwoerister · 2017-09-22T10:15:04Z

Alright, thanks for checking! 32 bits with 1 bit tag seems to be a good choice indeed.

Maybe encode() and decode() should be marked as #[inline] just so their body is available in downstream crates.

If you make the interner use an FxHashMap this is good to go :)

tamird · 2017-09-22T10:54:35Z

src/libsyntax_pos/span_encoding.rs

+// option. This file may not be copied, modified, or distributed
+// except according to those terms.
+
+// Spans are encoded using 2-bit tag and 4 different encoding formats for each tag.


this comment has rotted.

petrochenkov · 2017-09-22T21:42:52Z

@michaelwoerister
Updated.

michaelwoerister · 2017-09-25T08:25:24Z

@bors r+

Thanks, @petrochenkov! Looking forward to seeing the results :)

bors · 2017-09-25T08:25:25Z

📌 Commit 52251cd has been approved by michaelwoerister

bors · 2017-09-25T09:02:00Z

⌛ Testing commit 52251cd with merge dcb4378...

@michaelwoerister

Compress most of spans to 32 bits As described in https://internals.rust-lang.org/t/rfc-compiler-refactoring-spans/1357/28 Closes #15594 r? @michaelwoerister

bors · 2017-09-25T12:51:58Z

☀️ Test successful - status-appveyor, status-travis
Approved by: michaelwoerister
Pushing dcb4378 to master...

alexcrichton · 2017-09-27T04:05:04Z

Looks like this may have improved the memory of the tuple-stress benchmark by 5%!

@michaelwoerister

Optimize some span operations Do not decode span data twice/thrice/etc unnecessarily. Applied to stable hashing and all methods in `impl Span`. Follow up to rust-lang#44646 r? @michaelwoerister

Due to the limitation that #[derive(...)] on #[repr(packed)] structs does not guarantee proper alignment in the compiler-generated impls (rust-lang#39696), the change in rust-lang#44646 to compress Spans results in the compiler generating code with unaligned access. Until rust-lang#39696 has been fixed, the issue can be worked around by not using the packed attribute on the Span struct on sparc64 and sparcv9. Fixes: rust-lang#45509
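
One way such a workaround could be expressed is with `cfg_attr`, so that the packed attribute is applied on every target except the affected one. This is a sketch under assumptions: it is not taken from the referenced pull request, it only handles the `sparc64` target architecture, and `span_layout` is a made-up helper.

```rust
use std::mem::{align_of, size_of};

/// Packed (alignment 1) everywhere except sparc64, where the struct keeps
/// the natural alignment of u32 and so never needs unaligned access.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
#[cfg_attr(not(target_arch = "sparc64"), repr(packed))]
pub struct Span(u32);

impl Span {
    pub fn new(raw: u32) -> Span {
        Span(raw)
    }

    pub fn raw(self) -> u32 {
        self.0
    }
}

/// Returns (size, alignment) of Span on the current target.
pub fn span_layout() -> (usize, usize) {
    (size_of::<Span>(), align_of::<Span>())
}
```

The size stays at 4 bytes either way; only the alignment differs, so on the excluded target structs containing a `Span` may grow by some padding.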

rust-highfive assigned michaelwoerister Sep 17, 2017

petrochenkov changed the title ~~Compress spans~~ Compress most of spans to 32 bits Sep 17, 2017

carols10cents added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Sep 18, 2017

petrochenkov force-pushed the scompress branch from a45b07a to f069c88 Compare September 18, 2017 23:54

bors added a commit that referenced this pull request Sep 18, 2017

Auto merge of #44646 - petrochenkov:scompress, r=<try>

1125280

Compress most of spans to 32 bits As described in https://internals.rust-lang.org/t/rfc-compiler-refactoring-spans/1357/28 Closes #15594 r? @michaelwoerister

petrochenkov force-pushed the scompress branch from e723bf6 to cb1158f Compare September 20, 2017 14:30

bors added a commit that referenced this pull request Sep 20, 2017

Auto merge of #44646 - petrochenkov:scompress, r=<try>

39d2aa2

Compress most of spans to 32 bits As described in https://internals.rust-lang.org/t/rfc-compiler-refactoring-spans/1357/28 Closes #15594 r? @michaelwoerister

tamird reviewed Sep 22, 2017

Compress "small" spans to 32 bits and intern "large" spans

52251cd

petrochenkov force-pushed the scompress branch from 74f4271 to 52251cd Compare September 22, 2017 21:41

petrochenkov added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 23, 2017

bors merged commit 52251cd into rust-lang:master Sep 25, 2017

karcherm mentioned this pull request Oct 24, 2017

rustc broken on sparc64 #45509

Closed

petrochenkov mentioned this pull request Oct 28, 2017

Optimize some span operations #45602

Merged

glaubitz mentioned this pull request Nov 1, 2017

libsyntax_pos: Don't use packed attribute for Span on sparc64/v9 #45679

Closed

nzig mentioned this pull request Nov 3, 2017

Simplify Span by explicitly not supporting ctxt #45747

Closed

petrochenkov mentioned this pull request Jan 3, 2018

RFC: libsyntax2.0 rust-lang/rfcs#2256

Closed

petrochenkov mentioned this pull request Feb 14, 2019

Tweak Span encoding. #58458

Merged

petrochenkov mentioned this pull request Apr 4, 2019

Increase Span from 4 bytes to 8 bytes. #59693

Merged

petrochenkov deleted the scompress branch June 5, 2019 15:53

petrochenkov mentioned this pull request Aug 31, 2021

Encode spans relative to the enclosing item #84373

Merged
