Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a more efficient encoding for opaque data in RBML. #30565

Merged
merged 1 commit into from
Dec 31, 2015

Conversation

michaelwoerister
Copy link
Member

This PR changes the emit_opaque and read_opaque methods in the RBML library to use a space-efficient binary encoder that does not emit any tags and uses the LEB128 variable-length integer format for all numbers it emits.

The space savings are nice, albeit a bit underwhelming, especially for dynamic libraries where metadata is already compressed.

RLIBs NEW OLD
libstd 8.8 MB 10.5 MB
libcore 15.6 MB 19.7 MB
libcollections 3.7 MB 4.8 MB
librustc 34.0 MB 37.8 MB
libsyntax 28.3 MB 32.1 MB
SOs NEW OLD
libstd 4.8 MB 5.1 MB
librustc 8.6 MB 9.2 MB
libsyntax 7.8 MB 8.4 MB

At least this should make up for the size increase caused recently by also storing MIR in crate metadata.

Can this be a breaking change for anyone?
cc @rust-lang/compiler

@rust-highfive
Copy link
Collaborator

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@michaelwoerister
Copy link
Member Author

and uses the LEB128 variable-length integer format for all numbers it emits

Which it by the way shouldn't do for 1-byte and floating point numbers... I'll change that in a separate PR some time.

@michaelwoerister
Copy link
Member Author

cc @alexcrichton @brson who might be interested in things concerning the binary size of our libraries.

@eddyb
Copy link
Member

eddyb commented Dec 26, 2015

@arielb1 Worked on metadata, and I remember something about switching to a fixed-sized integer model (for decoding speed? not sure).

@michaelwoerister
Copy link
Member Author

@eddyb I did some non-rigorous performance comparisons, compiling libsyntax a few times with and without the changes in this PR. I found no observable difference between the two. That being said, once the support for an alternate encoding is in place, it would be easy to change how numbers and other primitive types are emitted and we could take a look if fixed size integer would make things faster (although I doubt that it makes much of a difference for overall compilation time).

@arielb1
Copy link
Contributor

arielb1 commented Dec 27, 2015

I don't see any problems with the idea itself.

@arielb1
Copy link
Contributor

arielb1 commented Dec 27, 2015

We probably don't need 2 separate vuint encoding schemes (rbml's vuint and yours)

@arielb1
Copy link
Contributor

arielb1 commented Dec 27, 2015

@michaelwoerister

Incidentally, the primary worry is not the cost of writing metadata but that of reading it - which is already 2% of compilation time with the old code according to some old measurements I have made. For that, librustc_driver is perhaps the gold example of a small crate that inlines lots of stuff.

@bors
Copy link
Contributor

bors commented Dec 28, 2015

☔ The latest upstream changes (presumably #30582) made this pull request unmergeable. Please resolve the merge conflicts.

@michaelwoerister
Copy link
Member Author

We probably don't need 2 separate vuint encoding schemes (rbml's vuint and yours)

In principle I agree. RBML vuint can only handle unsigned values though and I don't know if we want to break away from EBML's prescribed encodings for tags and sizes.

@michaelwoerister
Copy link
Member Author

Here are some timings from compiling librustc_driver:

current implementation:

216.59 user 
0.61 system 
3:37.89 elapsed 

221.10 user
0.91 system
3:42.65 elapsed

229.48 user
1.22 system
3:51.36 elapsed


with LEB128 integers:

228.41 user
2.62 system
3:52.28 elapsed

226.58 user
1.36 system
3:48.60 elapsed

219.04 user
2.55 system
3:42.24 elapsed


with fixed-sized integers

243.72 user
0.80 system
4:05.25 elapsed

241.35 user
0.76 system
4:02.82 elapsed

216.21 user
0.76 system
3:37.57 elapsed

225.14 user
0.64 system
3:46.44 elapsed

228.50 user
0.64 system
3:49.82 elapsed

I don't see much of a difference between any of those.

@michaelwoerister
Copy link
Member Author

... rebased

@brson
Copy link
Contributor

brson commented Dec 28, 2015

Looks good.

@brson
Copy link
Contributor

brson commented Dec 29, 2015

@bors r+

@bors
Copy link
Contributor

bors commented Dec 29, 2015

📌 Commit fa2a741 has been approved by brson

@bors
Copy link
Contributor

bors commented Dec 30, 2015

⌛ Testing commit fa2a741 with merge 23a80cc...

@bors
Copy link
Contributor

bors commented Dec 30, 2015

👀 Test was successful, but fast-forwarding failed: 422 Reference already exists

nagisa added a commit to nagisa/rust that referenced this pull request Dec 31, 2015
…brson

This PR changes the `emit_opaque` and `read_opaque` methods in the RBML library to use a space-efficient binary encoder that does not emit any tags and uses the LEB128 variable-length integer format for all numbers it emits.

The space savings are nice, albeit a bit underwhelming, especially for dynamic libraries where metadata is already compressed.

| RLIBs        |  NEW   |   OLD     |
|--------------|--------|-----------|
|libstd        | 8.8 MB |  10.5 MB  |
|libcore       |15.6 MB |   19.7 MB |
|libcollections| 3.7 MB |    4.8 MB |
|librustc      |34.0 MB |   37.8 MB |
|libsyntax     |28.3 MB |   32.1 MB |

| SOs           |     NEW   |    OLD |
|---------------|-----------|--------|
| libstd        |  4.8 MB   | 5.1 MB |
| librustc      |  8.6 MB   | 9.2 MB |
| libsyntax     |  7.8 MB   | 8.4 MB |

At least this should make up for the size increase caused recently by also storing MIR in crate metadata.

Can this be a breaking change for anyone?
cc @rust-lang/compiler
bors added a commit that referenced this pull request Dec 31, 2015
@bors bors merged commit fa2a741 into rust-lang:master Dec 31, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants