-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Binary size blowup presumably because of allocator integration #42808
Comments
The blowup of binary size is the minor difference. The main difference is in compilation times and run-times of that resulting binary, that are quite worse. |
@alexbool can you enable us to reproduce this? I can investigate in parallel but bug reports tend to be much more readily fixable if there's a way to reproduce what caused the bug report! |
Looks like this isn't a regression for "hello world", or likely just "in the noise"
|
I'll try to make a minimal reproducible example today or later this week |
@alexbool any updates on this? @leonardo-m or @hinaria do y'all have examples of this causing a size increase? |
Unfortunately didn't have much time to dig into this. Will try as soon as I have possibility. |
Here is a test case that may serve for exhibiting the space increase that @alexbool observed around this time. (Or at least, on my Mac, building without optimizations, I am observing a size increase of about 1.08x on the stripped binaries. With optimizations, the size increase is about 1.05x) // vecel.rs
use std::env;
fn main() {
let mut args = env::args();
args.next();
let arg1 = args.next().unwrap();
let num: usize = arg1.parse().unwrap();
let mut v = Vec::new();
for i in 0..num {
v.push(i);
}
assert_eq!(v.len(), num);
} Build script:
Results:
|
My previous comment notwithstanding, it would be really great if @alexbool could provide us either with their original test case or a somewhat reduced version of it, since it is entirely possible that my (strawman) object size microbenchmark does not actually reflect the real issues that are underpinning the size increase that @alexbool is observing. |
@alexcrichton, a little increase in binary size is not important to me. But I am seeing significant increases in the run-time of the binary. I've filed a different issue but the cause could be the same: |
This pathological case is producing ~8.5% increase in release mode between 2017-06-19 and 2017-06-23: #[macro_use]
extern crate serde;
extern crate serde_yaml;
use std::fs::OpenOptions;
#[derive(Debug, Deserialize)]
pub struct Something {
field1: Vec<S1>,
field2: Vec<S2>,
field3: Vec<S3>,
field4: Vec<S4>,
field5: Vec<S5>,
field6: Vec<S6>,
field7: Vec<S7>,
field8: Vec<S8>,
field9: Vec<S9>,
}
#[derive(Debug, Deserialize)]
pub struct S1(String);
#[derive(Debug, Deserialize)]
pub struct S2(String);
#[derive(Debug, Deserialize)]
pub struct S3(String);
#[derive(Debug, Deserialize)]
pub struct S4(String);
#[derive(Debug, Deserialize)]
pub struct S5(String);
#[derive(Debug, Deserialize)]
pub struct S6(String);
#[derive(Debug, Deserialize)]
pub struct S7(String);
#[derive(Debug, Deserialize)]
pub struct S8(String);
#[derive(Debug, Deserialize)]
pub struct S9(String);
fn main() {
println!(
"{:?}",
serde_yaml::from_reader::<_, Something>(OpenOptions::new().open("whatever").unwrap())
);
}
[package]
name = "issue-42808"
version = "0.1.0"
authors = ["Alexander Bulaev <alexbool@yandex-team.ru>"]
[dependencies]
serde = { version = "=1.0.8", features = ["derive"] }
serde_yaml = "=0.7.1" |
I have a datapoint to provide. For ease of experimentation, I made a small variation on alexbool's benchmark that avoids pulling in serde. (Basically, my goal was to have a single .rs file that could be independently compiled without any crate dependencies.) I think my benchmark is likely to get at the heart of what is problematic here (namely that the code-duplication from monomorphization of generics can amplify the effect of otherwise minor code size regressions). Manually inspecting the output assembly and comparing what we generated before and after the Allocator API landed, I saw cases where we could/should be inlining more. So I tried comparing against the branch that is basis for PR #42727, which adds some Another thing I noticed was that we seem to be spending a number of instructions just shuffling words around. It wasn't immediately obvious what all the causes were, but one clear candidate was the
So this shows that some of the code size regression can perhaps be blamed on the change to alloc's return type; it used to be Update: I attempted to recreate my above results and discovered that I posted my original comment without finishing a crucial sentence: the 311,296 figure was gathered from a build that both made |
Thanks for the investigation and minimization @pnkfelix! I'm still pretty surprised about the increase in code there! I think it may be worth digging in a bit more to see where this "code bloat" is coming from. I would suspect that these are legitimately new locations for more |
One thing that I keep seeing in the diffs regardless of how I attempt to "correct" for whatever changes that Building my benchmark just before the |
Hm ok so now I'm getting interesting results again: Using
However removing
Can this still be reproduced on the latest nightly? |
Just a quick note for anyone trying to figure out how the numbers have been jumping up and down (at least, I was initially flummoxed when trying to compare @alexcrichton 's results with the numbers I provided 12 days ago) The original bug report was comparing
In case its hard to see, here's the important sequence of numbers:
So, there was an observable regression (and it still seems to be present), but it is potentially masked by other improvements to code size that happened between May 31st and June 18th. |
Oh I was actually just picking random dates, looks like I "got lucky"! |
Also, another factor that is easy to overlook is that in the earlier comments from both @alexcrichton and myself: our first invocations of |
In an effort to try to understand the size trajectories over time, I generalized the benchmark so that one can choose whether one wants to compile it with 1, 2, 3, 11, or 21 distinct variations on the My intention was to differentiate between a couple different things:
Here is a gist with the program, a shell script to drive it, and two sets of output transcripts (one that uses https://gist.github.com/pnkfelix/fe632a4bad0759c52e0001efde83e038 Here are some observations based on looking both the development of the numbers over time (i.e. scanning down the columns)
|
Thanks for the scripts and investigation @pnkfelix! Lots of good data.
This range of commits notably includes #42566 which probably just means that jemalloc 4.0.0 is a smaller binary than jemalloc 4.5.0 I tweaked the script/source file to account for this and always link to the system allocator and got these results which notably shows the same point of a regression you saw, and that the addition of |
cc @rust-lang/compiler, with @pnkfelix away we're going to need some help tackling this one. |
Thanks for the analysis @arielb1! It sounds to me though like there's not necessarily a huge amount to do here. The jemalloc bloat isn't something that should be a problem (small projects turn it off anyway) and the dealloc-related "unwind bloat" is more of a correctness fix right now than anything else. Also if projects care about binary size they're probably compiling with That just leaves 8k of unexplained size along with 3k of OOM-related stuff, both of which seem quite reasonable and ripe for future optimization. In that sense with a lack of clear actionable bug, is it time to close this? |
In which cases can |
Seeing how no global allocators right now can unwind (system/jemalloc) I've proposed #44049 to fix the regression. I'll open a further issue for more discussion here related to unwinding allocators. |
This commit flags all allocation-related functions in liballoc as "this can't unwind" which should largely resolve the size-related issues found on rust-lang#42808. The documentation on the trait was updated with such a restriction (they can't panic) as well as some other words about the relative instability about implementing a bullet-proof allocator. Closes rust-lang#42808
This is fixed on beta, removing from milestone. |
This commit flags all allocation-related functions in liballoc as "this can't unwind" which should largely resolve the size-related issues found on rust-lang#42808. The documentation on the trait was updated with such a restriction (they can't panic) as well as some other words about the relative instability about implementing a bullet-proof allocator. Closes rust-lang#42808
std: Mark allocation functions as nounwind This commit flags all allocation-related functions in liballoc as "this can't unwind" which should largely resolve the size-related issues found on #42808. The documentation on the trait was updated with such a restriction (they can't panic) as well as some other words about the relative instability about implementing a bullet-proof allocator. Closes #42808
This commit flags all allocation-related functions in liballoc as "this can't unwind" which should largely resolve the size-related issues found on rust-lang#42808. The documentation on the trait was updated with such a restriction (they can't panic) as well as some other words about the relative instability about implementing a bullet-proof allocator. Closes rust-lang#42808
Between
rustc 1.19.0-nightly (10d7cb44c 2017-06-18)
and1.20.0-nightly (445077963 2017-06-20)
I experienced one of my binaries to grow up from 3.4M to 3.7M, which is approx 8.8%. At first I was scared that this was a fallout of my own PR (#42716), but that particular binary didn't use C strings almost at all.I examined a diff in symbols made with
nm
and saw that the binary compiled with1.20.0-nightly (445077963 2017-06-20)
has a lot more symbols like that:_<alloc::raw_vec::RawVec<T, A>>::dealloc_buffer
(14 occurences)_alloc::allocator::Alloc::alloc_array
(57 occurences)_alloc::allocator::Alloc::realloc_array
(28 occurences)_core::ptr::drop_in_place
(mighty 138 new occurences)This leads us to the possibility that #42313 is the culprit.
Current Situation
There are 2 reproducers, and I need to make some measurements:
The text was updated successfully, but these errors were encountered: