codegen_llvm_back: improve allocations #55871
Conversation
(rust_highfive has picked a reviewer for you, use r? to override)

@bors try Let's do a perf run :)

codegen_llvm_back: improve allocations

This commit was split out from #54864. Last time it was causing an LLVM OOM, presumably due to an aggressive preallocation strategy in `thin_lto`. This time the preallocations are more cautious and there are a few additional memory-related improvements (the last 3 points in the list below).

- _gently_ preallocate vectors of known length
- `extend` instead of `append` where the argument is consumable
- turn 2 `push` loops into `extend`s
- create a vector from a function producing one instead of using `extend_from_slice` on it
- consume `modules` when no longer needed
- return an `impl Iterator` from `generate_lto_work`
- don't `collect` `globals`, as they are iterated over and consumed right afterwards

While I'm hoping it won't cause an OOM anymore, I would still consider this a "high-risk" PR and not roll it up.
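A hedged sketch of the allocation patterns listed above (the names `build_modules`, `produced`, and the string data are invented for illustration, not the actual `rustc_codegen_llvm` code):

```rust
// Illustrative only: `produced` stands in for the real codegen data.
fn build_modules() -> Vec<String> {
    let produced = vec!["a".to_string(), "b".to_string()];

    // "gently" preallocate: reserve only what we know we'll need
    let mut modules = Vec::with_capacity(produced.len() + 2);

    // `extend` with a consumable argument moves the elements in directly,
    // where `append(&mut other)` needs a second owned Vec left empty behind
    modules.extend(produced);

    // a `push` loop turned into an `extend`
    modules.extend((0..2).map(|i| format!("extra-{}", i)));

    modules
}

fn main() {
    println!("{:?}", build_modules()); // ["a", "b", "extra-0", "extra-1"]
}
```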
☀️ Test successful - status-travis

@rust-timer build 684fb37

Success: Queued 684fb37 with parent b76ee83, comparison URL.

Finished benchmarking try commit 684fb37

It seems that both instruction counts and max-rss have suffered a fair hit, while for most benchmarks the minimum rss has also dropped significantly. Essentially this means we have increased deviation, without a clear win in mean rss.
force-pushed from dc1b2c7 to 9e8bafc
The reds from … Since there do seem to be possible wins with some of these changes (I'd like to get those minimums from …)
@bors try

⌛ Trying commit 9e8bafc24b60008353f5f4e8027379a10bc7bb35 with merge f7360e5b2e5b2ed5e1696af257b365e3cc69981d...

☀️ Test successful - status-travis

@rust-timer build f7360e5b2e5b2ed5e1696af257b365e3cc69981d

Success: Queued f7360e5b2e5b2ed5e1696af257b365e3cc69981d with parent 0195812, comparison URL.

Finished benchmarking try commit f7360e5b2e5b2ed5e1696af257b365e3cc69981d
Since the changes at this point are all pretty harmless, I'd say that the benchmark results are statistical noise. That being said, these changes don't seem to be beneficial performance-wise, so I'm OK with closing the PR, unless you believe that they are a readability improvement / more idiomatic.

Uhh, no matter how I look at it, the max-rss results still seem hit-or-miss.

@nagisa max-rss has very high variance. Here's how it looks for a random recent commit: http://perf.rust-lang.org/compare.html?start=ca79ecd6940e30d4b2466bf378632efcdf5745c7&end=775eab58835f9bc0f1f01ccbb79725dce0c73b51&stat=max-rss The results for this PR seem to be within the "usual" noise.
@bors r+

📌 Commit 9e8bafc24b60008353f5f4e8027379a10bc7bb35 has been approved by

@bors r- Still OOMing on AppVeyor.
Could it be that with this patch we overcommit virtual memory that ends up never being used? On Windows overcommit is not possible, which would explain the OOM there but no observable regressions on perf runs.
Maybe, but how? These changes shouldn't negatively impact allocations - IMO they should make them easier to optimize.

The optimizability is irrelevant if it is indeed what I think it is. And it is hardly related to the number of allocations, but rather to their size. Alas, rust perf does not collect the information of interest to tell for sure. It would be interesting to do a perf run (and they really should be run that way all the time) with overcommit disabled.
As long as the length of the iterator is known, changing a …
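The point about known iterator lengths, as I understand it: `Vec::extend` consults the iterator's `size_hint`, so an exact-size iterator lets it reserve once up front instead of growing the buffer repeatedly. A minimal sketch:

```rust
fn main() {
    let src = vec![1u32, 2, 3, 4];

    // push loop: the vec may reallocate several times as it grows
    let mut a = Vec::new();
    for x in &src {
        a.push(x * 2);
    }

    // extend from an exact-size iterator: one up-front reservation
    let mut b = Vec::new();
    b.extend(src.iter().map(|x| x * 2));

    assert_eq!(a, b);
    println!("{:?}", b); // [2, 4, 6, 8]
}
```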
☔ The latest upstream changes (presumably #55627) made this pull request unmergeable. Please resolve the merge conflicts.
I'm not sure we're talking about the same thing. Even though the following snippet would OOM on Windows, it would work just fine on UNIXes due to overcommit:
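The snippet itself is not preserved in this transcript; a stand-in for the kind of program presumably meant (the 2 GiB figure is made up — the effect is more dramatic with tens of GiB) might be:

```rust
fn main() {
    // Reserve a large buffer without ever touching it. On Linux and most
    // UNIXes the kernel only promises the address space (overcommit) and
    // backs pages on first write, so this succeeds even with little free
    // RAM. Windows charges the full reservation against the commit limit
    // up front, so a large enough allocation fails there immediately.
    let buf: Vec<u8> = Vec::with_capacity(2 << 30); // 2 GiB, never written
    println!("reserved {} bytes, touched none", buf.capacity());
}
```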
This has nothing to do with known length or allocation count.

@rust-lang/infra is it possible to make @rust-timer collect additional information (e.g. max-virtual-mem, in addition to max-rss…)? What repository should I file an issue in to request this?

While something like this happening is clearly a bug somewhere (and I cannot tell where exactly), it is fairly obvious that some change among those in the commit is making LLVM commit too much memory that likely ends up never being used.

One thing you could do to debug this is to compile stage1 core on your own UNIX machine and see what the maximum virtual memory ends up being. If it ends up being significantly larger than the RSS, that would confirm my suspicions. Another thing you could do is disabling …

Finally, we could also just bisect – there aren't that many different changes in this PR. We could try landing them one by one (though there's a danger that all these changes are cumulatively slightly raising the committed memory and none of them would fail CI on their own).
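The "maximum virtual memory vs RSS" check suggested above can be scripted. A hedged, Linux-only sketch (the `VmPeak`/`VmHWM` field names come from `/proc/<pid>/status`; the 2 GiB reservation is an arbitrary stand-in for a build that commits memory it never touches):

```rust
use std::fs;

// Read a peak-memory field (in KiB) from /proc/self/status (Linux-only).
fn peak_kib(field: &str) -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    let line = status.lines().find(|l| l.starts_with(field))?;
    line.split_whitespace().nth(1)?.parse().ok()
}

fn main() {
    // Stand-in for a process that commits memory it never uses:
    let _reserved: Vec<u8> = Vec::with_capacity(2 << 30); // 2 GiB, untouched

    // VmPeak = peak virtual size; VmHWM = peak resident set size.
    // VmPeak far above VmHWM suggests memory committed but never used.
    let vm_peak = peak_kib("VmPeak:").unwrap_or(0);
    let vm_hwm = peak_kib("VmHWM:").unwrap_or(0);
    println!("VmPeak: {} KiB, VmHWM: {} KiB", vm_peak, vm_hwm);
}
```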
https://github.com/rust-lang-nursery/rustc-perf |
@nagisa ah, ok, thanks for the explanation; I wasn't thinking in terms of a possible memory bug. I will do some test builds on a Linux machine when I have a bit of free time.

@nagisa I ran … I tried to disable …

What are the current thoughts here? I'm sort of inclined to close this PR as "not worth the trouble", but do you all still want to poke at it? Can I assign the review to someone else (@nagisa?)
```rust
})
.collect::<Vec<_>>();
});
```
You're iterating over globals here and then adding new globals in the loop below. With the collect that's fine, as you'll only iterate existing globals. Without the collect this is going to be an infinite loop and you OOM.
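A hypothetical sketch of the failure mode described (not the actual rustc/LLVM code, where the globals are walked through FFI): iterating a collection that the loop body keeps growing never reaches the end, while iterating a collected snapshot terminates.

```rust
// If the loop iterated `globals` directly while appending to it, every
// pass would add a new element and it would never finish, eventually
// exhausting memory. Cloning a snapshot first (the role `collect` played
// in the original code) bounds the iteration to the pre-existing entries.
fn add_renamed(mut globals: Vec<String>) -> Vec<String> {
    let snapshot: Vec<String> = globals.clone();
    for g in &snapshot {
        globals.push(format!("{}.renamed", g));
    }
    globals
}

fn main() {
    let out = add_renamed(vec!["a".into(), "b".into()]);
    println!("{:?}", out); // ["a", "b", "a.renamed", "b.renamed"]
}
```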
Weird that this doesn't seem to cause issues on architectures other than i686 (at least that's the one where the OOM was being hit on AppVeyor); I can compile with these changes on x64 without issues.
@nikomatsakis At this point I was interested more in the possibility of uncovering some memory-related bug (as described by @nagisa). I think @nikic is onto something - it might not be a bug but a peculiarity of …
force-pushed from 9e8bafc to ce4bce1

force-pushed from ce4bce1 to 2043d30
@nikomatsakis I just remembered that the initial version of this PR included a change that had a -2.5% win for style-servo-opt, which is considerable, especially since it is a huge benchmark; I rebased, re-included the win for servo and removed the problematic bit that @nikic marked as the one causing the OOM, so hopefully now it's good to go 🤞.
Let's see if bors can be controlled over mail
@bors r+
📌 Commit 2043d30 has been approved by
codegen_llvm_back: improve allocations

This commit was split out from #54864. Last time it was causing an LLVM OOM, which was most probably caused by not collecting the globals.

- preallocate vectors of known length
- `extend` instead of `append` where the argument is consumable
- turn 2 `push` loops into `extend`s
- create a vector from a function producing one instead of using `extend_from_slice` on it
- consume `modules` when no longer needed
- ~~return an `impl Iterator` from `generate_lto_work`~~
- ~~don't `collect` `globals`, as they are iterated over and consumed right afterwards~~

While I'm hoping it won't cause an OOM anymore, I would still consider this a "high-risk" PR and not roll it up.
☀️ Test successful - status-appveyor, status-travis