Commit ce6a2c8

Add section about building an optimized version of rustc
1 parent b02c792 commit ce6a2c8

2 files changed: +116 -0 lines changed

src/SUMMARY.md

+1
@@ -14,6 +14,7 @@
 - [Building Documentation](./building/compiler-documenting.md)
 - [Rustdoc overview](./rustdoc.md)
 - [Adding a new target](./building/new-target.md)
+- [Optimized build](./building/optimized-build.md)
 - [Testing the compiler](./tests/intro.md)
 - [Running tests](./tests/running.md)
 - [Testing with Docker](./tests/docker.md)

src/building/optimized-build.md

+115
@@ -0,0 +1,115 @@
# Optimized build of the compiler

<!-- toc -->

There are several additional build configuration options and techniques that can be used to compile a build of `rustc`
that is as optimized as possible (for example, when building `rustc` for a Linux distribution). The status of these configuration
options for various Rust targets is tracked [here]. This page describes how you can use these approaches when
building `rustc` yourself.

[here]: https://github.com/rust-lang/rust/issues/103595

## Link-time optimization
Link-time optimization is a powerful compiler technique that can increase program performance. To enable (Thin-)LTO when
building `rustc`, set the `rust.lto` config option to `"thin"` in `config.toml`:

```toml
[rust]
lto = "thin"
```

> Note that LTO for `rustc` is currently supported and tested only for the `x86_64-unknown-linux-gnu` target. Other
> targets *may* work, but no guarantees are provided. Notably, LTO-optimized `rustc` currently produces
> [miscompilations] on Windows.

[miscompilations]: https://github.com/rust-lang/rust/issues/109114

Enabling LTO on Linux has [produced] speed-ups of up to 10%.

[produced]: https://github.com/rust-lang/rust/pull/101403#issuecomment-1288190019

## Memory allocator
Using a different memory allocator for `rustc` can provide significant performance benefits. If you want to enable
the `jemalloc` allocator, you can set the `rust.jemalloc` option to `true` in `config.toml`:

```toml
[rust]
jemalloc = true
```

> Note that this option is currently only supported for Linux and macOS targets.
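
The two platform notes above (LTO tested only on `x86_64-unknown-linux-gnu`, `jemalloc` supported only on Linux and macOS) could be folded into a small helper that prints a `[rust]` fragment for a given target triple. This is only a sketch: the function name `rust_section_for_target` is hypothetical, and the triple patterns are an assumption about how Linux and macOS targets are named, so some triples (for example Android ones) may need extra handling.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a [rust] config fragment for a target triple,
# enabling only the options this page describes as supported for it.
rust_section_for_target() {
    local target="$1"
    echo "[rust]"
    # Thin LTO is described as tested only on x86_64-unknown-linux-gnu.
    if [ "$target" = "x86_64-unknown-linux-gnu" ]; then
        echo 'lto = "thin"'
    fi
    # jemalloc is described as supported only on Linux and macOS targets.
    # Assumption: such triples contain "-linux-" or end in "-apple-darwin".
    case "$target" in
        *-linux-*|*-apple-darwin) echo "jemalloc = true" ;;
    esac
}
```

One possible use is to feed it the host triple reported by an installed compiler, for example `rust_section_for_target "$(rustc -vV | sed -n 's/^host: //p')"`, and merge the printed fragment into `config.toml` by hand.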

## Codegen units
Reducing the number of codegen units per `rustc` crate can produce a faster build of the compiler. You can modify the
number of codegen units for `rustc` and `libstd` in `config.toml` with the following options:

```toml
[rust]
codegen-units = 1
codegen-units-std = 1
```

## Instruction set
By default, `rustc` is compiled for a generic (and conservative) instruction set architecture (depending on the selected
target), to make it support as many CPUs as possible. If you want to compile `rustc` for a specific instruction
set architecture, you can set the `target_cpu` compiler option in `RUSTFLAGS`:

```bash
$ RUSTFLAGS="-C target_cpu=x86-64-v3" x.py build ...
```

If you also want to compile LLVM for a specific instruction set, you can set `llvm` flags in `config.toml`:

```toml
[llvm]
cxxflags = "-march=x86-64-v3"
cflags = "-march=x86-64-v3"
```
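
Because the `rustc` and LLVM settings above live in two different places, they can drift apart. One way to keep them in sync is to derive both from a single variable. The sketch below assumes the chosen CPU name is accepted by both `-C target_cpu` and `-march` (as `x86-64-v3` is in the examples above, though this is not guaranteed for every name). `CPU_LEVEL` and `llvm_section_for_cpu` are hypothetical names introduced for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the [llvm] fragment for a given CPU name.
llvm_section_for_cpu() {
    local cpu="$1"
    echo "[llvm]"
    echo "cxxflags = \"-march=${cpu}\""
    echo "cflags = \"-march=${cpu}\""
}

# Single source of truth for the instruction set (assumed name).
CPU_LEVEL="x86-64-v3"

# Flags for rustc itself, matching the command shown above.
export RUSTFLAGS="-C target_cpu=${CPU_LEVEL}"

# Print the matching LLVM fragment; merge it into config.toml by hand.
llvm_section_for_cpu "$CPU_LEVEL"
```

With `RUSTFLAGS` exported this way, the build command from above can be run in the same shell without repeating the CPU name.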

## Profile-guided optimization
Applying profile-guided optimizations (or more generally, feedback-directed optimizations) can increase `rustc`
performance substantially, by up to 25%. However, these techniques cannot simply be enabled by a configuration option;
they require a complex build workflow that compiles `rustc` multiple times and profiles it on selected benchmarks.

There is a tool called `opt-dist` that is used to optimize `rustc` with [PGO] (profile-guided optimizations) and [BOLT]
(a post-link binary optimizer) for builds distributed to end users. You can examine the tool, which is located
in `src/tools/opt-dist`, and build a custom PGO build workflow based on it, or try to use it directly. Note that the tool
is currently closely tied to the way it is used in Rust's continuous integration workflows, and it might require some
custom changes to make it work in a different environment.

[PGO]: https://doc.rust-lang.org/rustc/profile-guided-optimization.html
[BOLT]: https://github.com/llvm/llvm-project/blob/main/bolt/README.md

To use the tool, you will need to provide some external dependencies:
- A Python 3 interpreter (for executing `x.py`).
- A compiled LLVM toolchain with the `llvm-profdata` binary. Optionally, if you want to use BOLT, the `llvm-bolt` and
  `merge-fdata` binaries have to be available in the toolchain.
- A downloaded copy of the [Rust benchmark suite]. (You can also let the tool download it itself if you implement a
  custom environment, see below.)

These dependencies are provided to `opt-dist` by an implementation of the [`Environment`] trait. You can either implement
the trait for your custom environment, by providing paths to these dependencies in its methods, or reuse one of the existing
implementations (currently, there are implementations for Linux and Windows). If you want your environment to support
BOLT, return `true` from the `supports_bolt` method.

Here is an example of how `opt-dist` can be used with the default Linux environment (it assumes that you execute the
following commands on a Linux system):

1. Build the tool with the following command:
    ```bash
    $ python3 x.py build tools/opt-dist
    ```
2. Run the tool with the `PGO_HOST` environment variable set to the 64-bit Linux target:
    ```bash
    $ PGO_HOST=x86_64-unknown-linux-gnu ./build/host/stage0-tools-bin/opt-dist
    ```
    Note that the default Linux environment expects several hardcoded paths to exist:
    - `/checkout` should contain a checkout of the Rust compiler repository that will be compiled.
    - `/rustroot` should contain the compiled LLVM toolchain (containing BOLT).
    - A Python 3 interpreter should be available under the `python3` binary.
    - `/tmp/rustc-perf` should contain a downloaded checkout of the Rust benchmark suite.

You can modify `LinuxEnvironment` (or implement your own) to override these paths.
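
Before starting a long `opt-dist` run, it may help to verify that the paths listed above exist. The following is a minimal sketch under the directory layout described above; the function name `check_opt_dist_paths` and its parameters are hypothetical, and it only checks that the directories and the `python3` binary are present, not what they contain.

```bash
#!/usr/bin/env bash
# Hypothetical preflight check for the default Linux environment.
# Arguments default to the hardcoded paths listed above.
check_opt_dist_paths() {
    local checkout="${1:-/checkout}"
    local rustroot="${2:-/rustroot}"
    local rustc_perf="${3:-/tmp/rustc-perf}"
    local status=0
    local dir

    for dir in "$checkout" "$rustroot" "$rustc_perf"; do
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir" >&2
            status=1
        fi
    done

    # The environment expects a Python 3 interpreter named `python3`.
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3 not found in PATH" >&2
        status=1
    fi

    return "$status"
}
```

A possible use is to gate the run on the check, for example `check_opt_dist_paths && PGO_HOST=x86_64-unknown-linux-gnu ./build/host/stage0-tools-bin/opt-dist`.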

[`Environment`]: https://github.com/rust-lang/rust/blob/65e468f9c259749c210b1ae8972bfe14781f72f1/src/tools/opt-dist/src/environment/mod.rs#L8-L7
[Rust benchmark suite]: https://github.com/rust-lang/rustc-perf
