Commit 5d2beda

Merge pull request #2386 from japaric/used
RFC: #[used] static variables
2 parents 352abc0 + a435f41 commit 5d2beda

1 file changed

+245
-0
lines changed

text/2386-used.md

+245
@@ -0,0 +1,245 @@
1+
- Feature Name: `used`
2+
- Start Date: 2018-04-03
3+
- RFC PR: [rust-lang/rfcs#2386](https://github.com/rust-lang/rfcs/pull/2386)
4+
- Rust Issue: [rust-lang/rust#40289](https://github.com/rust-lang/rust/issues/40289)
5+
6+
# Summary
7+
[summary]: #summary
8+
9+
Stabilize the `#[used]` attribute which is used to force the compiler to keep static variables,
10+
even if not referenced by any other part of the program, in the output object file.
11+
12+
# Motivation
13+
[motivation]: #motivation
14+
15+
Bare metal applications, like kernels, bootloaders and other firmware, usually need precise control
16+
over the memory layout of the program. These programs usually need to place data structures like
17+
vector (interrupt) tables in certain memory locations for the system to operate properly.
18+
19+
The final memory layout of the program is decided by the linker; bare metal applications make use of
20+
*linker scripts* to control the placement of (linker) *sections* in memory. But for all this to work
21+
the vector table must be present in the object files passed to the linker. That's where the
22+
`#[used]` attribute comes in: without it the compiler will optimize away the vector table, as it's
23+
not directly used by the program, and it will never reach the linker.
24+
25+
It's possible to work around the lack of the `#[used]` attribute by declaring the vector table as
26+
public:
27+
28+
``` rust
29+
// public items are exposed in the object file
30+
#[link_section = ".vector_table.exceptions"]
31+
pub static EXCEPTIONS: [extern "C" fn(); 14] = [/* .. */];
32+
```
33+
34+
But this is brittle because the compiler can still optimize the symbol away when compiling with LTO
35+
enabled -- with LTO the compiler has global knowledge about the program, and will see that
36+
`EXCEPTIONS` is unused by the program and discard it.
37+
38+
Yet another workaround is to force a volatile load of the vector table in some part of the program,
39+
usually before main. The compiler will always keep the vector table in this case but this
40+
alternative incurs the cost of a load operation that will never be optimized away by the
41+
compiler.
42+
43+
``` rust
44+
#[link_section = ".vector_table.exceptions"]
45+
static EXCEPTIONS: [extern "C" fn(); 14] = [/* .. */];
46+
47+
// entry point of the firmware
48+
fn reset() -> ! {
49+
extern "C" {
50+
// user entry point
51+
fn main() -> !;
52+
}
53+
54+
// this operation will never be optimized away
55+
unsafe { core::ptr::read_volatile(&EXCEPTIONS[0]) };
56+
57+
// calling a foreign function is unsafe
unsafe { main() }
58+
}
59+
```
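
As a hosted-target illustration of the same workaround, the following minimal sketch forces a volatile read of a private static. The names `TABLE` and `touch_table` are hypothetical and not part of this RFC; real firmware would perform the read on the vector table from its reset handler instead.

```rust
use std::ptr;

// Hypothetical stand-in for the vector table: private and otherwise unreferenced.
static TABLE: [u32; 4] = [0x10, 0x20, 0x30, 0x40];

/// Performs a volatile read of the first entry. The compiler must emit the
/// load, so `TABLE` has to stay in the object file.
pub fn touch_table() -> u32 {
    // SAFETY: `&TABLE[0]` is a valid, aligned, initialized `u32`.
    unsafe { ptr::read_volatile(&TABLE[0]) }
}
```

The cost is visible here: every call to `touch_table` performs a real load whose result is normally thrown away.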
60+
61+
The proper solution to keeping the vector table is to mark the vector table as a *used* variable to
62+
force the compiler to keep it in one of the emitted object files.
63+
64+
``` rust
65+
#[used] // will be present in the object file
66+
#[link_section = ".vector_table.exceptions"]
67+
static EXCEPTIONS: [extern "C" fn(); 14] = [/* .. */];
68+
```
69+
70+
# Guide-level explanation
71+
[guide-level-explanation]: #guide-level-explanation
72+
73+
We can think of the compilation process performed by `rustc` as a two stage process. First, `rustc`
74+
compiles a crate (source code) into *object files*, then `rustc` invokes the linker on those object
75+
files to produce a single *executable*, or shared library (e.g. `.so`) if the crate type was set to
76+
"cdylib".
77+
78+
The `#[used]` attribute can be applied to static variables to keep them in the *object files*
79+
produced by `rustc`, even in the presence of LTO. Note that this does **not** mean that the static
80+
variable will make its way into the binary file emitted by the linker as the linker is free to drop
81+
symbols that it deems unused. In other words, the `#[used]` attribute does **not** affect the
82+
behavior of the linker.
83+
84+
Consider the following program:
85+
86+
``` rust
87+
#[used]
88+
static FOO: u32 = 0;
89+
static BAR: u32 = 0;
90+
91+
fn main() {}
92+
```
93+
94+
The variable `FOO` marked with the `#[used]` attribute will be kept in the emitted object file
95+
regardless of the optimization level. On the other hand, the unused variable `BAR` is always
96+
optimized away.
97+
98+
``` console
99+
$ cargo rustc -- --emit=obj # for simplicity incr. comp. has been disabled
100+
$ nm -C $(find target -name '*.o')
101+
(..)
102+
0000000000000000 r foo::FOO
103+
0000000000000000 t foo::main
104+
0000000000000000 T std::rt::lang_start
105+
(..)
106+
107+
$ cargo clean; cargo rustc --release -- --emit=obj
108+
$ nm -C $(find target -name '*.o')
109+
0000000000000000 T main
110+
0000000000000000 r foo::FOO
111+
0000000000000000 t foo::main
112+
0000000000000000 T std::rt::lang_start
113+
(..)
114+
115+
$ cargo clean; cargo rustc --release -- --emit=obj -C lto
116+
$ nm -C $(find target -name '*.o')
117+
(..)
118+
0000000000000000 r foo::FOO
119+
0000000000000000 t foo::main
120+
(..)
121+
```
122+
123+
`FOO` never makes it to the final executable because the linker sees that the call graph that stems
124+
from the user entry point `main` never makes use of `FOO` and discards it.
125+
126+
``` console
127+
$ cargo clean; cargo build
128+
$ nm -C target/debug/foo | grep FOO || echo not found
129+
not found
130+
```
131+
132+
To keep `FOO` in the final binary, assistance from the linker is required; this usually means writing
133+
a linker script.
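
One possible way to hand a linker script to the linker is a Cargo build script. The sketch below only builds the directive strings such a script would print. The function name and the `link.x` file name are hypothetical, and the sketch assumes that the linker accepts `-T` and that the script itself retains the section of interest (with GNU ld this is typically done with `KEEP(*(.section_name))`).

```rust
/// Returns the lines a `build.rs` would print so that Cargo passes a linker
/// script to the linker. `script` is a hypothetical path such as "link.x".
pub fn linker_script_directives(script: &str) -> Vec<String> {
    vec![
        // Re-run the build script whenever the linker script changes.
        format!("cargo:rerun-if-changed={}", script),
        // Forward `-T<script>` to the linker invocation.
        format!("cargo:rustc-link-arg=-T{}", script),
    ]
}
```

In an actual `build.rs`, `main` would print each returned line with `println!`.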
134+
135+
Consider the following program:
136+
137+
``` rust
138+
#[used]
139+
#[link_section = ".init_array"]
140+
static FOO: extern "C" fn() = before_main;
141+
142+
extern "C" fn before_main() {
143+
println!("Hello")
144+
}
145+
146+
fn main() {
147+
println!("World")
148+
}
149+
```
150+
151+
When dealing with ELF files the `.init_array` section will usually be kept in the final binary by
152+
the default linker script. If the system supports it, all function pointers stored in the
153+
`.init_array` section will be called before entering `main`. Thus, the above program prints "Hello"
154+
and then "World" to the console when run on a *nix system.
155+
156+
``` console
157+
$ cargo run --release
158+
Hello
159+
World
160+
161+
$ nm -C target/release/foo | grep FOO
162+
000000000026b620 t foo::FOO
163+
```
164+
165+
If the `#[used]` attribute is removed from the source code then only "World" is printed to the
166+
console as the `FOO` variable will get optimized away by the compiler.
167+
168+
# Reference-level explanation
169+
[reference-level-explanation]: #reference-level-explanation
170+
171+
The `#[used]` attribute can only be used on static variables. Static variables marked with this
172+
attribute will be appended to the special `@llvm.used` global variable when lowered to LLVM IR.
173+
174+
``` rust
175+
#[used]
176+
static FOO: u32 = 0;
177+
178+
fn main() {}
179+
```
180+
181+
``` console
182+
$ cargo clean; cargo rustc -- --emit=llvm-ir
183+
$ grep llvm.used $(find -name '*.ll')
184+
@llvm.used = appending global [1 x i8*] [i8* getelementptr inbounds (<{ [4 x i8] }>, <{ [4 x i8] }>* @_ZN3foo3FOO17hf0af6b03a826c578E, i32 0, i32 0, i32 0)], section "llvm.metadata"
185+
```
186+
187+
The semantics of this operation are (quoting [LLVM reference][llvm]):
188+
189+
[llvm]: https://llvm.org/docs/LangRef.html#the-llvm-used-global-variable
190+
191+
> If a symbol appears in the @llvm.used list, then the compiler, assembler, ~and linker~ are
192+
> required to treat the symbol as if there is a reference to the symbol that it cannot see (which is
193+
> why they have to be named). For example, if a variable has internal linkage and no references
194+
> other than that from the @llvm.used list, it cannot be deleted. This is commonly used to represent
195+
> references from inline asms and other things the compiler cannot “see”, and corresponds to
196+
> “attribute((used))” in GNU C.
197+
198+
*strikethrough added by the author*
199+
200+
The part about the linker is not true (\*): from the point of view of the linker static variables
201+
marked with `#[used]` look exactly the same as variables that have not been marked with that
202+
attribute -- those are the implemented LLVM semantics. Also ELF object files have no mechanism to
203+
prevent the linker from dropping their symbols if they are not referenced by other object files.
204+
205+
(\*) unless "linker" is actually referring to `llvm-link` (?)
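
As a small sketch of the placement rule stated at the top of this section, the attribute is accepted on both immutable and mutable statics. The names `MARKER`, `SCRATCH` and `marker` are made up for illustration.

```rust
// `#[used]` on an immutable static: kept in the object file even though
// nothing else in the crate needs to refer to it.
#[used]
static MARKER: u32 = 0xDEAD_BEEF;

// The attribute also applies to mutable statics.
#[used]
static mut SCRATCH: [u8; 4] = [0; 4];

// Applying `#[used]` to a non-static item (for example a `const` or a `fn`)
// is rejected by the compiler, so such items are left out of this sketch.

/// Demonstration helper only: reads `MARKER` like any other static.
pub fn marker() -> u32 {
    MARKER
}
```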
206+
207+
# Drawbacks
208+
[drawbacks]: #drawbacks
209+
210+
This is yet another low level feature that alternative `rustc` implementations would have to
211+
implement to be 100% compatible with the official LLVM based `rustc` implementation. Also see
212+
`#[repr(align(*))]`, `#[repr(*)]`, `#[link_section]`, etc.
213+
214+
# Rationale and alternatives
215+
[alternatives]: #alternatives
216+
217+
## Chosen design
218+
219+
This design pretty much matches how C compilers implement this feature. See prior art section below.
220+
221+
## Not doing this
222+
223+
Not doing this means that people will continue to use the brittle workarounds presented in the
224+
motivation section.
225+
226+
# Prior art
227+
[prior-art]: #prior-art
228+
229+
Most compilers provide a feature with the exact same semantics: usually in the form of a "used"
230+
attribute (e.g. `__attribute__((used))`) that can be applied to static variables.
231+
232+
The following C code is an example from the [KEIL toolchain documentation][keil]:
233+
234+
[keil]: http://www.keil.com/support/man/docs/armcc/armcc_chr1359124983230.htm
235+
236+
``` c
237+
static int lose_this = 1;
238+
static int keep_this __attribute__((used)) = 2; // retained in object file
239+
static int keep_this_too __attribute__((used)) = 3; // retained in object file
240+
```
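
For comparison, a rough Rust counterpart of the C snippet above might look as follows. This is a sketch under the semantics proposed in this RFC; the helper `kept_sum` is hypothetical and exists only so the values can be observed.

```rust
// May be discarded by the compiler: private and never referenced.
#[allow(dead_code)]
static LOSE_THIS: i32 = 1;

#[used] // retained in the object file
static KEEP_THIS: i32 = 2;

#[used] // retained in the object file
static KEEP_THIS_TOO: i32 = 3;

/// Demonstration helper; the `#[used]` statics are retained even without it.
pub fn kept_sum() -> i32 {
    KEEP_THIS + KEEP_THIS_TOO
}
```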
241+
242+
# Unresolved questions
243+
[unresolved]: #unresolved-questions
244+
245+
None so far.
