How to explain linker symbols used as integers (and not pointers to an allocation)? #554
Comments
Wow those are terrifying hacks... The current state is quite simple: non-zero-sized statics must point to an allocation; violating this is UB. I don't know how much of this we are at liberty to change -- does LLVM even permit statics that are not backed by actual memory? Cc @nikic
LLVM requires globals to be dereferenceable up to at least their size, so if you have a u8 static it should also be dereferenceable for one byte. A ZST static, of course, has no such requirement. From a practical perspective, LLVM will never materialize a load from a global out of thin air; it will only hoist an existing load out of control flow. But I don't think it's possible to specify that operationally :)
I'm assuming that this conclusion assumes current Rust and LLVM, where there is only one notion of static/global. In a world where there are 2 notions of statics/globals (one you can always deref and one you can never deref), this seems easy to specify because there's no "maybe can deref". That's essentially the
Note that those use-cases are not specific to Rust, so I guess most people assume stronger guarantees from LLVM (namely that it doesn't assume globals to be dereferenceable unless the program dereferences them, which is not a clean specification as you mentioned).
Either way this means doing anything on the Rust side here is blocked on having LLVM support for some sort of non-dereferenceable global.
Is there a way to explain linker symbols used only for their address? In particular when this address is not meant as a pointer but as an integer only. I have 2 examples.
Example 1: Using a linker symbol to communicate a number
The `riscv-rt` crate uses a `_heap_size` symbol to let users configure the heap size through the linker script. In particular, they have code that looks like this in their documentation:
We could argue whether `u8` is the correct type. Let's assume it's a ZST to simplify this particular case.
Example 2: Using a linker symbol to "allocate" a unique (to the program) value
The `defmt` crate uses static variables in a specific linker section (with a specific name, but this is orthogonal to this issue) to allocate identifiers for interned strings. The address of the static variable (aka the value of the symbol) is the identifier. The static variable does not represent a proper allocation, and actually won't have any allocation at all from Rust's point of view (the linker section is `NOLOAD`).
The proc-macro generating those static variables looks like this:
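A minimal sketch of the idea only, not defmt's actual macro output: each interned string gets its own one-byte static, and the identifier is that static's address read as an integer. The names `HELLO_MARKER`, `WORLD_MARKER`, `symbol_id`, `hello_id`, and `world_id` are hypothetical. The `link_section` attribute that would place the statics in a `NOLOAD` section is left out so the sketch builds on a hosted target, which also means these statics do have an ordinary allocation here.

```rust
use std::ptr::addr_of;

// Hypothetical marker statics, one per interned string. In a defmt-like
// setup each would also carry a `link_section` attribute placing it in a
// NOLOAD section; that attribute is omitted here (assumption) so the
// sketch links on a hosted target.
static HELLO_MARKER: u8 = 0;
static WORLD_MARKER: u8 = 0;

/// Turns the address of a marker into an integer identifier.
/// The pointer is never dereferenced; only its numeric value is used.
fn symbol_id(marker: *const u8) -> usize {
    marker as usize
}

/// Identifier of the first hypothetical interned string.
fn hello_id() -> usize {
    // `addr_of!` yields a raw pointer without creating a reference.
    symbol_id(addr_of!(HELLO_MARKER))
}

/// Identifier of the second hypothetical interned string.
fn world_id() -> usize {
    symbol_id(addr_of!(WORLD_MARKER))
}
```

Because each marker occupies one byte, markers laid out back to back in a dedicated section would receive consecutive addresses; on a hosted target the only property this sketch relies on is that the two addresses differ.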
In this case, the `u8` type matters. That's how the identifiers are unique and consecutive (they start at 1 so they can fit in a `u16`).
Related issues
I'm creating a new issue, although there are many related issues, because I feel this particular concern of static variables without allocation is not addressed yet. Here are the related issues:
Please dedup if I missed an issue or I'm wrong in my analysis.
Theoretical suggestion
There could be an attribute to indicate when a static does not have an allocation. It is thus UB for a static to not have an allocation if it does not have this attribute. Such "inaccessible statics" can only have their address taken. They don't have an allocation and can't be dereferenced.
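No such attribute exists today. As a rough library-level approximation of the contract it describes, the sketch below wraps a symbol's address in a type that can only be compared or turned into an integer, with no API that reads through it. `SymbolAddr`, `STAND_IN`, and `stand_in_addr` are hypothetical names, and `STAND_IN` is an ordinary static (so it does have an allocation) used purely as a stand-in for a linker-provided symbol.

```rust
use std::ptr::addr_of;

/// Hypothetical address-only handle: it stores the numeric address of a
/// symbol and offers no way to read through it.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SymbolAddr(usize);

impl SymbolAddr {
    /// Builds a handle from a raw pointer without dereferencing it.
    fn from_ptr<T>(ptr: *const T) -> Self {
        SymbolAddr(ptr as usize)
    }

    /// The symbol's value as a plain integer.
    fn value(self) -> usize {
        self.0
    }

    /// Distance in bytes from `base`, for example to derive a small
    /// identifier from a symbol placed in a dedicated section.
    fn offset_from(self, base: SymbolAddr) -> usize {
        self.0.wrapping_sub(base.0)
    }
}

// Stand-in for a linker-provided symbol so the sketch runs on a hosted
// target (assumption; a real use case would declare an external symbol).
static STAND_IN: u8 = 0;

/// Address-only view of the stand-in symbol.
fn stand_in_addr() -> SymbolAddr {
    SymbolAddr::from_ptr(addr_of!(STAND_IN))
}
```

This only constrains what user code can do with the address. It does not change what the compiler is allowed to assume about the underlying static, which is the part the suggested attribute would have to address.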