Must static DATA: UnsafeCell = expr; still == expr by main()? #397

Open
workingjubilee opened this issue Mar 14, 2023 · 66 comments

Comments

@workingjubilee
Member

workingjubilee commented Mar 14, 2023

"Okay I think we first and foremost have a Rust Abstract Machine / @rust-lang/opsem question here, not a Miri question:

If a mutable (or interior mutable) static is initialized in Rust to a certain value, then can the Rust AM assume that it does have that value when the program starts? Or is it okay to have some "before main setup" actually change the value of that static?

I would say for an immutable static, this is clearly UB -- using linker script tricks like what has been described here is just not allowed for those statics. But for statics that can be mutated, we already can't in general optimize assuming their values did not change, so... it seems reasonable to allow this?

With this way of thinking about the question, obviously Miri has no way of knowing that you are modifying the value of the static before Rust code starts running. On the Rust side you are saying this static is filled with uninit memory, but then you are making some outside-of-Rust assumptions to justify that actually when the program starts the static has a different value, in particular that the bytes now are initialized. So I don't think there is anything actionable on the Miri side here, and the discussion (with the clarified question) should move to https://github.com/rust-lang/unsafe-code-guidelines/ ."

Originally posted by @RalfJung in rust-lang/miri#2807 (comment)

@digama0

digama0 commented Mar 14, 2023

I think that mutable statics still need to have the values they were promised to have in the initializer unless an AM-visible write occurred to them. That write might have been caused by FFI or asm!, but if the value isn't there at lang_start then it's not clear where the value goes at all, in which case a compiler which just ignores all static mut initializers would be conforming and that's not ok.

Regarding the linked thread, I think the appropriate model is to use an extern static and allocate/uninitialize it using a linker script, and then use as the AM invariant that when main() is called that static is initialized with a random bit pattern (not uninit). That's still not quite enough to validate the technique of signature validation since that's only probabilistically correct in the first place, but it does eliminate the need to do any volatile reads and writes and you won't get UB.

I don't see a way to make this work with a rust-allocated static without lying to the compiler to some extent. For example, we could use a mutable static (initialized to 0, say) with a link_section = ".uninit" on it, and then call an empty asm! block which has the postcondition of ensuring that the static is initialized to a random bit pattern, but in that case the AM will be in disagreement with the hardware between the start of main() and the asm! block, since the AM thinks the static is zero-initialized and the hardware static has actual data in it. (This is "fine" since the static will not be read or written to during this period.)

@Diggsey

Diggsey commented Mar 15, 2023

What if you did this:

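```rust
use std::arch::asm;
use std::ptr::addr_of_mut;

static mut FOO: i32 = 0;

pub fn main() {
    unsafe {
        // Pass a raw pointer rather than `&mut FOO`: raw pointers are valid
        // `asm!` operands, and no reference to the `static mut` is created.
        // The template is only an assembler comment, so no instruction runs.
        asm!("/* {} */", in(reg) addr_of_mut!(FOO));
    }
    // Rest of program...
}
```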

The ASM block does nothing but the compiler must behave as though it may have written a value into the static. I guess this is what you were suggesting @digama0 ?

@digama0

digama0 commented Mar 15, 2023

Yes, that is what I meant. The drawback is that the compiler is technically allowed to read FOO before the asm! block and assume that the value in the static is 0 because no threads have started yet so things should still look like the initialization values. In practice this doesn't seem to be a likely issue, since there isn't much optimization reason to do so.

(Also, we should replace main() with lang_start() in this discussion; there is some std code which runs before main which is not special in this regard. This should be fine for the present discussion since it certainly doesn't touch FOO.)

@CAD97

CAD97 commented Mar 15, 2023

It's not completely unreasonable to have a step between codegen where the initial starting parameters of the AM are serialized and the start entry point into AM execution where the host could be allowed to insert some set of AM state changes, which could include changing the value of static muts.

Notably, the host can expose some known memory locations' provenance, such that from_exposed_addr(MMIO_ADDRESS) can pick up valid provenance; I think that being valid is fairly non-controversial. If the startup state can include exposed addresses from the host in the AM ambient state, I find it more difficult to say why other host-writable AM state cannot be modified between codegen and startup.

But I still agree that the proper way to model the original use case here is almost certainly an extern static allocated by the host linker shenanigans rather than managed within the AM.

@bjorn3
Member

bjorn3 commented Mar 15, 2023

I think for as long as DATA is exported (using e.g. #[no_mangle]), it should be possible to write to it before main. I don't think the main function should be special cased. There is no main function for cdylibs, and for cdylibs I don't see any reason not to allow writing to DATA from the user of the cdylib before the first call to an exported function. I think a binary should be treated just like a cdylib, except that it always exports the main shim which calls lang_start and the user main function. If on the other hand DATA is not exported outside of the Rust crate, I think it is fair to require that it has the exact same value as the initializer upon the first call to an exported function and does not get changed outside of the control of functions in the Rust crate. So for example if there are no functions that modify it, it could be moved to read-only memory.

@digama0

digama0 commented Mar 15, 2023

> I think for as long as DATA is exported (using eg #[no_mangle]), it should be possible to write to it before main. I don't think the main function should be special cased.

I don't think this is about special casing, so much as the issue that there is no "life before main", even in the AM. As far as I am aware the AM semantics start at lang_start(). So FFI which calls lang_start() is another level of subtle compared to regular FFI.

As I mentioned before, the issue with saying that the initialization value of a static doesn't need to be in that static when the AM takes its first step (wherever we decide to put that) is that it becomes impossible for the AM to distinguish between static mut FOO: u32 = 0; and static mut FOO: u32 = 1; as declarations - these are observably equivalent and a compiler could just ignore the initializers if it wanted to. This is obviously not what we want.
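To make the "observably equivalent" worry concrete, here is a minimal sketch. The names `ZERO_INIT`, `ONE_INIT` and `read_initial_values` are made up for illustration. If initializers carried no guarantee at the AM's first step, nothing would force the two reads below to differ. In a program with no linker tricks, the expectation is that they yield 0 and 1.

```rust
// Two mutable statics that differ only in their initializers.
static mut ZERO_INIT: u32 = 0;
static mut ONE_INIT: u32 = 1;

/// Reads both statics by value, without creating references to them.
fn read_initial_values() -> (u32, u32) {
    // SAFETY: nothing in this program writes to either static, and no
    // other thread accesses them.
    unsafe { (ZERO_INIT, ONE_INIT) }
}

fn main() {
    let (zero, one) = read_initial_values();
    println!("{zero} {one}");
}
```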

The way the analogy breaks down with cdylibs is that a call to a rust function from FFI is not the AM's first step. There is still some subtlety to define what exactly was happening in the AM before the first line of rust code, but I imagine that it would be something like "arbitrary RAM-legal operations, such that the state at the call satisfies the appropriate calling convention". Since it's an unsafe interface we can freely add additional safety constraints like "and FOO should have value 0" to control the state of the machine when the function is called; for lang_start it is unclear how we could do anything like that because it's a safe function.
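As a rough sketch of what such an added safety constraint could look like on a cdylib-style entry point: the function name `demo_entry` and the wording of its contract are hypothetical, and the `#[unsafe(no_mangle)]` spelling assumes a toolchain from 1.82 on (older ones write `#[no_mangle]`).

```rust
static mut FOO: u32 = 0;

/// Hypothetical entry point that foreign code would call first.
///
/// # Safety
/// The caller must guarantee that `FOO` still holds its initializer
/// value (0) and that no other thread accesses `FOO` during this call.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn demo_entry() -> u32 {
    // SAFETY: exclusive access to FOO is part of this function's contract.
    // The contract also lets the body rely on the value read being 0.
    unsafe { FOO }
}

fn main() {
    // SAFETY: nothing has written to FOO, and the program is single-threaded.
    let seen = unsafe { demo_entry() };
    println!("{seen}");
}
```

Because `lang_start` is a safe function, there is no equivalent place to attach a contract like the `# Safety` section above.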

@RalfJung
Member

RalfJung commented Mar 15, 2023

> I think that mutable statics still need to have the values they were promised to have in the initializer unless an AM-visible write occurred to them. That write might have been caused by FFI or asm!, but if the value isn't there at lang_start then it's not clear where the value goes at all, in which case a compiler which just ignores all static mut initializers would be conforming and that's not ok.

No that's not correct. Just because the user is allowed to change the value via arbitrary means before lang_start, doesn't mean the compiler is allowed to do that.

For the same reason your "observably equivalent" argument doesn't hold up. This issue is not about giving the compiler more choice, it is purely about giving the user more choice. @CAD97 expressed it pretty well I think, it's basically a user-decided machine step that happens before the lang_start stack frame is pushed.

@digama0

digama0 commented Mar 15, 2023

> For the same reason your "observably equivalent" argument doesn't hold up. This issue is not about giving the compiler more choice, it is purely about giving the user more choice. @CAD97 expressed it pretty well I think, it's basically a user-decided machine step that happens before the lang_start stack frame is pushed.

I mean, I'm fine with building the equivalent of an empty asm! block into the language semantics for the start of the machine, but that still leaves me confused as to how this actually affects the spec. What exactly is the compiler promising when we compile a regular rust program with a static mut and no linker shenanigans? How precisely are you going to prevent a compiler that ignores initializers from being counted as conforming?

A completely user-decided machine step at program start is also a really heavy hammer. If you aren't careful, you might make it impossible for the compiler to produce a working program under those conditions since the compiler doesn't know what the user intends there.

@RalfJung
Member

I'm confused about why you are bringing up this question since it doesn't change with this issue at all? In the initial machine state (before that new user-defined step we are discussing) the static mut has the value given by the initializer.

> But I still agree that the proper way to model the original use case here is almost certainly an extern static allocated by the host linker shenanigans rather than managed within the AM.

That seems to be hard to do without a C toolchain, or something? I don't claim to understand the usecase that triggered the discussion here.^^

@digama0

digama0 commented Mar 15, 2023

> I'm confused about why you are bringing up this question since it doesn't change with this issue at all? In the initial machine state (before that new user-defined step we are discussing) the static mut has the value given by the initializer.

This user-defined step probably needs quite some work to unpack, since it is basically what we have been calling "linker shenanigans". Somehow the spec has to interact with linker scripts, and I guess it is this code that would actually see the values of static initializers and have an opportunity to change them.

As it relates to the title question, I think the answer should still be "yes" provided there are no linker scripts (or only the default one). A program like:

```rust
static mut FOO: u8 = 0;
fn main() {
    println!("{}", unsafe { FOO });
}
```

with no special linking stuff should be required to print 0 (and the user should not be able to get this to print 1 by wishing on a star). Anything which changes the output of this program needs to involve changing an actual input to rustc (+ linker + dynamic linker).

@digama0

digama0 commented Mar 15, 2023

> > But I still agree that the proper way to model the original use case here is almost certainly an extern static allocated by the host linker shenanigans rather than managed within the AM.
>
> That seems to be hard to do without a C toolchain, or something? I don't claim to understand the usecase that triggered the discussion here.^^

If true, this seems like a language issue. It should be possible to declare "extern statics" with link_section using only rust and a linker script (not that I know much about the latter). I think the OP was talking about using C to allocate the static itself, which would be just as incorrect as it is in rust.

@bjorn3
Member

bjorn3 commented Mar 15, 2023

> A program like:
>
> ```rust
> static mut FOO: u8 = 0;
> fn main() {
>     println!("{}", unsafe { FOO });
> }
> ```
>
> with no special linking stuff should be required to print 0 (and the user should not be able to get this to print 1 by wishing on a star).

That is satisfied by my suggestion in #397 (comment) to require that only exported statics can be mutated by external processes. If you were to add pub or #[no_mangle] I think it should be allowed to print something else, but without both I agree that it should only print 0.
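The two cases being distinguished can be sketched side by side. The names `PRIVATE_FOO`, `EXPORTED_FOO` and `read_both` are illustrative only, and `#[unsafe(no_mangle)]` assumes a toolchain from 1.82 on (older ones write `#[no_mangle]`).

```rust
// Not exported: under the proposal, only code in this crate may change it,
// so it must still equal its initializer when Rust code first runs.
static mut PRIVATE_FOO: u8 = 0;

// Exported under a fixed symbol name: foreign code could look it up
// (for example through the dynamic linker) and write to it before main.
#[unsafe(no_mangle)]
pub static mut EXPORTED_FOO: u8 = 0;

/// Reads both statics by value.
fn read_both() -> (u8, u8) {
    // SAFETY: this sketch is single-threaded and nothing here writes to
    // either static.
    unsafe { (PRIVATE_FOO, EXPORTED_FOO) }
}

fn main() {
    let (private, exported) = read_both();
    println!("{private} {exported}");
}
```

When nothing external touches `EXPORTED_FOO`, both reads give the initializer. The proposal only differs in which of the two the compiler may assume that about.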

@digama0

digama0 commented Mar 15, 2023

Ah, sorry I should have made it pub. I am talking specifically about the case where it is public but nothing is linked in which could actually do any funny business. In such a case, external processes are allowed to mutate it, but this is not the same as them having already done so. The AM sees all, and unless such a process actually runs no mutation should happen.

@bjorn3
Member

bjorn3 commented Mar 15, 2023

In case of pub for bin's specifically I can agree. But in case of #[no_mangle] you can't prove that the libc that is linked in the above program doesn't do dlsym to lookup FOO and mutate it.

@digama0

digama0 commented Mar 15, 2023

I agree, but that's more of a compiler optimization perspective than what the AM actually does. The compiler doesn't know what the dynamic linker is doing either, so it can only do e.g. constant folding if it knows or has an agreement with all such code that the relevant statics will not be touched, but the AM can say more precisely "The program prints 0 as long as libc et al don't mutate FOO".

@bjorn3
Member

bjorn3 commented Mar 15, 2023

> but the AM can say more precisely "The program prints 0 as long as libc et al don't mutate FOO".

Sure, but in the same way we can also say a static is equal to the value in the initializer at the start of the program so long as no external process mutated it rather than a blanket ban on all mutation from external processes.

@digama0

digama0 commented Mar 15, 2023

One thing about external mutations of a static mut is that it would be a data race under most circumstances, unless the program start is considered as synchronizing with said process (probably true for linkers and false for actually concurrent mutation). So depending on how the static is accessed in rust you might still be able to say some things about what can happen to the value without UB.
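A small in-language analogy for the "synchronizes with program start" case, offered as a sketch rather than a model of what a linker does: a non-atomic write that finishes before a thread is spawned is ordered before everything that thread does, so the later non-atomic read is not a data race. `SharedCell` and `write_then_start_reader` are names invented for this example.

```rust
use std::cell::UnsafeCell;
use std::thread;

/// Wrapper so an `UnsafeCell` can live in a `static`.
struct SharedCell(UnsafeCell<u32>);

// SAFETY (assumption of this sketch): every access to the cell is ordered
// by thread spawn/join, so no two accesses are concurrent.
unsafe impl Sync for SharedCell {}

static DATA: SharedCell = SharedCell(UnsafeCell::new(0));

/// Writes `value` before the reader thread exists, then reads it there.
fn write_then_start_reader(value: u32) -> u32 {
    // SAFETY: no other thread that touches DATA has been started yet.
    unsafe { *DATA.0.get() = value };
    // Spawning a thread synchronizes with the start of that thread, so the
    // write above happens-before the read below.
    let reader = thread::spawn(|| unsafe { *DATA.0.get() });
    reader.join().expect("reader thread panicked")
}

fn main() {
    println!("{}", write_then_start_reader(7));
}
```

Had the write run concurrently with the reader instead of before the spawn, the same two accesses would be a data race and therefore UB.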

@bjorn3
Member

bjorn3 commented Mar 15, 2023

In the specific use case that started this discussion the "mutation" synchronized with the program start. But I agree that it can be a data race in the general case.

@RalfJung
Member

RalfJung commented Mar 18, 2023

> with no special linking stuff should be required to print 0

Yes of course? It's a user-defined step that is added. It models the user doing linker shenanigans. If the user does nothing funny then it implicitly picks the identity step and then of course this prints 0.

I feel I must be missing something since this all seems obvious and it's the easy case. The case I was hoping to discuss here is the one where the step actually does something. The question is whether the compiler gets to assume that the initial state of the AM looks a certain way or not. The more control we want to give the user over this initial state, the less the compiler can assume. If we say nothing, then the initial state is fixed by the Rust program written (e.g. in MiniRust, this happens here), which gives the compiler the license to make assumptions about it; this discussion is about whether we are fine with that or whether we want to add the infrastructure that is required to not have a single fixed initial state.
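One way to picture the user-defined step is a toy model in which the initial state is built from the initializers and then handed to a user-chosen function before anything else runs. This is only a sketch: `MachineState` and `start_machine` are invented for illustration and are not MiniRust's definitions.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for the part of the AM state that holds statics.
#[derive(Clone, Debug, PartialEq)]
struct MachineState {
    statics: BTreeMap<&'static str, u8>,
}

/// Builds the state fixed by the program's initializers, then applies the
/// user-chosen pre-start step before "lang_start" would be entered.
fn start_machine(
    initializers: &[(&'static str, u8)],
    pre_start: impl FnOnce(&mut MachineState),
) -> MachineState {
    let mut state = MachineState {
        statics: initializers.iter().copied().collect(),
    };
    pre_start(&mut state);
    state
}

fn main() {
    let program = [("FOO", 0u8)];
    // No linker tricks: the identity step leaves the initializer in place.
    let plain = start_machine(&program, |_| {});
    // Linker tricks: the step overwrites FOO before the program starts.
    let patched = start_machine(&program, |state| {
        state.statics.insert("FOO", 1);
    });
    println!("{} {}", plain.statics["FOO"], patched.statics["FOO"]);
}
```

In this picture the compiler can assume a fixed initial state only if `pre_start` is known to be the identity.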

@digama0

digama0 commented Mar 18, 2023

> > with no special linking stuff should be required to print 0
>
> Yes of course? It's a user-defined step that is added. It models the user doing linker shenanigans. If the user does nothing funny then it implicitly picks the identity step and then of course this prints 0.

I'm saying the user should not be able to perform such a step without actually doing something that has an impact on the literal bytes going in to the compiler. That is, it can't be something that we just say occurs as a result of (the equivalent of) an empty asm block. When the user actually does linker shenanigans there will be a change to a linker script and this script is an input to the compiler (construed broadly).

> I feel I must be missing something since this all seems obvious and it's the easy case.

The part that makes it not easy is that if we say that the user can do linker shenanigans with their mind then the compiler can't link programs correctly that have no indication that linker shenanigans are happening. That is, the situation I am talking about is one where the user says that linker shenanigans are happening but the compiler is not made aware of this in any way and hence fails to account for it.

> The case I was hoping to discuss here is the one where the step actually does something. The question is whether the compiler gets to assume that the initial state of the AM looks a certain way or not. The more control we want to give the user over this initial state, the less the compiler can assume. If we say nothing, then the initial state is fixed by the Rust program written (e.g. in MiniRust, this happens here), which gives the compiler the license to make assumptions about it; this discussion is about whether we are fine with that or whether we want to add the infrastructure that is required to not have a single fixed initial state.

Put another way, I would like us to define things such that as long as from the compiler's perspective it is a regular program, in the sense that all the inputs indicate that no funny business is happening, then the compiler should also be licensed to assume that no magic user-defined step breaks this, and optimize accordingly. This is a nontrivial constraint - it means that if the user wants to do linker shenanigans they need to use some kind of attribute or asm block or something. These kind of user-defined semantics need to be opt-in.
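The opt-in rule can be sketched as a check on what the compiler sees in a declaration. The `host_writable` flag stands for whatever marker would be chosen (an attribute, a custom link section, an asm block). It is hypothetical, as are `StaticDecl` and `assumable_initial_value`.

```rust
/// Toy description of one static as the compiler would see it.
#[derive(Clone, Copy, Debug, PartialEq)]
struct StaticDecl {
    name: &'static str,
    initializer: u8,
    /// Hypothetical opt-in marker for "the host may change this before start".
    host_writable: bool,
}

/// Value the compiler may assume the static holds at start-up, if any.
fn assumable_initial_value(decl: &StaticDecl) -> Option<u8> {
    if decl.host_writable {
        // The user opted in to pre-start changes: nothing can be assumed.
        None
    } else {
        // No opt-in visible in the inputs: the initializer is guaranteed.
        Some(decl.initializer)
    }
}

fn main() {
    let plain = StaticDecl { name: "FOO", initializer: 0, host_writable: false };
    let marked = StaticDecl { name: "BAR", initializer: 0, host_writable: true };
    println!("{}: {:?}", plain.name, assumable_initial_value(&plain));
    println!("{}: {:?}", marked.name, assumable_initial_value(&marked));
}
```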

@CAD97

CAD97 commented Mar 18, 2023

I think it's essentially a requirement that we permit some amount of extra AM state to be provided by the host.

@digama0: what benefit do you actually see requiring code to say "some arbitrary shenanigans happen before main" having? Note that std's initialization in lang_start before the user main is called includes some extern fn calls (e.g. naming the main thread), so unless the compiler can see through those, it can't assume they didn't modify externally visible state, or even call externally linked functions. Separate compilation makes the compiler knowing "nothing has happened yet" extremely niche. Accounting for arbitrary shenanigans going on is the default state of the compiler; any optimizations have to prove that there aren't problematic shenanigans going on. (The purpose of UB is carving out semantic space where the user code promises not to be doing any shenanigans, in order to make proving the absence of shenanigans somewhat practical.)

(With a disclaimer that I'm not doing and know very little about embedded or any other domains that require AM-opaque behavior, and that the footnotes are painfully rambling about this field I know little about (so probably ignore them).)

While we could say that any host-provided resource needs to be declared to the Rust AM via some extern symbol, it's reasonably established that embedded wants to be able to do e.g. VolAddress::new(0x0400_0006) and access a known memory address, instead of using an extern static or similar to inject the location via the linker[1]. There are various ways to model known magic addresses without making them extern static in the face of strict provenance: one is "assume_alloc", where the AM assumes there's an allocated object at some address, but this would presumably take root ownership of the object, which isn't correct for MMIO; the simpler one is just to say that the magic addresses are exposed addresses, and use from_exposed_addr.

However, if we require the AM to enter lang_start in the pure unmodified abstract initialization state, then it starts without any exposed addresses. If we want embedded to be allowed to from_exposed_addr known magic addresses, then the host MUST be able to impact the AM initialization state.
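For reference, this is what the in-program version of exposing an address looks like. It is a sketch using a static as a stand-in for a device register, since a real MMIO address cannot be exercised portably. `from_exposed_addr` is the older name of what std now calls `with_exposed_provenance`. `FAKE_REGISTER` and `read_via_exposed_address` are names made up for this example.

```rust
use std::ptr;
use std::sync::atomic::AtomicU32;

// Stand-in for a device register. A real MMIO address would come from the
// hardware documentation instead of from a Rust static.
static FAKE_REGISTER: AtomicU32 = AtomicU32::new(5);

fn read_via_exposed_address() -> u32 {
    // Exposing the pointer's provenance adds this allocation to the AM's
    // set of exposed addresses and yields a plain integer.
    let addr: usize = FAKE_REGISTER.as_ptr().expose_provenance();
    // From here on only the integer is used, as with a hard-coded address.
    let reg: *mut u32 = ptr::with_exposed_provenance_mut(addr);
    // SAFETY: `addr` refers to a live static whose provenance was exposed
    // above, and nothing else accesses it concurrently.
    unsafe { reg.read_volatile() }
}

fn main() {
    println!("{}", read_via_exposed_address());
}
```

With true MMIO there is no `expose_provenance` call anywhere in the program, which is why the exposure would have to be part of the state the host provides at start-up.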

Starting with some addresses exposed is, to be fair, nearly a trivial capability[7], since it's effectively purely additive (and also only impacts code already reliant on angelic nondeterminism). But still, if we take that capability as a given, then I (as stated before) see no reason why it should be allowed while modifying the initial state of an externally linked static mut isn't; it's already necessary to define a "serialized initialization state" separate from the actual state upon entering lang_start. Additionally, just this capacity is still rather limited; for example, it doesn't necessarily need to extend to calling externally linked functions nor even other arbitrary AM operations (though what could even be observed that isn't static mut or require going through extern fn, I'm not sure, if it even exists). However, the simplest way to model the behavior really is just to say lang_start is an externally linked function like any other, and the host is perfectly capable of doing whatever operations it wants after starting up the AM, which may include doing whatever it wants before calling lang_start.

Because Rust is a "low level" language without a runtime[9], the definition can get away with being that simple. The output of compilation can be as vague as "something which can be linked," and the linker is then in charge of (doing any linker shenanigans and then) turning that into a format which the target is capable of executing. The difference between a bin and a cdylib output can (and imho should) be purely in target-specific details, and that the former exports a lang_start which is expected (but not required) to be the first (and last) AM operation done when the binary is executed in the target-specific manner.

This doesn't cover the further linker shenanigans that people would like to be able to do[12], but those are irrelevant to the topic at hand (initial AM state) and significantly more involved.

Footnotes

  1. Moreover, depending on the exact behavior of the memory location and semantics of extern static, using an extern static could potentially be incorrect for some MMIO. Namely, AIUI extern static are still expected to act like AM-native static, and I'm not confident either way whether that's fully sufficient for all MMIO schemes. In practice I expect it wouldn't ever not work[2] since volatile typically implies very "hands off" optimization, but giving the compiler less potentially falsifiable information makes me more comfortable. You could link in an extern which is the pointer to the MMIO, but now you're requiring the accesses to be indirect even though the address is statically known, just to ensure the Rust AM can't think it's normal memory. Additionally, even with the original location being extern static (instead of the address being a static), you still preclude const manipulation of the address (e.g. for pointer tagging schemes or buffer subsetting) and rely on constant folding optimizations to see through such. (On the other hand, using a static permits ASLR-like schemes, which are probably a good idea even in nano-scale embedded; with virtual memory you can even relocate physical MMIO, and only the kernel managing the memory mapping needs to use the constant physical addresses.)

  2. The closest scenario I can think of at the moment[3] (excluding not applying mut, which is just clearly incorrect usage) is that given an extern static mut of a type bool (any niched type), a volatile read at type u8 could potentially still assume the value is a valid bool if no prior extern calls or volatile has occurred, since the static presumably would have to be a valid bool at startup. However, I don't think that's a possible scenario; I think the reasonable way to model volatile in the AM is by treating x = intrinsics::volatile_load(ptr); as essentially extern::will_read(ptr); x = *ptr; extern::did_read(ptr); (though this isn't quite sufficient on its own[4]) which would prevent the "value must be the same as at startup" fact from being assumed.

  3. ... Actually I came up with a simpler one: just like it's valid to insert spurious reads and writes through a reference or pointer that's been nonatomic&nonvolatile accessed, it's probably valid to insert spurious/speculative reads of a static, even if it's extern and mut. This would be incorrect for some MMIO which has side effects for reads. (No sane compiler would introduce spurious reads if only volatile reads are done, since that would not have any optimization benefit, probably, but we're talking semantics, not likely behavior.)

  4. Volatile is a complicated mess (and volatile access to Rust Allocated Objects even moreso) since its definition is basically just "don't optimize this." AIUI it's considered a nonatomic sequence point, but it's unclear the exact extent of what that means; for example, a model which permits volatile access to do other AM operations (i.e. as if an extern function call) would include potentially causing atomic synchronization[5]. The best I can see that maintains the ability to reorder nonvolatile potentially-observable operations over a volatile one is for any AM-impacting operations to be queued up to happen at the next extern function call, rather than immediately on the volatile access[6].

  5. Do we need volatile + atomic pointer read/writes? Or would the way to model a synchronization-causing volatile access be combining volatile access with compiler fences? Is this even an actual possibility or just me hallucinating a sufficiently cursed scenario? The most tricky case I know I saw someone mention was a scheme where writing a pointer to some MMIO would trigger a coprocessor to write into the buffer at that address (essentially in another thread of execution); that feels like it requires an atomic Release edge to have any chance of functioning.

  6. (tangential thought to a tangential thought to ...:) In a world where we say extern calls only do what they actually do (i.e. as cross-language link time optimization gets more capable), we probably need an asm!-like / compiler-fence-like intrinsic (that is visible to LTO) that maintains the "anything goes" semantics. An asm! which clobbers everything plus a compiler fence might actually be sufficient (modulo LTO-visibility); at this point this is an exercise in me making up increasingly worrying cursed edge cases to get mad about.

  7. On the other hand, it's potentially not as simple as just starting with a non-empty exposed addresses list, depending on how exactly you model memory on the AM. If memory is always AM-managed state (and for statics at least, this makes sense; for volatile MMIO it's maybe somewhat strenuous[8]), then the memory state of any extern statics need to be loaded into the AM state. (The alternative is that "actualized" memory allocations and accesses are serviced by the host at least partially outside the scope of the AM, but even then, "nonactualized" (elided) allocations/accesses are still entirely within the AM, so the AM-managed memory state still has to exist in some capacity on top of the "actualized" host-provided memory.)

  8. While writing this I've considered the modeling of volatile MMIO a lot; almost certainly beyond what I'm qualified to. A modeling for volatile load of extern::will_load(ptr); _0 = *ptr; extern::did_load(ptr); makes some amount of sense. These magic sentinels would still need to be a bit different than other external function calls, perhaps making this "lowering" not so useful, but consider that we can then model MMIO as similar to the map/unmap of GPU memory: will_load does the actual physical MMIO and puts it into AM memory behind ptr, and did_load invalidates the object for accesses until the next volatile access happens. (Justifying treating nonvolatile accesses as UB under this model is what makes the sentinels not normal extern functions, along with the fact they're presumably not allowed to cause any sort of atomic synchronization. It's thus probably simpler to say that part of a pointer's provenance is a bit saying if it's an MMIO address; if it isn't, accesses are normal and go to AM memory state; if it is, accesses are extern fn calls to the host to perform the MMIO.)

  9. Allegedly; definitions are hard. For the purpose here, it means that you can compile Rust code to a native library and call into it without going through lang_start; that is, there isn't meaningful "life before main." (Note that std is not the language for this purpose, despite being highly integrated, and as it's the definer of lang_start, it can, and does[10], do some work before calling the user main.) And yes, since C++ allows you to define runtime initialization for static globals that runs before main, it's not "low level" by this definition (if you, your upstream, or std are using that functionality). The other, more key property of "low level" languages is the "zero overhead" rule that you don't pay for what you don't use, which C++ does reasonably well (at least if you include the nonstandard compiler flags to disable RTTI and exceptions' overhead).

  10. Namely, the std runtime may, depending on the target/OS, do some runtime initialization of stack probes, stash environment state (e.g. argv/argc), name the main thread, set signal handlers, and/or other target/OS specific initialization. That std says any std functionality may or may not work before/after a Rust-defined main is essentially necessary, but it's a known unfortunate deficiency that we don't define what functionality may cause problems outside of main. Despite common knowledge saying it's not, both are possible with only stable Rust: life before main is accomplished by writing main in another language and calling into Rust[11], and life after main can be achieved on at least some targets via threadlocal destructors (notably including Windows, but excluding pthread unixes).

  11. It'd probably be beneficial to define some sort of unsafe fn std::os::{os}::init(state: InitState) so that there's at least the availability to initialize the state (or at least that which doesn't impact Rust-external global state, so yes to stashing argc/argv, no to changing signal handlers) without using a Rust-defined entry point. Currently, this is in std::rt::init/std::sys::init. (The actual application would be responsible for defining and calling an extern "C" entry point that invokes the initialization.) It'd also be a good place to document what functionality does(n't) rely on that initialization (although defining the negative while allowing new functionality which may require it is difficult).

  12. Notable examples include ctor (running code before/after main; accomplishable via this decision if we allow arbitrary AM actions before/after main) and linkme::DistributedSlice (utilizing custom linker sections to get a slice of multiple statics).
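The volatile lowering described in footnotes 2 and 8 can be written out as ordinary Rust. This is a sketch only: `will_read` and `did_read` are hypothetical sentinels, and `std::hint::black_box` is a best-effort stand-in for a truly opaque extern call, not a guarantee.

```rust
use std::hint::black_box;
use std::ptr;

/// Hypothetical sentinel: the host may perform the physical I/O here.
#[inline(never)]
fn will_read(ptr: *const u8) {
    black_box(ptr);
}

/// Hypothetical sentinel: the host is told the read has finished.
#[inline(never)]
fn did_read(ptr: *const u8) {
    black_box(ptr);
}

/// The modeled lowering: an ordinary load bracketed by the two sentinels.
///
/// # Safety
/// `ptr` must be valid for a one-byte read.
unsafe fn modeled_volatile_load(ptr: *const u8) -> u8 {
    will_read(ptr);
    // SAFETY: validity of `ptr` is the caller's obligation.
    let value = unsafe { *ptr };
    did_read(ptr);
    value
}

fn main() {
    let byte = 42u8;
    // SAFETY: `&byte` is valid for reads for the whole call.
    let modeled = unsafe { modeled_volatile_load(&byte) };
    // SAFETY: as above.
    let real = unsafe { ptr::read_volatile(&byte) };
    println!("{modeled} {real}");
}
```

On ordinary memory both loads return the stored byte. The model only matters for what the compiler may assume around the access.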

@digama0

digama0 commented Mar 19, 2023

There is a lot to respond to there, but I guess a lot of it would take us off topic so I will restrict attention to static, extern static and volatile accesses.

@digama0: what benefit do you actually see requiring code to say "some arbitrary shenanigans happen before main" having? Note that std's initialization in lang_start before the user main is called includes some extern fn calls (e.g. naming the main thread), so unless the compiler can see through those, it can't assume they didn't modify externally visible state, or even call externally linked functions. Separate compilation makes the compiler knowing "nothing has happened yet" extremely niche. Accounting for arbitrary shenanigans going on is the default state of the compiler; any optimizations have to prove that there aren't problematic shenanigans going on.

I agree that in the majority of cases the compiler won't be able to propagate the initial value of a static. That is a natural consequence of there being so many potential middle men before getting to the main function, and if that's the reason why the compiler can't optimize then so be it. But I don't think that this is sufficient reason to make it impossible to ever do this kind of reasoning - it should be possible to have "hermetic" environments like Miri, in which these kinds of reasoning steps are more feasible.

Another example was brought up by @bjorn3 : propagating the value of a private static. I think there is still some work to be done to define the aliasing model to allow language-private statics to actually be UB to access without permission so that this is feasible, but I think we will want something like that anyway to enable moving const to static. (Of course, most of the time it's not a mutable static so the situation is much simpler. But I don't think static mut should be treated any differently from a static with interior mutability, and in particular it should be a Rust allocation.) Assuming we have a useful provenance for private statics, we should be able to say that even the extern code in std::rt::init can't touch it.

While we could say that any host-provided resource needs to be declared to the Rust AM via some extern symbol, it's reasonably established that embedded wants to be able to do e.g. [VolAddress::new(0x0400_0006)](https://docs.rs/voladdress/latest/voladdress/) and access a known memory address, instead of using an extern static or similar to inject the location via the linker.

This brings me to "Rust allocations". The basic premise is that some memory is AM-managed and some is provided externally. Heap and stack memory are rust allocations, VolAddress::new(0x0400_0006) is an external allocation. extern static is used to point to an external allocation, static and static mut point to a Rust allocation.

An external allocation is memory which is made available to the AM without there having been an actual allocation event. They may be available at the start of execution, or they may be made available during the course of execution as a result of an OS call. These are usually (always?) exposed addresses, although it includes things like the top of the stack (argv, envp) which are regular memory which is "just there" at startup.

We usually cannot make any guarantees about what external allocations exist, this is a function of the host parameters. (Although one would hope that the allocator at least knows where they are and how to avoid them!) They don't need to be declared in advance, and this is how I would justify VolAddress. You can also use extern statics to access an external allocation - this is basically the same situation, we are just using the static to help us find the location of the allocation. (It is unclear whether these statics should have nontrivial provenance to prevent leaping from one external allocation to another, but I would err on the side of making them all accessible with the same exposed tag.)
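The distinction between the two kinds of memory can be sketched as follows. This is a minimal illustration for a hosted target, where a local variable stands in for a hardware register; the names `TICKS`, `read_reg` and `demo` are made up here. On real embedded hardware the pointer would come from a fixed, host-provided address such as the 0x0400_0006 mentioned above rather than from a local.

```rust
use std::ptr;

// A "Rust allocation": the AM creates this memory and knows its initializer.
static mut TICKS: u32 = 7;

// Volatile read through a raw pointer. On an embedded target `addr` would be
// a fixed address that the host simply provides (an "external allocation");
// nothing in the Rust program ever allocates it.
//
// Safety: `addr` must be valid and aligned for a 2-byte read.
pub unsafe fn read_reg(addr: *const u16) -> u16 {
    unsafe { ptr::read_volatile(addr) }
}

pub fn demo() -> (u32, u16) {
    // Stand-in for a device register so the sketch runs on a hosted target.
    let fake_reg: u16 = 0x1234;
    // Copy the current value out of the static without creating a reference.
    let ticks = unsafe { TICKS };
    let reg = unsafe { read_reg(&fake_reg) };
    (ticks, reg)
}
```

The volatile read tells the compiler nothing about where the memory came from, which is the point: for the external case the compiler has no initializer to reason from, while for `TICKS` it does.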

But still, if we take that capability as a given, then I (as stated before) see no reason that it should be allowed, but modifying the initial state of externally linked static mut isn't; it's already necessary to define a "serialized initialization state" separate from the actual state upon entering lang_start.

I assume by "serialized initialization state" you mean something like an abstraction of the binary executable generated by rustc. In that case, I suppose you would say the following. The only guarantee of the example program from before is that it prints the value of FOO which exists in the "actual state" upon entering lang_start (assuming we can at least argue that rt::init doesn't change FOO); note that this is not an invariant which depends on the initializer value in any way. Even so, the compiler does have to promise that 0 actually shows up in this "serialized initialization state", and hence changing the 0 to a 1 would be required to produce a different binary even though it has exactly the same runtime guarantee. Is that right?

While I generally approve of the strategy of separating the serialized initial state from the runtime initial state, I think we will need to be very careful about introducing too much flexibility in this step, because it has almost complete debugger-like freedom to observe anything at all about how the binary is structured, and this is an optimization killer. The direction I am advocating here is that "Rust allocations" have to be read and written using mechanisms that the compiler is aware of, while "external allocations" are set up in a host-dependent way and the compiler has little insight into them.

@RalfJung
Member

@digama0 I still don't understand what you are concerned about here.

Of course if the user declares there to be some "initial step" that is happening, it has the responsibility of actually arranging the physical machine state to match. This works exactly like asm blocks, where the user declares which state change they incur to the AM, and has to ensure the actual asm content matches that.

it means that if the user wants to do linker shenanigans they need to use some kind of attribute or asm block or something. These kinds of user-defined semantics need to be opt-in.

What is the benefit of making them opt-in?

The part that makes it not easy is that if we say that the user can do linker shenanigans with their mind then the compiler can't link programs correctly that have no indication that linker shenanigans are happening.

The compiler as it works today implements the suggested spec correctly (to my knowledge). So it is indeed trivial to link such programs correctly. You just have to not make any arguments based on "I know the initial value of this static mut, hence X".

Are we even talking about the same situation? I am still puzzled about you perceiving the situation so differently. You keep claiming this spec is impossible to implement, but it is in fact already implemented by today's rustc.

@digama0

digama0 commented Mar 19, 2023

The compiler as it works today implements the suggested spec correctly (to my knowledge). So it is indeed trivial to link such programs correctly. You just have to not make any arguments based on "I know the initial value of this static mut, hence X".

Can you explain in more detail how you would argue that a compiler that puts the wrong values in static mut initializers and says "user, you deal with it" is not conforming?

Are we even talking about the same situation? I am still puzzled about you perceiving the situation so differently. You keep claiming this spec is impossible to implement, but it is in fact already implemented by today's rustc.

I don't think it is impossible to implement this, but it does make some future things impossible that I would rather not bake into the spec. With this, Miri and other closed-world targets have to buy in to nondeterminism even when none is requested. (I hope you would agree that starting every Miri program with from_exposed_addr wouldn't be a good idea?) I would like it to be possible to have closed-world rust interpreters which include static mut (and a compiler that can in principle propagate results across ~whole programs). Nothing about static mut intrinsically implies "compiler barrier" to me; even the extern calls are only compiler barriers for practical reasons.

What is the benefit of making them opt-in?

This makes the difference between this kind of indeterminate value being a very rare situation confined to people who like to play with linker settings, to something that every single rust program (containing a static mut or static with interior mutability) will have to consider. Sure, you can say this is easy for the user because they can just make up safety conditions, but conversely it is an impassable compiler barrier, even if you know absolutely everything about the program and execution environment.

@CAD97

CAD97 commented Mar 19, 2023

a compiler that puts the wrong values in static mut initializers

That's absolutely not what anyone is saying? If you don't do any shenanigans, the initial value of static mut absolutely is guaranteed to be whatever the constant initializer is.

When we say "isn't required," we're talking about the guarantee from the user to the compiler, not the compiler to the user. A compiler would still be wrong to initialize static mut to a different value without user action.

It's not "you have to accept compilers will change the value," it's "you're allowed to change the value."

@RalfJung
Member

With this, Miri and other closed-world targets have to buy in to nondeterminism even when none is requested.

Miri will only support the case where the statics match the value declared in Rust. Developers that decide to use another initial state won't be able to directly use Miri.

This is not fundamentally new; developers that run Rust in a larger context (e.g. with threads running non-Rust code) similarly cannot use Miri.

This makes the difference between this kind of indeterminate value being a very rare situation confined to people who like to play with linker settings, to something that every single rust program (containing a static mut or static with interior mutability) will have to consider.

That's not accurate. Only developers that want to use this feature have to even consider this. I thought we were all pretty clear about that?

You are entirely misrepresenting what we are suggesting to do here, and I don't understand where this misunderstanding comes from.

@digama0

digama0 commented Mar 20, 2023

Miri will only support the case where the statics match the value declared in Rust. Developers that decide to use another initial state won't be able to directly use Miri.

And in what way is that conforming to the spec? Suppose the user does it anyway and gets a "wrong" result because the initializer value didn't actually change after some linker shenanigans that "should have worked". Did they cause UB, or is the target not conforming? Can targets say "it is UB to do linker shenanigans" and thereby enable compiler optimizations that assume the absence of such?

That's not accurate. Only developers that want to use this feature have to even consider this. I thought we were all pretty clear about that?

This is not a question of the user side of things. I agree that this is not something most users will have to consider. I'm talking about the possibilities for compilers on the "analyze and propagate everything" side of the spectrum, on platforms which are sufficiently locked down to make it feasible to do that, like Miri.

It's not "you have to accept compilers will change the value," it's "you're allowed to change the value."

Or to flip the perspective, "compilers have to accept you will change the value". Hence Miri is not correct here for not accepting a change to the value. I'm arguing that compilers/targets should be permitted to say "we do not accept changes to static initializers" as an implementation-defined matter, as there are some targets (like Miri) which cannot reasonably support this.

@saethlin
Member

I do not think that it is useful to discuss Miri here. rustc consumes linker flags, there's no reason I see that we couldn't have miri consume those same flags, figure out what effects they are supposed to have, and then modify the MIR or interpreter state before execution begins for real, to simulate the effect of a linker. I do not think that this is categorically different from the existing shims (the main difference being that the API for linker shenanigans is much worse). I wouldn't be surprised if we eventually do some version of this. Rust is still quite young.

@digama0

digama0 commented Mar 20, 2023

As I've been saying, the issue is not the flags that are passed to rustc, but rather the ones that aren't, as this proposal is enabling linker shenanigans that the compiler doesn't even know about. If it's an input to the compiler then there is no problem, it can adapt the generated code in those situations.

@digama0

digama0 commented Mar 21, 2023

Do we need a thread about the question of whether linkers exist, or something like that? If anything that seems to be the core of disagreement here: @digama0 seems to use a model where rustc directly executes a program, so linkers (and a bunch of other aspects of how real computers work) are just not a thing; the question was phrased with a mindset closer to the real world where rustc merely produces an artifact that can be subject to further processing before turning into a running program.

(I can't resist responding to this, sorry...) It is not so much that I don't believe linkers exist but rather that the purpose of the spec is to relate executions directly to source code, which means that the whole process "rustc -> linker -> dynamic linker -> execution on target" that turns source code into behavior is within the domain of the spec, and the division between "rustc" and "linker" is an implementation defined thing which is not any more special than the division between "MIR" and "codegen" or other internal compiler stratifications. Indeed as you point out Miri is an interpreter which means that there is no such division, it goes straight from source code (well there is some pre-compilation to MIR) to behavior without all the intervening steps.

If the user decides to stick their hands in the middle of that carefully orchestrated sequence and do something to the binary (before, during, or after linking), obviously we need to control what they are legally allowed to do because you can get arbitrary behavior pretty easily by doing that. "Thou shalt not modify rust allocations without a linker attribute" seems like a sensible rule in this regard, possibly requiring refinement to cover plausible use cases.

We could say (as suggested before) that such games are only allowed for statics carrying certain attributes, such as no_mangle or link_section. (I'm not sure if link_section makes a static "public" in some sense; it seems to be the attribute used in the original example.)

I agree. By default, if you don't put any fancy attributes on a static then linker shenanigans should be quite limited (dare I say prohibited?) such that they can be treated roughly like a let mut in a stack frame far away. (I assume that you would not consider it okay to modify a binary to change a line like mov rax, 42 which initializes a let mut x = 42; in a function body to mov rax, 37. As I see it unsolicited mutation of static initializers is just as bad.)

I don't think link_section makes a static "public", but it does make it "special" in ways that might violate compiler assumptions (like making the value 0 or uninit), so the compiler would need to be pretty hands-off unless it knows very explicitly what the attributes do.

...Actually nevermind, link_section definitely can make a static public, that's exactly what linkme::DistributedSlice does. The individual allocations in the slice are all placed in the same section so they can be iterated over.

@saethlin
Member

saethlin commented Mar 21, 2023

If you can give an example of any C/C++/similar compiler assuming the initial value of a mutable static when entering main, I'll at least reconsider my position here. I tested a couple on godbolt and didn't see it happen.

Optimization of C/C++ statics is often brought up as an example of why LTO is useful. As far as I'm aware, godbolt always compiles code as if it is compiling a library, so I don't think you'd ever see this optimization in godbolt.

Example: https://youtu.be/p9nH2vZ2mNo?t=345 Note that Teresa says "the compiler tells the linker that main is the only exported symbol"

I know this link isn't exactly what you're looking for, we'd probably want to re-create this situation, and I'm not certain that LTO is part of this discussion.

@saethlin saethlin reopened this Mar 21, 2023
@saethlin
Member

saethlin commented Mar 21, 2023

sigh that's what I get for trying to type out a response on a phone I guess (I tried to scroll and closed the issue)

@thomcc
Member

thomcc commented Mar 21, 2023

Rust already can do a lot of these sorts of optimizations, due to its compilation model giving it more information out of the box than C++ compilers have with LTO. TBH, I don't know how many we're leaving on the table that this would change.

That said, I think it's reasonable to require things like #[no_mangle], #[link_section], or possibly even just #[used] to be present if folks want to do linker shenanigans on statics.

@thomcc
Member

thomcc commented Mar 22, 2023

Rust already can do a lot of these sorts of optimizations

Also, a common optimization that C++ compilers want to do there is turn a mutable static that's never written to into a constant. I suspect that Rust has far fewer of these in the first place — people don't tend to use mutable statics in Rust if they're actually read-only.

@comex

comex commented Mar 24, 2023

Private static muts do already get optimized by LLVM. If you go to the playground and show assembly for

static mut X: i32 = 42;
pub fn foo() -> i32 { unsafe { X } }

the generated code just returns 42 directly, because LLVM sees that X is never written to. This changes if you make X public (edit: or #[no_mangle]).
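The private/public contrast can be sketched side by side. The names below are made up, and whether a load is actually folded to a constant is an optimizer decision that depends on the rustc/LLVM version and flags; the snippet only shows the two shapes being compared, not a guarantee about codegen.

```rust
// Private and never written anywhere in this crate: the optimizer can see
// every access, so it is free to treat reads as the constant 42.
static mut PRIVATE_X: i32 = 42;

// Public: code outside this crate may write to it, so a read has to be a
// real load from memory.
pub static mut PUBLIC_X: i32 = 42;

pub fn read_private() -> i32 {
    unsafe { PRIVATE_X }
}

pub fn read_public() -> i32 {
    unsafe { PUBLIC_X }
}

// A write that only the public static ever receives.
pub fn bump_public() {
    unsafe { PUBLIC_X += 1 }
}
```

Comparing the assembly of `read_private` and `read_public` in an optimized build is the quickest way to check what a given toolchain does.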

@thomcc
Member

thomcc commented Mar 24, 2023

Right, there's no need for LTO to determine such a thing for private variables. LTO is used to tell that it's globally true.

@CAD97

CAD97 commented Mar 24, 2023

(footnotes are ignorable context)

To summarize my understanding of the consensus here (and I think everyone did end up agreeing on these points):

  • When "initializing[1]" the Abstract Machine state before execution, the host[2] is responsible for initializing the memory of any statics accessible to the program.
  • For any immutable static (not declared with mut and with type which is Freeze / does not contain UnsafeCell), the initialization value MUST be the value produced by the static's constant initializer. (By the other rules governing access, this value will not be written to during execution.)
  • For any static which isn't "exposed" (irrespective of mutability), the same holds: the initialization value MUST be the value produced by the static's constant initializer.
  • For statics which are both "exposed" and mutable (declared as mut or contain UnsafeCell), the initialization value MAY be any value which is bytewise valid for the type of the static.
    • Absent some implementation- or host-specific expression of intent, the initialization value SHOULD be the value produced by the static's constant initializer.
    • (NB: this is especially important for any types where some values satisfy validity but not the user-defined safety invariant.)
  • A static is considered "exposed" if it could be accessed by code outside of the artifact's modules.
    • We necessarily define a concept of what symbols are "externally linked" for staticlib and cdylib compilation; this can probably use the same definition.
    • This includes at least statics with linker annotations such as #[no_mangle], #[export_name], or #[link_section][3].
    • This excludes at least statics which have default linkage behavior and are pub(crate) or less visible.
    • statics which are pub and have default linkage behavior could potentially go either way[4].
      • statics which are pub, have default linkage behavior, but are not publicly reexported/reachable are an odd edge case, but should probably be treated uniformly with reachable pub static[5].
      • What symbols, if any, are "externally linked" in a bin crate type is also a weird edge case where the answer is unclear[6]. As are pub items from libraries which are not reexported by the bin, I suppose[4].
  • Whether bytes which are beside an UnsafeCell (part of a type which contains UnsafeCell / is !Freeze) are allowed to be mutated without mut is a separate question w.r.t. how tightly we track UnsafeCell. (I have Thoughts here I need to finish writing down[7].)
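The MUST/MAY rules in the list above reduce to a small decision table. The sketch below encodes it; `InitRule`, `StaticInfo` and `init_rule` are hypothetical names for illustration only, and the precise definition of "exposed" is left as an input because the list itself leaves it open.

```rust
/// What the host may put in a static's memory before the first AM operation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InitRule {
    /// The memory MUST hold the value of the constant initializer.
    MustMatchInitializer,
    /// The memory MAY hold any bytewise-valid value (and SHOULD still match
    /// the initializer absent some expression of intent).
    MayDiffer,
}

/// Hypothetical description of a static, for illustration only.
#[derive(Debug, Clone, Copy)]
pub struct StaticInfo {
    pub declared_mut: bool,
    pub contains_unsafe_cell: bool,
    pub exposed: bool,
}

pub fn init_rule(s: StaticInfo) -> InitRule {
    // "Mutable" covers both `static mut` and interior mutability.
    let mutable = s.declared_mut || s.contains_unsafe_cell;
    if mutable && s.exposed {
        InitRule::MayDiffer
    } else {
        InitRule::MustMatchInitializer
    }
}
```

Only the single combination "mutable and exposed" relaxes the requirement; every other combination keeps the initializer value fixed.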

I'd say that the question of whether pub statics without linker annotations should be initialization-mutable should be posed to T-lang proper. (Along with whether #[ctor] is UB, that is, whether the language assumes #[start] is the first code run, since the latter may force the former.) For the other cases, I believe the semantics outlined here should be straightforward to understand and non-controversial to any interested party (up until a precise definition of "exposed" statics). I've not seen anyone say that Rust shouldn't be able to model such "linker shenanigans" without requiring the use of extern static to import an externally defined static, only that "linker shenanigans" should require some sort of marker (i.e. the already existing linking attributes).

Whether it's valid to assume static memory is actually initialization-mutated by #[link_section = ".uninit"] (as in the linked OP) then becomes a case of the user relying on implementation-specific behavior. While I maintain the linked OP case of weakly persistent static memory is probably better modeled with an extern static, this would be sufficient for targets using a linker with the desired semantics[9] for the .uninit section. Miri doesn't use such a linker[10], so initializes the static to the constant initializer's value (in the linked OP, uninitialized bytes).

Footnotes

  1. Initialization of the AM is exclusively a host concept, as initialization is anything which occurs strictly before any AM operation observes or otherwise depends on the initialized state. Or in other words, the AM behaves as-if it always existed in the initialized state until it evaluates the first AM operation. Alternatively, initialization-mutation could be modeled as some sequence of AM operations done by external code before calling lang_start. The choice of which doesn't directly[8] matter for this discussion, but does matter in the large. (Are other Rust functions allowed to be executed before #[start] (e.g. #[ctor])?)

  2. Or replace "host" with "implementation" if you prefer. I split "host" from "implementation" to capture the split between compilation of Rust code to host executable and actual execution of the host artifact. The implementation is responsible for generating an artifact that instructs the host to perform any necessary initialization it wouldn't already do, and the user is responsible for not impeding the implementation from doing so, but is allowed to modify the artifact in ways that do not break the contract.

  4. I think I would allow mutable pub static to be initialization-mutable, as the cost to doing so is relatively low and allows the definition of "exposed" to be more uniform. Also, it just follows from my preferred model of the bin crate type as a variant of staticlib[6]. The alternative is to require the use of a non-default linkage annotation (perhaps #[used]) as a marker for enabling initialization-mutability. This allows lib crate pub static to get the same treatment as statics in the bin crate which can be pub(crate) since they're the compilation root (barring shenanigans treating executables as libraries).

  5. Treating unreachable pub uniformly with reachable pub keeps initialization-mutability a local question instead of global; we have pub(crate) for if this ever actually matters and which should ideally be used for unreachable pub statics already. Also, IIRC the pub-reachability analysis used by the unreachable pub lint still has some false positives of considering reachable pub unreachable when the reexport path is too complicated, so rustc needs to conservatively treat all pub items as reachable anyway.

  6. My main argument for why initialization-mutability should be allowed is that it results from treating binary artifacts identically to library artifacts. If binaries are just libraries with a little extra glue to conventionally start execution at #[start], then there's nothing preventing the user from inserting additional initialization before #[start] such as the mutation of externally accessible statics or even calling externally accessible functions[8] (e.g. via #[ctor]) for "life before main" by user fiat. On the other hand, main/#[start] are already quite special (they don't need to be pub, for one), it's reasonably well known that Rust doesn't have "life before main" (and doesn't guarantee std to function before main), and this model of how binaries (and especially pub in binaries) work is likely uncommon (despite being fairly accurate at least on unixes, IIUC).

  7. The conclusion is that I think UnsafeCell's effect should always be infectious, like PhantomPinned acts today. Then, exposing Freeze as an unsafe trait becomes meaningful (though the privacy/stability concerns still apply) and implementing it could be used to (highly unsafely) cover/remove UnsafeCell mutability. Do the same for UnsafeMut/FreezeMut, splitting that concern out from Unpin. I'm working on writing my reasoning out.

  8. If the model allowing completely arbitrary Rust code to be run before #[start] is taken, this effectively removes any potential "I know the value" optimization based solely on being early in main before any writes have occurred. The concept of an "exposed" static remains relevant to optimization, however; if the static is known not to be modified, it can be optimized to remove the mutability. Knowing this is only possible if the static isn't potentially exposed to external code.

  9. Whether the section actually provides the semantics is yet another question. Given the often surprising behavior of uninitialized memory, even on hardware, this seems risky at best. The infamous MADV_FREE case could potentially be replicated here, where the static is located in an unmapped page that serves loads as (typically) zero until the page is first written to and thus actually mapped to a real page, at which point loads now get the value from that page, which is potentially nonzero. This (loading inconsistent values) is absolutely normal and allowed behavior for uninitialized memory (the observation of which from the Rust AM requires the as-of-yet theoretical freeze operation or else UB from treating uninitialized memory as initialized), but is absolutely prohibited for initialized memory.

  10. It might be a good idea for Miri to warn when encountering a #[link_section] attribute, as that almost certainly means that the code is doing something Miri has little-to-no chance of emulating as the code intends. This might be a bit noisy for something like #[distributed_slice], but seems generally useful.

@digama0

digama0 commented Mar 24, 2023

(I agree with the summary.)

I think I would allow mutable pub static to be initialization-mutable, as the cost to doing so is relatively low and allows the definition of "exposed" to be more uniform. Also, it just follows from my preferred model of the bin crate type as a variant of staticlib.

Just to be clear, I think this would block the LTO optimization demonstrated earlier, right? Is there ever a point at which it is possible to look at the "whole program" and say that even things that are pub are nevertheless local to the executable (because it's not a shared library)? If the static is defined in a library and used in an executable, then I think it would have to be a pub static, so there seems to be an expressive gap without something like #[used].

@bjorn3
Member

bjorn3 commented Mar 24, 2023

Is there ever a point at which it is possible to look at the "whole program" and say that even things that are pub are nevertheless local to the executable (because it's not a shared library)?

We explicitly tell the linker which symbols are exported and pass the same list to LLVM when doing LTO. For dylib this is all symbols exported from any codegen unit, for cdylib and staticlib this is just the #[no_mangle] symbols and for bin only main is considered exported unless -Zexport-dynamic-symbols is used. Anything not in this list is to be considered non-exposed to anything except all the object files involved in the linking step itself, whether they come from Rust or C.
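As a reading aid, the per-crate-type export rule described above can be written as a small lookup. This is a model of the comment's wording, not of rustc's actual implementation: `CrateType`, `Symbol` and `is_exported` are hypothetical names, and the sketch assumes -Zexport-dynamic-symbols is not passed.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CrateType {
    Bin,
    Dylib,
    Cdylib,
    Staticlib,
}

/// Hypothetical summary of one symbol, for illustration only.
#[derive(Debug, Clone, Copy)]
pub struct Symbol {
    pub is_main: bool,
    pub no_mangle: bool,
    /// Whether some codegen unit exports the symbol at all.
    pub exported_from_cgu: bool,
}

pub fn is_exported(crate_type: CrateType, sym: Symbol) -> bool {
    match crate_type {
        // dylib: everything any codegen unit exports.
        CrateType::Dylib => sym.exported_from_cgu,
        // cdylib / staticlib: only the #[no_mangle] symbols.
        CrateType::Cdylib | CrateType::Staticlib => sym.no_mangle,
        // bin: only main (assuming no -Zexport-dynamic-symbols).
        CrateType::Bin => sym.is_main,
    }
}
```

Under this model a #[no_mangle] static in a bin is not in the exported list, which is the gap the following comments pick up on.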

@CAD97

CAD97 commented Mar 24, 2023

At least somewhat interesting that that doesn't include #[link_section] symbols. (I would expect #[link_name] to also get exported like #[no_mangle], and I believe #[used] gets exported from object files as well).

I will maintain a preference for matching the definition of symbols that would be exported from staticlib/cdylib over my looser concept of what statics should be considered exposed. However, there still needs to be some way sufficient to allow linker shenanigans on binaries, otherwise they're effectively not possible.

(The #[used] RFC says #[used] gets a static into the object files even in the presence of LTO and to the linker, but does not impact the behavior of the linker assembling the final executable on its own.)

@digama0

digama0 commented Jul 26, 2023

Related: #215

@moridinga

moridinga commented Jan 28, 2025

Curious if the current Rust language definition, as of 28-Jan-2025, can make any guarantees about not optimizing relative to the initializer expression for static mut objects. We have basically exactly the same issue that prompted the original question. On an embedded platform, we have a log that was populated by the boot process and stuffed into SRAM. It is treated as a [u8; SIZE] array. We use the linker and the #[unsafe(link_section = ".log_buffer")] attribute to locate this array in memory at a fixed address. We need to get a reference to that array in our Rust code (the code that was booted) so that the log data can be printed, analyzed, whatever. This data is initialized already - it was done during the boot process. The Rust code should not initialize it, it should not optimize as if it knows what values it contains, but it does need to read it and mostly it needs to treat it like volatile content. Right now we are wrapping this array up in a SyncUnsafeCell<MaybeUninit<[u8; SIZE]>>, and providing a fixed initializer as required for the array type. What Rust seems to do is to assume that this content is pre-initialized by the runtime setup in the same way that static global data would be for .bss or .data sections (and if I don't use the link_section then that is precisely where this content ends up in the generated object files, depending on whether the initializer is zeros or not), and therefore doesn't do its own explicit initialization.

All that is to say, what I'm doing seems to work, but I worry that that may not always be true. I'm curious if the use of the UnsafeCell is basically resulting in a guarantee that compiler/optimizer can't assume it knows the memory contents even though it sees the initializer expression (which isn't actually used).
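For concreteness, the setup described above might look roughly like the sketch below. Several things are assumptions: `SyncCell` is a stable stand-in for the unstable `SyncUnsafeCell`, `SIZE` and `read_log_byte` are made up, and the `link_section` attribute is left out so the snippet builds on a hosted platform. With nothing else filling the memory, the reads simply return the initializer; whether the contents can be relied on when a boot process filled the section instead is exactly the question being asked.

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;
use std::ptr;

pub const SIZE: usize = 16;

// Stable stand-in for the unstable `SyncUnsafeCell`.
#[repr(transparent)]
pub struct SyncCell<T>(UnsafeCell<T>);

// Safety (assumed for this sketch): all access goes through the raw-pointer
// helper below, and the caller rules out data races.
unsafe impl<T: Sync> Sync for SyncCell<T> {}

// On the real target this static would additionally carry the
// `link_section = ".log_buffer"` attribute.
pub static LOG_BUFFER: SyncCell<MaybeUninit<[u8; SIZE]>> =
    SyncCell(UnsafeCell::new(MaybeUninit::new([0u8; SIZE])));

/// Volatile read of one byte of the buffer; `None` when out of bounds.
pub fn read_log_byte(index: usize) -> Option<u8> {
    if index >= SIZE {
        return None;
    }
    let base = LOG_BUFFER.0.get() as *const u8;
    // Safety: `index` is in bounds, and the initializer used here leaves
    // every byte initialized.
    Some(unsafe { ptr::read_volatile(base.add(index)) })
}
```

The volatile read keeps the compiler from folding the load, but it does not by itself settle what the language promises about the static's starting contents.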

@RalfJung
Member

RalfJung commented Jan 28, 2025 via email

@moridinga

I will experiment, but my understanding is that extern static won't result in allocation (which would be necessary for creating the section in the generated object file) and is instead more like a C extern declaration (which just says somebody somewhere is providing this symbol of this type).

@bjorn3
Member

bjorn3 commented Jan 28, 2025

In your linker script you can allocate some memory in RAM and then define the symbol corresponding to the static to be at the start of this memory. You need a linker script anyway to ensure that it doesn't get cleared on reboot.

@RalfJung
Member

If the data is initialized already, then it must also already be allocated.

IIUC, in your case the data is already there before Rust starts, but because Rust happens to assume that memory is 0-initialized, and since you are writing a 0-initializer in Rust (or an initializer that leaves the memory uninit), it happens to be the case that if you do not zero-initialize the memory then you can read whatever the actual contents of this section are. That's definitely UB, we don't guarantee that we'll do an optimization like that -- the program might in fact write out the initializer value upon startup even if it is all-0. This issue is about potential other code running after program loading but before main that might change these values, which may be okay in some cases since we can't know which other threads are running, but I don't think we can or want to guarantee what happens when you play tricks with the bss zero-init default.

@moridinga

We absolutely have a linker script (what embedded SW doesn't? :) ). But the linker doesn't do allocation, it just assigns sections to memory. The sections are coming from the object files though. But perhaps this is the problem (because you're correct that I can do "allocation" in the linker script). I think with Rust extern static that would be required, but I still need to run those experiments.

The data absolutely is there before Rust starts. In the same way that .data gets populated with initialized non-zero data and .bss gets zero'd out before the first line of Rust code starts, this particular memory is also populated. The only difference compared to .data and .bss is that Rust (when it starts) does NOT know what is in that memory, whereas it does know what is in .bss (it is all zeros) and .data (it matches the explicit initialization declared in the Rust code). My worry is that Rust will think it knows and do something with what it thinks.

@bjorn3
Member

bjorn3 commented Jan 28, 2025

But the linker doesn't do allocation, it just assigns sections to memory.

I think you can do something like

.persistent : {
    my_global = .;
    . += 1024; /* assign 1024 bytes to the .persistent section. */
} > RAM

with 1024 replaced by whatever size you want.
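The Rust side of that linker-script fragment could look roughly like the sketch below. Assumptions: `LOG_SIZE` and `boot_log` are hypothetical names, `LOG_SIZE` has to be kept in sync with the `. += 1024;` line by hand, and the `#[cfg(not(target_os = "none"))]` static is only a host-side stand-in so the sketch links without the linker script. The `unsafe extern` form needs a reasonably recent toolchain.

```rust
/// Must match the number of bytes reserved in the linker script.
pub const LOG_SIZE: usize = 1024;

// On the bare-metal target the symbol is defined by the linker script
// (`my_global = .;`), so Rust only declares it and allocates nothing.
#[cfg(target_os = "none")]
unsafe extern "C" {
    static my_global: [u8; LOG_SIZE];
}

// Host-side stand-in (assumption, for illustration only): a regular static
// with a recognizable fill pattern takes the place of the linker symbol.
#[cfg(not(target_os = "none"))]
#[allow(non_upper_case_globals)]
static my_global: [u8; LOG_SIZE] = [0xAB; LOG_SIZE];

/// Returns the persistent region as a byte array reference.
#[allow(unused_unsafe)] // the stand-in static needs no `unsafe` to access
pub fn boot_log() -> &'static [u8; LOG_SIZE] {
    // SAFETY (assumption): the region was fully written before Rust code
    // started running and nothing modifies it afterwards.
    unsafe { &my_global }
}
```

Because the declaration has no initializer, there is no initial value for the compiler to reason about, which is why this form sidesteps the question in this issue.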

@RalfJung
Member

Rust generates an object file instructing the linker to fill a particular region of memory with 0s. But then actually you set things up in a way that the memory contains other data. Is that correct?

That does seem questionable, though maybe one can argue that it is equivalent to first having it initialized with 0s and then having some outside-of-Rust code run that writes the data you actually want to see there. So as long as the static is (interior) mutable, this will probably work... but I would argue extern static + linker script is a more robust solution.

@moridinga

But the linker doesn't do allocation, it just assigns sections to memory.

I think you can do something like

.persistent : {
    my_global = .;
    . += 1024; /* assign 1024 bytes to the .persistent section. */
} > RAM

with 1024 replaced by whatever size you want.

Yes this would be the kind of thing one would do. Unfortunately this means the size info now has to be declared in two distinct places (the linker file and the Rust code file). For more complicated composite types this could get very annoying. For simple byte arrays, still not ideal but not a show-stopper. This is why it is much preferred to have the allocation, via declaration and compilation, happen in code. And this is typically what embedded C guys do.
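One way to at least catch the two declarations drifting apart is a compile-time size check on the Rust side. This is only a sketch under assumptions: `RESERVED_BYTES`, `BootRecord` and `records_that_fit` are hypothetical, and `RESERVED_BYTES` still has to mirror the linker script by hand, so the duplication is detected rather than removed.

```rust
use std::mem::size_of; // `core::mem::size_of` in a `no_std` build

/// Mirrors the byte count reserved in the linker script (kept in sync by hand).
pub const RESERVED_BYTES: usize = 1024;

/// Hypothetical composite type stored in the reserved region.
#[repr(C)]
pub struct BootRecord {
    pub sequence: u32,
    pub length: u16,
    pub payload: [u8; 64],
}

// Compilation fails if the type outgrows the reserved region.
// Alignment of the region is not checked here; that would still need an
// ALIGN(...) directive in the linker script.
const _: () = assert!(size_of::<BootRecord>() <= RESERVED_BYTES);

/// Number of whole records that fit into the reserved region.
pub const fn records_that_fit() -> usize {
    RESERVED_BYTES / size_of::<BootRecord>()
}
```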

@RalfJung
Member

And this is typically what embedded C guys do.

If we can get a documented guarantee from LLVM that this kind of thing is sound to do, it should be fairly easy to also provide this guarantee on the Rust level.

@moridinga

Some godbolt output for 32-bit ARM (common embedded target):

C version: https://godbolt.org/z/rvxr86v7z
Rust pub static mut with init: https://godbolt.org/z/vorj3cYsd
Extern static Rust: https://godbolt.org/z/414MTfx1z

So the allocation doesn't happen in the last case, as predicted. The Rust warning in that case is also a bit interesting (pub static mut a: [u8; 1024]; is "not a function or static" - OK, if you say so.)

@moridinga

Well I ended up going with something like this: https://godbolt.org/z/5qjaesz9z
The explicit initializer is MaybeUninit::uninit() rather than something like [0u8; SIZE]. Then instead of doing write() on that, followed by assume_init(), we basically just skip the write and only do the assume_init using assume_init_ref(). I feel better about this since I'm not giving Rust any false info about the underlying initialized data. In our real use case we wrap this all up in a SyncUnsafeCell to avoid the static mut complaints.

I looked at the MIR output for the two cases (explicit init vs MaybeUninit::uninit) and the alloc statements for the arrays do show the init vs uninit difference. This seems as close as I can get to the C case where the array is declared global without an initializer.
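In case the godbolt link goes stale, the shape of that approach is roughly the sketch below. It is not the linked code. Assumptions: `SIZE`, `LogCell`, `LOG`, `log_contents` and `simulate_boot_fill` are hypothetical names, section placement is omitted, and `simulate_boot_fill` only exists so the sketch can be exercised on a host, where nothing fills the buffer before `main`.

```rust
use std::cell::UnsafeCell; // `core::cell::UnsafeCell` in a `no_std` build
use std::mem::MaybeUninit; // `core::mem::MaybeUninit` in a `no_std` build

/// Hypothetical buffer size.
pub const SIZE: usize = 1024;

/// Stable stand-in for the nightly-only `SyncUnsafeCell`.
pub struct LogCell(UnsafeCell<MaybeUninit<[u8; SIZE]>>);

// SAFETY (assumption): the buffer is only read once Rust code is running.
unsafe impl Sync for LogCell {}

/// The initializer deliberately claims nothing about the contents.
pub static LOG: LogCell = LogCell(UnsafeCell::new(MaybeUninit::uninit()));

/// Skips the `write()` step and goes straight to `assume_init_ref()`.
///
/// # Safety
/// Something outside this function must have filled the whole buffer first
/// (the boot process on the device, `simulate_boot_fill` on a host), and
/// nothing may write to it while the returned reference is alive.
pub unsafe fn log_contents() -> &'static [u8; SIZE] {
    let cell: &'static MaybeUninit<[u8; SIZE]> = unsafe { &*LOG.0.get() };
    unsafe { cell.assume_init_ref() }
}

/// Host-only stand-in for the boot process: fills the buffer with `byte`.
pub fn simulate_boot_fill(byte: u8) {
    // SAFETY (assumption): called before any reference to the buffer exists.
    unsafe { LOG.0.get().write(MaybeUninit::new([byte; SIZE])) };
}
```

Calling `log_contents` without the buffer having been filled is undefined behavior, so on the device this relies entirely on the boot process having run.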

@RustyYato

The explicit initializer is MaybeUninit::uninit() rather than something like [0u8; SIZE].

You could use an extern static right? Then you wouldn't need to provide an initializer, and this issue doesn't affect you.

@moridinga

The explicit initializer is MaybeUninit::uninit() rather than something like [0u8; SIZE].

You could use an extern static right? Then you wouldn't need to provide an initializer, and this issue doesn't affect you.

As mentioned previously, extern static is actually not preferred since it forces the "allocation", such as it is, to be done in the linker file (or somewhere else), since the extern static just declares the symbol and its type. Then your extern static declaration also has to have the size. If it is a simple array, that means you have to have the size in both places. If it is a composite type, then the compiler knows the size but the linker does not (so you just have to make sure it is big enough to hold that type). It is actually much preferred that allocation, in the sense of occupying space in the object file, happens in code and only placement in the memory map takes place in the linker (via linker command file content). Even leaving aside the idea of pre-initialization that Rust doesn't know anything about, I could imagine more complicated scenarios with more global objects assigned to the same memory sections, and then alignment (beyond just size) of those objects can come into play. Frankly, the codegen tool (rustc) is much better equipped to deal with this problem. I think extern static is really not appropriate. That mainly seems like it would be useful for FFI.

@RalfJung
Member

Well, you are doing FFI. Some other component is filling that static with data -- clearly an FFI usecase.

The explicit initializer is MaybeUninit::uninit() rather than something like [0u8; SIZE]. Then instead of doing write() on that, followed by assume_init(), we basically just skip the write and only do the assume_init using assume_init_ref(). I feel better about this since I'm not giving Rust any false info about the underlying initialized data. In our real use case we wrap this all up in a SyncUnsafeCell to avoid the static mut complaints.

FWIW for rustc it makes little difference whether the static is declared with "uninit" and some other component pre-main fills this with data, or whether the static is declared with zeroes and some other component pre-main fills this with data.
