Custom section generation under wasm32-unknown-unknown is inconsistent and unintuitive #56639

Open
koute opened this issue Dec 8, 2018 · 5 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries O-wasm Target: WASM (WebAssembly), http://webassembly.org/

Comments

@koute
Member

koute commented Dec 8, 2018

Assume we have a crate named dependency with the following content:

pub fn trigger() {
    submodule::call();
}
pub mod submodule {
    pub fn call() {
        #[link_section = "some-custom-section"]
        static SNIPPET: [u8; 3] = [b'X', b'Y', b'Z'];

        extern "C" {
            fn require_XYZ();
        }

        unsafe {
            require_XYZ();
        }
    }
}

And we have another crate which uses the dependency:

extern crate dependency;

#[no_mangle]
pub fn main() {
    dependency::trigger();
}

If I compile this crate like this:

$ cargo build --target=wasm32-unknown-unknown --release

and dump it with wasm-objdump then the "some-custom-section" custom section will be missing. However, if I change the dependency crate to look like this (I've moved the call function from the submodule to the crate root and even made it private):

pub fn trigger() {
    call();
}
fn call() {
    #[link_section = "some-custom-section"]
    static SNIPPET: [u8; 3] = [b'X', b'Y', b'Z'];

    extern "C" {
        fn require_XYZ();
    }

    unsafe {
        require_XYZ();
    }
}

and build the main crate again then the custom section is generated. Calling dependency::submodule::call directly instead of dependency::trigger also results in the custom section being generated.

Based on my experiments the custom section generation currently works like this:

| In which crate is the custom section defined? | Where is the function containing the section? | How is the function containing the custom section called? | Is it generated? |
|---|---|---|---|
| External crate | Submodule | Not called | No |
| External crate | Submodule | Indirectly | No |
| External crate | Submodule | Directly | Yes |
| External crate | In root | Not called | No |
| External crate | In root | Indirectly | Yes |
| External crate | In root | Directly | Yes |
| Main crate | Any | Any | Yes |

I have an example crate here which reproduces the issue:

$ git clone https://github.com/koute/rust-custom-section-issue
$ cd rust-custom-section-issue

# This will not generate a custom section:
$ cargo build --target=wasm32-unknown-unknown --release --features broken
$ wasm-objdump -s target/wasm32-unknown-unknown/release/rust_custom_section_issue.wasm

# This will:
$ cargo build --target=wasm32-unknown-unknown --release --features working
$ wasm-objdump -s target/wasm32-unknown-unknown/release/rust_custom_section_issue.wasm
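
For anyone who wants to script the check instead of reading the wasm-objdump output by eye, here is a minimal sketch of listing the custom section names in a .wasm file. It assumes only the standard WebAssembly binary layout (8-byte header, then sections made of a one-byte id, a LEB128 size and the body, with id 0 meaning "custom"). The function names `read_u32_leb` and `custom_section_names` are made up for this illustration and are not part of any existing tool.

```rust
/// Decode an unsigned LEB128 `u32` starting at `pos`.
/// Returns the value and the position just past it, or `None` on malformed input.
fn read_u32_leb(bytes: &[u8], mut pos: usize) -> Option<(u32, usize)> {
    let mut result: u32 = 0;
    let mut shift = 0;
    loop {
        let byte = *bytes.get(pos)?;
        pos += 1;
        if shift >= 32 {
            return None; // more than 5 bytes cannot encode a u32
        }
        result |= ((byte & 0x7f) as u32) << shift;
        if byte & 0x80 == 0 {
            return Some((result, pos));
        }
        shift += 7;
    }
}

/// Hypothetical helper: return the names of all custom sections (id 0)
/// in a wasm binary, or `None` if the input does not look like wasm.
pub fn custom_section_names(wasm: &[u8]) -> Option<Vec<String>> {
    // Header: magic "\0asm" followed by a 4-byte version.
    if wasm.len() < 8 || !wasm.starts_with(b"\0asm") {
        return None;
    }
    let mut names = Vec::new();
    let mut pos = 8;
    while pos < wasm.len() {
        let id = wasm[pos];
        let (size, body_start) = read_u32_leb(wasm, pos + 1)?;
        let body_end = body_start.checked_add(size as usize)?;
        if body_end > wasm.len() {
            return None;
        }
        if id == 0 {
            // A custom section body starts with its name: length, then bytes.
            let (name_len, name_start) = read_u32_leb(wasm, body_start)?;
            let name_end = name_start.checked_add(name_len as usize)?;
            if name_end > body_end {
                return None;
            }
            names.push(String::from_utf8_lossy(&wasm[name_start..name_end]).into_owned());
        }
        pos = body_end;
    }
    Some(names)
}
```

Reading the built file with `std::fs::read` and checking whether the returned list contains "some-custom-section" would then distinguish the "broken" build from the "working" one. The sketch skips all non-custom sections without validating them.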

I'm using the most recent nightly: rustc 1.32.0-nightly (4a45578bc 2018-12-07)

Could we make this somewhat consistent?

Some background: as a first step towards wasm-bindgen compatibility I'm converting stdweb's js! macro to use custom sections. However, to avoid breaking existing downstream users I need either a) every custom section in the whole crate graph to be generated, or b) a custom section to always be generated if it is defined inside a function that is potentially reachable at runtime. Otherwise I end up generating an import for a snippet whose corresponding custom section entry doesn't exist.

cc @alexcrichton

@alexcrichton
Member

This is currently expected behavior, albeit somewhat unfortunately. This has to do with object files and what causes the linker to pull in an object file. Long story short a custom section always goes into the current codegen unit, but if nothing else is referenced from the codegen unit then the custom section doesn't get included in the final output because the linker never looks at the codegen unit/object file.

We could try to do better perhaps manually in rustc itself by doing something like placing all custom sections in a separate codegen unit and forcing it to be included, but that doesn't mirror what custom sections do in other targets, for example.

@koute
Member Author

koute commented Dec 10, 2018

> Long story short a custom section always goes into the current codegen unit, but if nothing else is referenced from the codegen unit then the custom section doesn't get included in the final output because the linker never looks at the codegen unit/object file.

If that were the case, wouldn't the custom section always be emitted when I call the function which contains it? Don't the function and the statics it contains go into the same codegen unit? Or am I simply misunderstanding this? (I'm sure there are a lot more moving parts here, so it may not be simple to explain in one sentence.)

> We could try to do better perhaps manually in rustc itself by doing something like placing all custom sections in a separate codegen unit and forcing it to be included, but that doesn't mirror what custom sections do in other targets, for example.

If this behavior is the same for other targets too then I'd argue that it's, well, broken for those other targets too. As it stands right now, whether the custom section is emitted depends on unintuitive (and, I'm guessing, neither documented nor guaranteed?) implementation details of the compiler, which is a subtle but fairly serious omission for an already stabilized feature.

@alexcrichton
Member

Suffice it to say that there are a lot of moving parts with codegen units. What's happening above is that you're being thwarted by LLVM's ThinLTO passes, which perform inlining and reduce dependencies between codegen units.

It's true that it's difficult to use, and that's why it's very low level! We can try to improve it over time; I'm mostly just trying to explain what's happening.

@jonas-schievink jonas-schievink added A-linkage Area: linking into static, shared libraries and binaries O-wasm Target: WASM (WebAssembly), http://webassembly.org/ labels Jan 26, 2019
bors bot added a commit to tock/tock that referenced this issue May 11, 2020
1836: Added init() function to stm32f4 crates r=ppannuto a=alexandruradovici

### Pull Request Overview

This pull request:
  1. adds an `init` function to the stm32f4 sub crates so that the object file is not ignored by the compiler and IRQS are included in the .irqs section
  2. moves the IRQS in the lib.rs file, they are ignored by the compiler otherwise

This fixes #1835.

It seems to be related to rust-lang/rust#56639.

### Testing Strategy

This pull request was tested with a nucleo429zi board.

### TODO or Help Wanted

### Documentation Updated

- [x] updates are required.

### Formatting

- [x] Ran `make formatall`.


Co-authored-by: Alexandru Radovici <msg4alex@gmail.com>
Michael-F-Bryan pushed a commit to hotg-ai/rune that referenced this issue Aug 31, 2021
@leighmcculloch

This problem seems to be much worse on nightly lately. If I have codegen-units set very high with a stable build, the link sections still get included, but I recently noticed that nightly builds are omitting custom sections. Unfortunately I can't say definitively which nightly version this started with.

@thomcc
Member

thomcc commented Aug 2, 2022

If the static also has #[used] it may be related to #93718.

If so, checking that it worked prior to the 2022-07-20 nightly and is broken after the 2022-07-26 nightly would be pretty helpful, absent an actual regression range. (Note that the nightlies between these two had a very broken version of #[used], which got fixed by #99676.)

The #[used] handling in that code probably doesn't need the check for wasm, and if it's causing issues I think we could remove it, so that #[used] is llvm.compiler.used on wasm like it is on ELF. The primary[1] reason it does apply to wasm was just because I was trying to imitate clang's logic for implementing __attribute__((used)), which basically seemed to use llvm.compiler.used on ELF, and llvm.used for everything else.

That said, the relationship to codegen units is unclear to me, and if your static doesn't have #[used] it seems unlikely to be related. (Then again, if you don't have #[used], adding it might fix your issue...)
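
To make the "adding it might fix your issue" suggestion concrete, here is a minimal sketch of the static from the original report with #[used] added. It is an illustration only: the `link_section` attribute is gated to wasm32 (an assumption of this sketch, so that it also builds on hosts with different section-name rules), the static is made `pub` purely so it can be inspected in a native build, and nothing here is confirmed to resolve the codegen-unit behavior discussed above.

```rust
// Request the custom section only on wasm32; other object formats have
// different section-name rules, so this keeps the sketch buildable anywhere.
#[cfg_attr(target_arch = "wasm32", link_section = "some-custom-section")]
// Ask the compiler to keep the static in the object file even if nothing
// references it. The linker may still drop the object file itself.
#[used]
pub static SNIPPET: [u8; 3] = [b'X', b'Y', b'Z'];
```

#[used] only affects whether the static survives within its own object file; whether that object file is pulled into the final link is the separate question raised earlier in the thread.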

Footnotes

  1. That is, aside from the fact that in an ideal world it should be legal to use llvm.used in place of llvm.compiler.used. Sadly, we don't live in such an ideal world, as use of llvm.used can expose toolchain bugs, which we already know happens with some ELF linkers (gold).

5 participants