Work around custom section flag regression in LLVM #90326

nikic · 2021-10-26T20:43:28Z

LLVM 13 has removed support for setting custom section flags in https://reviews.llvm.org/D100944. LLVM 13 carries a revert of the problematic change (https://reviews.llvm.org/D107216). However, LLVM 14 will no longer carry it, and we need to use alternative ways to set custom section flags.

There are currently two places affected by this. The .llvmbc bitcode section created in

rust/compiler/rustc_codegen_llvm/src/back/write.rs

Line 926 in ffba430

unsafe fn embed_bitcode(

and the .rustc metadata section created in

rust/compiler/rustc_codegen_llvm/src/base.rs

Line 41 in 612356a

pub fn write_compressed_metadata<'tcx>(

. In both cases, section flags are used to ensure that the section is not linked into the final binary or loaded into memory.

There are two ways to address this. The first one is to define the section entirely in module-level assembly. Rather than only emitting .section, we should also emit .ascii "ENCODED_SECTION_CONTENTS", as well as the symbol name and any other necessary directives (e.g. for visibility). I'm not sure exactly what is needed on different platforms here.

The second one is to use the object crate to create the object file containing the section. The cranelift backend already implements this for the .rustc section in

rust/compiler/rustc_codegen_cranelift/src/metadata.rs

Line 10 in ffba430

pub(crate) fn new_metadata_object(

. A nice side benefit of this is that the handling would be codegen-backend agnostic and could be shared between the llvm/cranelift/gcc backends.

The suggested course of action is to use the object crate for .rustc and inline assembly for .llvmbc. The reason for the latter is that the section is part of a larger object file emitted by LLVM, and the object crate doesn't seem great for modifying existing object files (as opposed to creating one from scratch). Though if we can use the object crate for everything, that would of course be great.

The text was updated successfully, but these errors were encountered:

durin42 · 2021-10-27T14:54:47Z

Quoting one of my LLVM experts: "the .llvmbc issue is an issue for Clang as well (e.g. using clang -fembed-bitcode on ELF has this problem). Clearly that ought to be solved in some manner other than inline-asm hacks or post-processing object files." So there's some appetite for a more systematic solution upstream in LLVM.

jyknight · 2021-10-27T15:04:27Z

(Jumping in from LLVM side -- I was pointed to this bug.)

clang -fembed-bitcode also causes the bitcode to end up in the binary on ELF platforms, just like Rust would without the section-flag workaround. That's perhaps not entirely useful, but the -fembed-bitcode flag is almost unused everywhere except on Apple platforms (where it's required in order to upload to appstore in some cases) -- so basically nobody really cares.

Rust seems to want to go to some trouble to avoid this outcome -- but it's not clear to me why you want the embedded bitcode in the first place -- other than on iOS/apple? Perhaps it'd be nice to fix this someday, but is it more important to Rust, than it is to Clang?

nikic · 2021-10-27T15:15:57Z

@jyknight Under some circumstances, Rust uses fat object files that contain both machine code and bitcode. The MC version is used without LTO, the BC version used with LTO.

Changing the flags that LLVM automatically applies to .llvmbc would solve (the more ugly) half of the problem, but I wasn't sure whether unconditionally using SHF_EXCLUDE/LNK_REMOVE would be fine for other use cases.

jyknight · 2021-10-27T17:21:21Z

LTO input wasn't really the intended use-case of the embedded bitcode feature. I don't think that even works at all with clang+lld, does it? The original purpose was an effectively apple-specific use-case: to allow Apple to rerun the codegen phase of compilation separately for each original object file, and then relink the binary. This would be used either in order to workaround CPU bugs with a new version of the toolchain, or to recompile an app to a new specially-designed-to-be-ABI-compatible architecture (bitcode for watchOS on ARM32 can be rebuilt for watchos on AArch64 (the later they call "arm64_32", since it uses a 32bit ABI)).

For LTO, it's better to just emit the correct object file for the linking mode/modes you're actually using. That's why it surprises me that this is something rust is running into.

The overhead of doing a fat object file (both size of the extraneous output and time to generate it) can be substantial, and additionally, the pass pipelines are different for generating pre-LTO bitcode vs pre-object-emission bitcode. (The semantics of llvmbc used by -fembed-bitcode is that it contains the pre-object-emission bitcode, not LTO bitcode).

Simply SHF_EXCLUDE'ing .llvmbc isn't a solution by itself. The primary purpose of -fembed-bitcode is intended to be that you can get the bitcode output in the final binary when you ask for it. On Mach-O platforms, this is done with special linker support: if the linker is passed "-bitcode_bundle", it takes the bitcode sections of the objects, puts them into a weird container format ("XAR"), then puts that into a section in the output binary. If that linker option isn't set, it ignores the bitcode sections in the input object files.

This doesn't really work on non-MachO platforms at the moment -- both because bitcode gets included into the output without a flag, and because when it is emitted, it's just concatenated, rather than put into a useful form.

We could potentially add proper support, but again, I don't think this is actually something Rust should be using at all...

bjorn3 · 2021-10-27T17:34:15Z

For LTO, it's better to just emit the correct object file for the linking mode/modes you're actually using. That's why it surprises me that this is something rust is running into.

We have to support libraries that are capable of LTO, but don't require it. The same standard library is used for LTO and non-LTO usage and as such needs to contain LTO bitcode, but also machine code at the same time.

bjorn3 · 2021-10-27T17:37:47Z

Simply SHF_EXCLUDE'ing .llvmbc isn't a solution by itself. The primary purpose of -fembed-bitcode is intended to be that you can get the bitcode output in the final binary when you ask for it.

Would you prefer the section to be called .rust_llvm_lto_bitcode or something like that? We don't use this section for the same purpose as -fembed-bitcode in clang it seems.

In LLVM 14, our current method of setting section flags to avoid embedding the `.llvmbc` section into final compilation artifacts will no longer work, see issue rust-lang#90326. The upstream recommendation is to instead embed the entire bitcode using module-level inline assembly, which is what this change does. I've kept the existing code for platforms where we do not need to set section flags, but possibly we should always be using the inline asm approach.

Use object crate for .rustc metadata generation We already use the object crate for generating uncompressed .rmeta metadata object files. This switches the generation of compressed .rustc object files to use the object crate as well. These have slightly different requirements in that .rmeta should be completely excluded from any final compilation artifacts, while .rustc should be part of shared objects, but not loaded into memory. The primary motivation for this change is rust-lang#90326: In LLVM 14, the current way of setting section flags (and in particular, preventing the setting of SHF_ALLOC) will no longer work. There are other ways we could work around this, but switching to the object crate seems like the most elegant, as we already use it for .rmeta, and as it makes this independent of the codegen backend. In particular, we don't need separate handling in codegen_llvm and codegen_gcc. codegen_cranelift should be able to reuse the implementation as well, though I have omitted that here, as it is not based on codegen_ssa. This change mostly extracts the existing code for .rmeta handling to allow using it for .rustc as well, and adjusts the codegen infrastructure to handle the metadata object file separately: We no longer create a backend-specific module for it, and directly produce the compiled module instead. This does not `fix` rust-lang#90326 by itself yet, as .llvmbc will need to be handled separately. r? `@nagisa`

cuviper · 2021-12-08T18:46:24Z

This was auto-closed by commit text, but it's not actually done yet.

Use module inline assembly to embed bitcode In LLVM 14, our current method of setting section flags to avoid embedding the `.llvmbc` section into final compilation artifacts will no longer work, see issue rust-lang#90326. The upstream recommendation is to instead embed the entire bitcode using module-level inline assembly, which is what this change does. I've kept the existing code for platforms where we do not need to set section flags, but possibly we should always be using the inline asm approach (which would have to look a bit different for MachO). r? `@nagisa`

nikic · 2021-12-13T13:40:51Z

Fixed by #91604 and #91654.

tstellar · 2021-12-16T05:02:41Z

For LTO, it's better to just emit the correct object file for the linking mode/modes you're actually using. That's why it surprises me that this is something rust is running into.

We have to support libraries that are capable of LTO, but don't require it. The same standard library is used for LTO and non-LTO usage and as such needs to contain LTO bitcode, but also machine code at the same time.

Is the bitcode from libraries that are compiled with -fembed-bitcode even used for LTO? #74657 (comment) implies that gold and lld both ignore the bitcode, and that the presence of the bitcode triggers a bug in the llvm plugin when linking with bfd. That makes it sound like embedding the bitcode doesn't give rust the feature that it wants.

nikic · 2021-12-16T08:20:47Z

@tstellar Normally rust does LTO as part of the compiler (despite the name). In that case the bitcode is obtained using IRObjectFile::findBitcodeInMemBuffer(), which accepts both plain bitcode and various object file types. That's the kind of LTO I was referring to there.

When linker plugin LTO is enabled, then only bitcode is produced instead:

rust/compiler/rustc_codegen_ssa/src/back/write.rs

Lines 146 to 163 in 9e1aff8

    
           } else if sess.target.obj_is_bitcode 
        
               || (sess.opts.cg.linker_plugin_lto.enabled() && !no_builtins) 
        
           { 
        
               // This case is selected if the target uses objects as bitcode, or 
        
               // if linker plugin LTO is enabled. In the linker plugin LTO case 
        
               // the assumption is that the final link-step will read the bitcode 
        
               // and convert it to object code. This may be done by either the 
        
               // native linker or rustc itself. 
        
               // 
        
               // Note, however, that the linker-plugin-lto requested here is 
        
               // explicitly ignored for `#![no_builtins]` crates. These crates are 
        
               // specifically ignored by rustc's LTO passes and wouldn't work if 
        
               // loaded into the linker. These crates define symbols that LLVM 
        
               // lowers intrinsics to, and these symbol dependencies aren't known 
        
               // until after codegen. As a result any crate marked 
        
               // `#![no_builtins]` is assumed to not participate in LTO and 
        
               // instead goes on to generate object code. 
        
               EmitObj::Bitcode

I think the case you are referring to is the !no_nobuiltins exception there, which seems like the only case that may have embedded bitcode in that case. (Possibly that case should be forcing a plain object file without embedded bitcode if the embedded bitcode causes issues? As the code is written, it's not obvious whether ObjectCode(BitcodeSection::Full) may be returned in that case.)

See also https://github.com/rust-lang/rust/blob/673d0db5e393e9c64897005b470bfeb6d5aec61b/src/test/run-make-fulldeps/cross-lang-lto/Makefile for a test that checks that the linker plugin LTO artifacts are bitcode files and https://github.com/rust-lang/rust/blob/673d0db5e393e9c64897005b470bfeb6d5aec61b/src/test/run-make-fulldeps/cross-lang-lto-clang/Makefile that does an end-to-end linker plugin LTO link and checks that LTO does happen.

We already use the object crate for generating uncompressed .rmeta metadata object files. This switches the generation of compressed .rustc object files to use the object crate as well. These have slightly different requirements in that .rmeta should be completely excluded from any final compilation artifacts, while .rustc should be part of shared objects, but not loaded into memory. The primary motivation for this change is rust-lang#90326: In LLVM 14, the current way of setting section flags (and in particular, preventing the setting of SHF_ALLOC) will no longer work. There are other ways we could work around this, but switching to the object crate seems like the most elegant, as we already use it for .rmeta, and as it makes this independent of the codegen backend. In particular, we don't need separate handling in codegen_llvm and codegen_gcc. codegen_cranelift should be able to reuse the implementation as well, though I have omitted that here, as it is not based on codegen_ssa. This change mostly extracts the existing code for .rmeta handling to allow using it for .rustc as well, and adjust the codegen infrastructure to handle the metadata object file separately: We no longer create a backend-specific module for it, and directly produce the compiled module instead. This does not fix rust-lang#90326 by itself yet, as .llvmbc will need to be handled separately.

Use object crate for .rustc metadata generation We already use the object crate for generating uncompressed .rmeta metadata object files. This switches the generation of compressed .rustc object files to use the object crate as well. These have slightly different requirements in that .rmeta should be completely excluded from any final compilation artifacts, while .rustc should be part of shared objects, but not loaded into memory. The primary motivation for this change is rust-lang#90326: In LLVM 14, the current way of setting section flags (and in particular, preventing the setting of SHF_ALLOC) will no longer work. There are other ways we could work around this, but switching to the object crate seems like the most elegant, as we already use it for .rmeta, and as it makes this independent of the codegen backend. In particular, we don't need separate handling in codegen_llvm and codegen_gcc. codegen_cranelift should be able to reuse the implementation as well, though I have omitted that here, as it is not based on codegen_ssa. This change mostly extracts the existing code for .rmeta handling to allow using it for .rustc as well, and adjusts the codegen infrastructure to handle the metadata object file separately: We no longer create a backend-specific module for it, and directly produce the compiled module instead. This does not `fix` rust-lang#90326 by itself yet, as .llvmbc will need to be handled separately. r? `@nagisa`

nikic added the A-LLVM label Oct 26, 2021

nikic self-assigned this Dec 3, 2021

nikic mentioned this issue Dec 6, 2021

Use object crate for .rustc metadata generation #91604

Merged

nikic mentioned this issue Dec 8, 2021

Use module inline assembly to embed bitcode #91654

Merged

bors closed this as completed in 9488cac Dec 8, 2021

cuviper reopened this Dec 8, 2021

nikic closed this as completed Dec 13, 2021

nikic mentioned this issue Dec 15, 2021

Can no longer set custom section flags in LLVM IR llvm/llvm-project#50551

Closed

tstellar mentioned this issue Jan 4, 2022

linker-plugin-lto stopped working in Rust 1.45.0 #74657

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Work around custom section flag regression in LLVM #90326

Work around custom section flag regression in LLVM #90326

nikic commented Oct 26, 2021

durin42 commented Oct 27, 2021

jyknight commented Oct 27, 2021

nikic commented Oct 27, 2021

jyknight commented Oct 27, 2021

bjorn3 commented Oct 27, 2021

bjorn3 commented Oct 27, 2021

cuviper commented Dec 8, 2021

nikic commented Dec 13, 2021

tstellar commented Dec 16, 2021

nikic commented Dec 16, 2021 •

edited

Loading

Work around custom section flag regression in LLVM #90326

Work around custom section flag regression in LLVM #90326

Comments

nikic commented Oct 26, 2021

durin42 commented Oct 27, 2021

jyknight commented Oct 27, 2021

nikic commented Oct 27, 2021

jyknight commented Oct 27, 2021

bjorn3 commented Oct 27, 2021

bjorn3 commented Oct 27, 2021

cuviper commented Dec 8, 2021

nikic commented Dec 13, 2021

tstellar commented Dec 16, 2021

nikic commented Dec 16, 2021 • edited Loading

nikic commented Dec 16, 2021 •

edited

Loading