Can't use vmsr instruction in global_asm! on armv7r-none-eabihf without codegen-units=1 #127269

Open
jonathanpallant opened this issue Jul 3, 2024 · 8 comments
Labels
A-inline-assembly Area: Inline assembly (`asm!(…)`) A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-target-feature Area: Enabling/disabling target features like AVX, Neon, etc. C-bug Category: This is a bug. O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state S-waiting-on-LLVM Status: the compiler-dragon is eepy, can someone get it some tea? T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@jonathanpallant
Contributor

See https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/armv7r-unknown-none-eabihf.20weirdness/near/448803070 for discussion and https://github.com/ferrous-systems/armv7r-issues for a reproducer.

I tried this code:

core::arch::global_asm!(
    r#"

.section .text.startup
.global _start
.code 32
.align 0

_start:
    // Set stack pointer
    ldr r3, =stack_top
    mov sp, r3
    // Allow VFP coprocessor access
    mrc p15, 0, r0, c1, c0, 2
    orr r0, r0, #0xF00000
    mcr p15, 0, r0, c1, c0, 2
    // Enable VFP
    mov r0, #0x40000000
    vmsr fpexc, r0
    // Jump to application
    bl kmain
    // In case the application returns, loop forever
    b .

"#
);

In the debug profile, this compiles fine. In the release profile, it also compiles if you force codegen-units=1. On armv8r-none-eabihf, it compiles.

But if the target is armv7r-none-eabihf and codegen-units != 1, you get this error:

error: <inline asm>:18:5: instruction requires: VFP2
    vmsr fpexc, r0
    ^

Meta

rustc --version --verbose:

rustc 1.78.0 (9b00956e5 2024-04-29)
binary: rustc
commit-hash: 9b00956e56009bab2aa15d7bff10916599e3d6d6
commit-date: 2024-04-29
host: aarch64-apple-darwin
release: 1.78.0
LLVM version: 18.1.2

or

rustc 1.81.0-nightly (6b0f4b5ec 2024-06-24)
binary: rustc
commit-hash: 6b0f4b5ec3aa707ecaa78230722117324a4ce23c
commit-date: 2024-06-24
host: aarch64-apple-darwin
release: 1.81.0-nightly
LLVM version: 18.1.7

Both have the same issue.

@jonathanpallant jonathanpallant added the C-bug Category: This is a bug. label Jul 3, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jul 3, 2024
@jieyouxu jieyouxu added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state A-inline-assembly Area: Inline assembly (`asm!(…)`) T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 3, 2024
@chrisnc
Contributor

chrisnc commented Jul 3, 2024

The same issue happens with riscv32imac-unknown-none-elf, when trying to use the "A" extension in global_asm!, so this does not seem to be an issue with a specific target, but rather how rustc handles target features for global_asm!. Adding .option arch, rv32imac makes the error go away.

$ cargo build --release
   Compiling qemu-armv7r v0.1.0 (/Users/chrisnc/src/armv7r-issues)
error: <inline asm>:7:5: instruction requires the following: 'A' (Atomic Instructions)
    lr.w t0, 0(t1)
    ^
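
A minimal sketch of the `.option arch` workaround described above, assuming a hypothetical `load_reserved` symbol (the original RISC-V source was not posted, so the body here is illustrative only). The `cfg` gate and the `ATOMIC_ASM` constant are additions so the snippet also builds, and can be inspected, on a non-RISC-V host:

```rust
// The assembly text lives in a macro so that `global_asm!` and the
// constant below both expand to the same string literal.
macro_rules! atomic_asm {
    () => {
        r#"
.option arch, rv32imac
.global load_reserved
load_reserved:
    lr.w a0, (a0)
    ret
"#
    };
}

// Only assemble on 32-bit RISC-V; the gate is an assumption of this sketch.
#[cfg(target_arch = "riscv32")]
core::arch::global_asm!(atomic_asm!());

/// Same text as a constant, so it can be checked on any host.
/// `load_reserved` is a hypothetical symbol name, not from the issue.
pub const ATOMIC_ASM: &str = atomic_asm!();
```

The directive has to come before the first instruction that needs the "A" extension, since it only affects the code that follows it.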

@jieyouxu jieyouxu added the A-target-feature Area: Enabling/disabling target features like AVX, Neon, etc. label Jul 3, 2024
@Dirbaio
Contributor

Dirbaio commented Jul 3, 2024

minimized:

#![no_std]

core::arch::global_asm!(
    r#"
.section .text.startup
.global _start
.code 32
.align 0

_start:
    vmsr fpexc, r0
"#
);

works: 'rustc --edition=2021 --crate-type lib --target armv7r-none-eabihf repro.rs -C opt-level=0 -C embed-bitcode=no'
fails: 'rustc --edition=2021 --crate-type lib --target armv7r-none-eabihf repro.rs -C opt-level=0'
fails: 'rustc --edition=2021 --crate-type lib --target armv7r-none-eabihf repro.rs -C opt-level=s -C embed-bitcode=no'

so both opt-level and embed-bitcode=no affect it. huh

@jamesmunns
Member

I have a half-baked theory (read: totally uninformed goose chase) that LLVM might not be properly copying the target features when creating the TargetMachine for codegen.

Following this down:

The last one says:

// SAFETY: llvm::LLVMRustCreateTargetMachine copies pointed to data

But:

extern "C" LLVMTargetMachineRef LLVMRustCreateTargetMachine(
    const char *TripleStr, const char *CPU, const char *Feature,
    const char *ABIStr, LLVMRustCodeModel RustCM, LLVMRustRelocModel RustReloc,
    LLVMRustCodeGenOptLevel RustOptLevel, bool UseSoftFloat,
    bool FunctionSections, bool DataSections, bool UniqueSectionNames,
    bool TrapUnreachable, bool Singlethread, bool AsmComments,
    bool EmitStackSizeSection, bool RelaxELFRelocations, bool UseInitArray,
    const char *SplitDwarfFile, const char *OutputObjFile,
    const char *DebugInfoCompression, bool UseEmulatedTls,
    const char *ArgsCstrBuff, size_t ArgsCstrBuffLen) {
  auto OptLevel = fromRust(RustOptLevel);
  auto RM = fromRust(RustReloc);
  auto CM = fromRust(RustCM);
  std::string Error;
  auto Trip = Triple(Triple::normalize(TripleStr));
  const llvm::Target *TheTarget =
      TargetRegistry::lookupTarget(Trip.getTriple(), Error);
  if (TheTarget == nullptr) {
    LLVMRustSetLastError(Error.c_str());
    return nullptr;
  }
  TargetOptions Options = codegen::InitTargetOptionsFromCodeGenFlags(Trip);
  Options.FloatABIType = FloatABI::Default;
  if (UseSoftFloat) {
    Options.FloatABIType = FloatABI::Soft;
  }
  Options.DataSections = DataSections;
  Options.FunctionSections = FunctionSections;
  Options.UniqueSectionNames = UniqueSectionNames;
  Options.MCOptions.AsmVerbose = AsmComments;
  Options.MCOptions.PreserveAsmComments = AsmComments;
  Options.MCOptions.ABIName = ABIStr;
  if (SplitDwarfFile) {
    Options.MCOptions.SplitDwarfFile = SplitDwarfFile;
  }
  if (OutputObjFile) {
    Options.ObjectFilenameForDebug = OutputObjFile;
  }
  if (!strcmp("zlib", DebugInfoCompression) &&
      llvm::compression::zlib::isAvailable()) {
#if LLVM_VERSION_GE(19, 0)
    Options.MCOptions.CompressDebugSections = DebugCompressionType::Zlib;
#else
    Options.CompressDebugSections = DebugCompressionType::Zlib;
#endif
  } else if (!strcmp("zstd", DebugInfoCompression) &&
             llvm::compression::zstd::isAvailable()) {
#if LLVM_VERSION_GE(19, 0)
    Options.MCOptions.CompressDebugSections = DebugCompressionType::Zstd;
#else
    Options.CompressDebugSections = DebugCompressionType::Zstd;
#endif
  } else if (!strcmp("none", DebugInfoCompression)) {
#if LLVM_VERSION_GE(19, 0)
    Options.MCOptions.CompressDebugSections = DebugCompressionType::None;
#else
    Options.CompressDebugSections = DebugCompressionType::None;
#endif
  }
#if LLVM_VERSION_GE(19, 0)
  Options.MCOptions.X86RelaxRelocations = RelaxELFRelocations;
#else
  Options.RelaxELFRelocations = RelaxELFRelocations;
#endif
  Options.UseInitArray = UseInitArray;
  Options.EmulatedTLS = UseEmulatedTls;
  if (TrapUnreachable) {
    // Tell LLVM to codegen `unreachable` into an explicit trap instruction.
    // This limits the extent of possible undefined behavior in some cases, as
    // it prevents control flow from "falling through" into whatever code
    // happens to be laid out next in memory.
    Options.TrapUnreachable = true;
    // But don't emit traps after other traps or no-returns unnecessarily.
    // ...except for when targeting WebAssembly, because the NoTrapAfterNoreturn
    // option causes bugs in the LLVM WebAssembly backend. You should be able to
    // remove this check when Rust's minimum supported LLVM version is >= 18
    // https://github.com/llvm/llvm-project/pull/65876
    if (!Trip.isWasm()) {
      Options.NoTrapAfterNoreturn = true;
    }
  }
  if (Singlethread) {
    Options.ThreadModel = ThreadModel::Single;
  }
  Options.EmitStackSizeSection = EmitStackSizeSection;
  if (ArgsCstrBuff != nullptr) {
    int buffer_offset = 0;
    assert(ArgsCstrBuff[ArgsCstrBuffLen - 1] == '\0');
    const size_t arg0_len = std::strlen(ArgsCstrBuff);
    char *arg0 = new char[arg0_len + 1];
    memcpy(arg0, ArgsCstrBuff, arg0_len);
    arg0[arg0_len] = '\0';
    buffer_offset += arg0_len + 1;
    const int num_cmd_arg_strings = std::count(
        &ArgsCstrBuff[buffer_offset], &ArgsCstrBuff[ArgsCstrBuffLen], '\0');
    std::string *cmd_arg_strings = new std::string[num_cmd_arg_strings];
    for (int i = 0; i < num_cmd_arg_strings; ++i) {
      assert(buffer_offset < ArgsCstrBuffLen);
      const int len = std::strlen(ArgsCstrBuff + buffer_offset);
      cmd_arg_strings[i] = std::string(&ArgsCstrBuff[buffer_offset], len);
      buffer_offset += len + 1;
    }
    assert(buffer_offset == ArgsCstrBuffLen);
    Options.MCOptions.Argv0 = arg0;
    Options.MCOptions.CommandLineArgs =
        llvm::ArrayRef<std::string>(cmd_arg_strings, num_cmd_arg_strings);
  }
  TargetMachine *TM = TheTarget->createTargetMachine(
      Trip.getTriple(), CPU, Feature, Options, RM, CM, OptLevel);
  return wrap(TM);
}

doesn't do the copying. I'm trying to hunt down where in LLVM this copy would actually take place, in the createTargetMachine code.

@jamesmunns
Member

@Dirbaio tried leaking the feature flags, so it's probably not the "llvm doesn't copy the data right" thing I was guessing. Leaving the breadcrumbs in case it's useful for anyone following the codegen process down.

@thejpster
Contributor

This is possibly a dupe of #80608

@Dirbaio
Contributor

Dirbaio commented Jul 3, 2024

i've narrowed it to this line. If that runs, compilation fails.

let thin = ThinBuffer::new(llmod, config.emit_thin_lto, config.emit_thin_lto_summary);

so it's LTO-related, yep. Seems similar to #80608 though the compilation does abort here. Probably root cause is llvm/llvm-project#61991 too.

@Dirbaio
Contributor

Dirbaio commented Jul 3, 2024

narrowed it down to https://github.com/rust-lang/llvm-project/blob/96aca7c51701f9b3c5dd8567fcddf29492008e6d/llvm/lib/Object/ModuleSymbolTable.cpp#L96

the target features string there is empty. If I hardcode it to "+vfp3d16" the error goes away, which confirms the issue is there.
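
To illustrate why an empty features string would produce this exact error, here is a toy model of an assembler feature check. It is not LLVM's code: `check_vmsr` and its parsing rules are hypothetical, and the only facts taken from this thread are the error text and the observation that "+vfp3d16" is enough to satisfy the VFP2 requirement.

```rust
/// Toy stand-in for an assembler's feature gate on `vmsr`.
/// `features` is a comma-separated list such as "+vfp3d16,+thumb2".
/// The function name and parsing rules are assumptions for illustration only.
pub fn check_vmsr(features: &str) -> Result<(), String> {
    // In this model, `vmsr` needs VFP2, and vfp3d16 implies it.
    let has_vfp2 = features
        .split(',')
        .filter_map(|f| f.trim().strip_prefix('+'))
        .any(|f| matches!(f, "vfp2" | "vfp3d16"));

    if has_vfp2 {
        Ok(())
    } else {
        Err("instruction requires: VFP2".to_string())
    }
}
```

With an empty string the check fails with the message from the original report, and with "+vfp3d16" it passes, which matches the hardcoding experiment.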

@chrisnc
Contributor

chrisnc commented Jul 6, 2024

As in #80608 (comment), the workaround is to add assembly directives in the global_asm! block to enable target features. In this case it would be .fpu vfpv3-d16, which armv7r-none-eabihf enables by default.
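
A sketch of that workaround applied to the minimized reproducer from earlier in the thread. The `.fpu vfpv3-d16` line is the only change to the assembly. The `cfg` gate, the `startup_asm!` macro and the `STARTUP_ASM` constant are additions (assumptions of this sketch) so the snippet also builds on a non-Arm host:

```rust
// The assembly text lives in a macro so that `global_asm!` and the
// constant below both expand to the same string literal.
macro_rules! startup_asm {
    () => {
        r#"
.section .text.startup
.global _start
.code 32
.align 0
// Workaround: declare the FPU explicitly, since the target's default
// features appear not to reach the assembler on the failing path.
.fpu vfpv3-d16

_start:
    vmsr fpexc, r0
"#
    };
}

// Only assemble on 32-bit Arm; the gate is an addition for this sketch.
#[cfg(target_arch = "arm")]
core::arch::global_asm!(startup_asm!());

/// Same text as a constant, so it can be inspected on any host.
pub const STARTUP_ASM: &str = startup_asm!();
```

The directive must appear before the first VFP instruction in the block. It should not change what the target already enables, since vfpv3-d16 is the default for armv7r-none-eabihf.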

@jieyouxu jieyouxu added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-LLVM Status: the compiler-dragon is eepy, can someone get it some tea? and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Aug 13, 2024
alloncm added a commit to alloncm/MagenBoy that referenced this issue Dec 19, 2024
The problem was this bug in rust/llvm - rust-lang/rust#127269

Also fix the git version not updating when the common is not rebuilt by
setting it to rebuild upon git head changing
alloncm added a commit to alloncm/MagenBoy that referenced this issue Dec 19, 2024
The problem was this bug in rust/llvm - rust-lang/rust#127269

- Fix the git version not updating when the common is not rebuilt by setting it to rebuild upon git head changing.
- Delete .cargo/config.toml and replace it with build script to make it simpler
- Move the config.txt output to the artifacts folder

7 participants