rustc crashes when trying to bench upcoming neon-support in RustFFT with latest stdarch. #1227

Closed
HEnquist opened this issue Oct 3, 2021 · 21 comments

Comments

@HEnquist

HEnquist commented Oct 3, 2021

I'm working on adding neon support to RustFFT, and wanted to try the vld* and vst* intrinsics added here: #1224

First results were promising, but now I'm having a hard time running benchmarks because rustc crashes when building the benches. It crashes quite hard, without giving any useful error message. I'm using rust commit d14731c (simply master from yesterday, have also tried with a version from a couple of days ago with the same result), with stdarch updated to commit 931cdfb.

I would like to investigate this and try to at least help solve it, but I have no idea where to start. Any advice?

I'm trying to bench this branch: https://github.com/HEnquist/RustFFT/tree/vldx

I have tried on both a Raspberry Pi and an Oracle Ampere VM, with the same results.

Error:

pi@raspberrypi:~/RustFFT $ cargo bench --features neon neon_
   Compiling rustfft v6.0.1 (/home/pi/RustFFT)
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x826e48)[0x7f919b2e48]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x7f96f1a788]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x11b83a4)[0x7f923443a4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x16b1810)[0x7f9283d810]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e1f4)[0x7f935fa1f4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e350)[0x7f935fa350]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246f570)[0x7f935fb570]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb5d674)[0x7f91ce9674]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb48ba4)[0x7f91cd4ba4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb4cfe8)[0x7f91cd8fe8]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa24380)[0x7f91bb0380]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa1eb14)[0x7f91baab14]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa95048)[0x7f91c21048]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xaf9320)[0x7f91c85320]
/mount/ssd/rustc/lib/libstd-ba7780e7efc39bcb.so(rust_metadata_std_2d103c436cd22770+0x8c380)[0x7f9108b380]
/lib/aarch64-linux-gnu/libpthread.so.0(+0x77e4)[0x7f90e3d7e4]
/lib/aarch64-linux-gnu/libc.so.6(+0xcfadc)[0x7f90f48adc]
error: could not compile `rustfft`

Caused by:
  process didn't exit successfully: `rustc --crate-name rustfft --edition=2018 src/lib.rs --error-format=json --json=diagnostic-rendered-ansi --emit=dep-info,link -C opt-level=3 -C embed-bitcode=no --test --cfg 'feature="avx"' --cfg 'feature="default"' --cfg 'feature="neon"' --cfg 'feature="sse"' -C metadata=541f3979c27e3f0a -C extra-filename=-541f3979c27e3f0a --out-dir /home/pi/RustFFT/target/release/deps -L dependency=/home/pi/RustFFT/target/release/deps --extern num_complex=/home/pi/RustFFT/target/release/deps/libnum_complex-d7ededc7dd339a27.rlib --extern num_integer=/home/pi/RustFFT/target/release/deps/libnum_integer-0edf0ad8b3f42ac1.rlib --extern num_traits=/home/pi/RustFFT/target/release/deps/libnum_traits-7542682cf91f65c6.rlib --extern paste=/home/pi/RustFFT/target/release/deps/libpaste-49b243a423c645cd.so --extern primal_check=/home/pi/RustFFT/target/release/deps/libprimal_check-d5a43e363432ab49.rlib --extern rand=/home/pi/RustFFT/target/release/deps/librand-5cd04db47872812e.rlib --extern strength_reduce=/home/pi/RustFFT/target/release/deps/libstrength_reduce-1c2f1a65415e918e.rlib --extern transpose=/home/pi/RustFFT/target/release/deps/libtranspose-226cfc662b715315.rlib` (signal: 11, SIGSEGV: invalid memory reference)
warning: build failed, waiting for other jobs to finish...
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x826e48)[0x7f7f990e48]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x7f84ef8788]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x11b83a4)[0x7f803223a4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x16b1810)[0x7f8081b810]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e1f4)[0x7f815d81f4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e350)[0x7f815d8350]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246f570)[0x7f815d9570]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb5d674)[0x7f7fcc7674]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb48ba4)[0x7f7fcb2ba4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb4cfe8)[0x7f7fcb6fe8]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa24380)[0x7f7fb8e380]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa1eb14)[0x7f7fb88b14]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa95048)[0x7f7fbff048]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xaf9320)[0x7f7fc63320]
/mount/ssd/rustc/lib/libstd-ba7780e7efc39bcb.so(rust_metadata_std_2d103c436cd22770+0x8c380)[0x7f7f069380]
/lib/aarch64-linux-gnu/libpthread.so.0(+0x77e4)[0x7f7ee1b7e4]
/lib/aarch64-linux-gnu/libc.so.6(+0xcfadc)[0x7f7ef26adc]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x826e48)[0x7fafd07e48]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x7fb526f788]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x11b83a4)[0x7fb06993a4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x16b1810)[0x7fb0b92810]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e1f4)[0x7fb194f1f4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e350)[0x7fb194f350]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246f570)[0x7fb1950570]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb5d674)[0x7fb003e674]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb48ba4)[0x7fb0029ba4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb4cfe8)[0x7fb002dfe8]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa24380)[0x7faff05380]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa1eb14)[0x7fafeffb14]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa95048)[0x7faff76048]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xaf9320)[0x7faffda320]
/mount/ssd/rustc/lib/libstd-ba7780e7efc39bcb.so(rust_metadata_std_2d103c436cd22770+0x8c380)[0x7faf3e0380]
/lib/aarch64-linux-gnu/libpthread.so.0(+0x77e4)[0x7faf1927e4]
/lib/aarch64-linux-gnu/libc.so.6(+0xcfadc)[0x7faf29dadc]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x826e48)[0x7faa5c3e48]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x7fafb2b788]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x11b83a4)[0x7faaf553a4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x16b1810)[0x7fab44e810]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e1f4)[0x7fac20b1f4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e350)[0x7fac20b350]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246f570)[0x7fac20c570]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb5d674)[0x7faa8fa674]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb48ba4)[0x7faa8e5ba4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb4cfe8)[0x7faa8e9fe8]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa24380)[0x7faa7c1380]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa1eb14)[0x7faa7bbb14]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa95048)[0x7faa832048]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xaf9320)[0x7faa896320]
/mount/ssd/rustc/lib/libstd-ba7780e7efc39bcb.so(rust_metadata_std_2d103c436cd22770+0x8c380)[0x7fa9c9c380]
/lib/aarch64-linux-gnu/libpthread.so.0(+0x77e4)[0x7fa9a4e7e4]
/lib/aarch64-linux-gnu/libc.so.6(+0xcfadc)[0x7fa9b59adc]
error: build failed
@workingjubilee
Member

Please include rustc --version --verbose.

@HEnquist
Author

HEnquist commented Oct 6, 2021

Sure! It doesn't say that much unfortunately.

pi@raspberrypi:~ $ /mount/ssd/rustc/bin/rustc --version --verbose
rustc 1.57.0-dev
binary: rustc
commit-hash: unknown
commit-date: unknown
host: aarch64-unknown-linux-gnu
release: 1.57.0-dev
LLVM version: 13.0.0

@SparrowLii
Member

SparrowLii commented Oct 9, 2021

Can you find out which lines of code in the bench caused the crash? This might help to find the root cause.

@workingjubilee
Member

LLVM version: 13.0.0

This is what I wanted to make sure of.
The final "official" LLVM 13.0.0 release (Rust had been merging the release candidates to get a head start on testing) was pulled in shortly after you posted this, so it may be worth trying today's rustc.

@HEnquist
Author

It doesn't seem to matter much what my benches contain; the build fails no matter what. I just started building today's rustc and will try it as soon as it's ready (probably tomorrow, since it takes some time on a Raspberry Pi).

@workingjubilee
Member

aarch64-unknown-linux-gnu is a tier 1 target: It should be possible to download the latest nightly via rustup, no? No need to recompile it.

@HEnquist
Author

I need a newer stdarch than in the latest nightly, with all the vld* and vst* intrinsics.

@HEnquist
Author

The updated LLVM unfortunately made no difference. If I go back to the RustFFT version just before I started using the vld* and vst* intrinsics, it builds and benches fine. I'll try to figure out exactly what change triggers the crash. Unfortunately I'm a bit short on time these days, so it may take a while.

@HEnquist
Author

The crash seems to occur when I compile a benchmark and my FFTs use this function: https://github.com/HEnquist/RustFFT/blob/vldx/src/neon/neon_vector.rs#L156
It only fails with cargo bench; with cargo test everything is fine.

@SparrowLii
Member

We can replace vld2q_f64 with a fn with equivalent behavior and see if the crash will still happen:

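use core::arch::aarch64::float64x2x2_t;
use core::mem::transmute;

// Emulates vld2q_f64: reads 4 f64 values and de-interleaves them into two 2-lane vectors.
pub unsafe fn vld2q_f64_fake(a: *const f64) -> float64x2x2_t {
    let x: [f64; 4] = core::ptr::read_unaligned(a.cast());
    transmute([x[0], x[2], x[1], x[3]])
}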
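
For readers without an aarch64 machine, the lane ordering that the replacement above relies on can be sketched with plain arrays. This is only an illustrative model, not code from RustFFT or stdarch; the name `deinterleave2` is hypothetical. It assumes the documented behaviour of a factor-2 de-interleaving load: even-indexed elements go to the first vector, odd-indexed elements to the second.

```rust
/// Portable model of a factor-2 de-interleaving load.
/// `deinterleave2` is a hypothetical helper name used only for illustration.
fn deinterleave2(src: &[f64; 4]) -> ([f64; 2], [f64; 2]) {
    // Even-indexed elements form the first vector, odd-indexed the second.
    ([src[0], src[2]], [src[1], src[3]])
}

fn main() {
    // Two interleaved complex numbers laid out as (re0, im0, re1, im1).
    let data = [1.0, 10.0, 2.0, 20.0];
    let (re, im) = deinterleave2(&data);
    assert_eq!(re, [1.0, 2.0]);
    assert_eq!(im, [10.0, 20.0]);
    println!("re = {:?}, im = {:?}", re, im);
}
```

For complex FFT data this means one load yields a vector of real parts and a vector of imaginary parts, which is presumably why the intrinsic is attractive here.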

@HEnquist
Author

Using the vld2q_f64_fake instead of vld2q_f64 makes the benches build and run fine!

@HEnquist
Author

By the way, vld3q_f64 and vld4q_f64 cause no problems. No need for fake-versions of those to make the benches ok.

@SparrowLii
Member

That is interesting. I think vld2q_f64 may have special alignment requirements. Confirming this requires a closer analysis of LLVM's implementation of vld2. Unfortunately I am not good at this part.

@workingjubilee
Member

workingjubilee commented Oct 14, 2021

Can you show the assembly emitted for each of those intrinsics, as it looks in the final bench binary, @HEnquist? This will likely require a disassembly tool rather than relying on --emit=asm or similar. It also likely requires some surrounding assembly for context, hopefully not everything, just each bench test.

@hkratz
Contributor

hkratz commented Oct 14, 2021

I have looked at this a bit and can reproduce this on a Mac M1.

Rustc crashes in LLVM codegen:

Process 26362 stopped
* thread #7, name = 'LTO bench_rustfft_neon.f5e027c6-cgu.1', stop reason = EXC_BAD_ACCESS (code=1, address=0x400010a125b10)
    frame #0: 0x0000000100530534 librustc_driver-69ff7149a4f34321.dylib`llvm::CallInst::Create(llvm::FunctionType*, llvm::Value*, llvm::ArrayRef<llvm::Value*>, llvm::ArrayRef<llvm::OperandBundleDefT<llvm::Value*> >, llvm::Twine const&, llvm::Instruction*) + 320
librustc_driver-69ff7149a4f34321.dylib`llvm::CallInst::Create:
->  0x100530534 <+320>: ldr    x8, [x25, #0x10]
    0x100530538 <+324>: ldr    x1, [x8]
    0x10053053c <+328>: cbz    x20, 0x100530560          ; <+364>
    0x100530540 <+332>: mov    w8, #0x30

Full backtrace:

(lldb) bt
* thread #7, name = 'LTO bench_rustfft_neon.f5e027c6-cgu.1', stop reason = EXC_BAD_ACCESS (code=1, address=0x400010a125b10)
  * frame #0: 0x0000000100530534 librustc_driver-69ff7149a4f34321.dylib`llvm::CallInst::Create(llvm::FunctionType*, llvm::Value*, llvm::ArrayRef<llvm::Value*>, llvm::ArrayRef<llvm::OperandBundleDefT<llvm::Value*> >, llvm::Twine const&, llvm::Instruction*) + 320
    frame #1: 0x0000000100530294 librustc_driver-69ff7149a4f34321.dylib`llvm::IRBuilderBase::CreateCall(llvm::FunctionType*, llvm::Value*, llvm::ArrayRef<llvm::Value*>, llvm::Twine const&, llvm::MDNode*) + 80
    frame #2: 0x0000000101186b50 librustc_driver-69ff7149a4f34321.dylib`llvm::AArch64TargetLowering::lowerInterleavedLoad(llvm::LoadInst*, llvm::ArrayRef<llvm::ShuffleVectorInst*>, llvm::ArrayRef<unsigned int>, unsigned int) const + 852
    frame #3: 0x0000000101538c6c librustc_driver-69ff7149a4f34321.dylib`(anonymous namespace)::InterleavedAccess::runOnFunction(llvm::Function&) + 4868
    frame #4: 0x0000000101f42030 librustc_driver-69ff7149a4f34321.dylib`llvm::FPPassManager::runOnFunction(llvm::Function&) + 672
    frame #5: 0x0000000101f477c0 librustc_driver-69ff7149a4f34321.dylib`llvm::FPPassManager::runOnModule(llvm::Module&) + 52
    frame #6: 0x0000000101f42528 librustc_driver-69ff7149a4f34321.dylib`llvm::legacy::PassManagerImpl::run(llvm::Module&) + 856
    frame #7: 0x00000001003d8a40 librustc_driver-69ff7149a4f34321.dylib`LLVMRustWriteOutputFile + 692
    frame #8: 0x00000001002e16a4 librustc_driver-69ff7149a4f34321.dylib`rustc_codegen_llvm::back::write::write_output_file::h8c4897ade22bc53c + 204
    frame #9: 0x000000010034fff0 librustc_driver-69ff7149a4f34321.dylib`rustc_codegen_llvm::back::write::codegen::with_codegen::ha82c7a362395cd34 + 116
    frame #10: 0x00000001002e4bd4 librustc_driver-69ff7149a4f34321.dylib`rustc_codegen_llvm::back::write::codegen::h8d756782e432dc6c + 2524
    frame #11: 0x00000001003140e0 librustc_driver-69ff7149a4f34321.dylib`rustc_codegen_ssa::back::write::finish_intra_module_work::h079cbdb2f84c889e + 184
    frame #12: 0x000000010030f890 librustc_driver-69ff7149a4f34321.dylib`rustc_codegen_ssa::back::write::execute_work_item::hfb8dd85525a92ee7 + 780
    frame #13: 0x00000001003bf5f4 librustc_driver-69ff7149a4f34321.dylib`std::sys_common::backtrace::__rust_begin_short_backtrace::h086be9b8ac7cc110 + 176
    frame #14: 0x000000010032eea0 librustc_driver-69ff7149a4f34321.dylib`std::panicking::try::hb23e946ef2c82654 + 52
    frame #15: 0x000000010039114c librustc_driver-69ff7149a4f34321.dylib`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h7660a4fef4f4ca66 + 128
    frame #16: 0x0000000107747fb0 libstd-5be8030cf9a973ad.dylib`_$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h6f4298f91d78694f + 36
    frame #17: 0x000000010775b14c libstd-5be8030cf9a973ad.dylib`std::sys::unix::thread::Thread::new::thread_start::h947b820fbfb10caa + 36
    frame #18: 0x00000001884a7878 libsystem_pthread.dylib`_pthread_start + 320

Trying to narrow it down:

  1. It can also be reproduced by trying to build the tests with cargo +stage1 build --features neon --release --tests
  2. It also happens trying to build the asmtest.rs example with that switched to neon.
  3. If the modified asmtest.rs is added as a module in the library sources themselves, forcing monomorphization, building the library in release mode fails as well.

With --emit=llvm-ir I can already get a file that reproduces the LLVM codegen crash with llc, but it is too big, and the cause might still be bad IR emitted by rustc. I will look into it further when I have time.

@HEnquist
Author

HEnquist commented Feb 6, 2022

I didn't have any time to continue on this (and I think that I probably know too little about this stuff to be useful anyway). Did anyone else make any progress?

@Amanieu
Member

Amanieu commented Feb 21, 2022

Rust recently upgraded to LLVM 14, can you try this on the latest nightly to see if it is still an issue?

@HEnquist
Author

I'll try asap and report back!

@HEnquist
Author

Rust recently upgraded to LLVM 14, can you try this on the latest nightly to see if it is still an issue?

I just tried this, and unfortunately the newer LLVM doesn't seem to make any difference.

@Nugine
Contributor

Nugine commented Nov 23, 2022

I think rustc generates correct instructions.

https://developer.arm.com/architectures/instruction-sets/intrinsics/vld2q_f64

use core::arch::aarch64::*;

#[inline(never)]
pub unsafe fn vld2q_f64_real(p: *const f64) -> float64x2x2_t {
    vld2q_f64(p)
}

#[inline(never)]
pub unsafe fn vld2q_f64_fake(a: *const f64) -> float64x2x2_t {
    let x: [float64x1_t; 4] = core::ptr::read_unaligned(a.cast());
    core::mem::transmute([x[0], x[2], x[1], x[3]])
}

example::vld2q_f64_real:
        ld2     { v0.2d, v1.2d }, [x0]
        stp     q0, q1, [x8]
        ret

example::vld2q_f64_fake:
        ldp     d0, d2, [x0]
        ldp     d1, d3, [x0, #16]
        str     d0, [x8]
        str     d2, [x8, #16]
        str     d1, [x8, #8]
        str     d3, [x8, #24]
        ret
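
Beyond comparing the assembly, the two functions can also be compared at run time. The following is a minimal sketch, assuming a toolchain where `vld2q_f64` is available in `core::arch::aarch64`; the module name `neon_check` and the helper `expected_vld2q_layout` are hypothetical. The NEON part is compiled only on aarch64, so on other targets the program just prints the expected lane order.

```rust
/// Lane order a factor-2 de-interleaving load is expected to produce,
/// flattened into one array (hypothetical helper, for illustration only).
fn expected_vld2q_layout(src: &[f64; 4]) -> [f64; 4] {
    [src[0], src[2], src[1], src[3]]
}

#[cfg(target_arch = "aarch64")]
mod neon_check {
    use core::arch::aarch64::{float64x2x2_t, vld2q_f64};
    use core::mem::transmute;

    /// Loads through the real intrinsic and flattens the result.
    pub fn real(src: &[f64; 4]) -> [f64; 4] {
        // SAFETY: `src` points to four readable f64 values, and both types are 32 bytes.
        unsafe { transmute::<float64x2x2_t, [f64; 4]>(vld2q_f64(src.as_ptr())) }
    }

    /// Same load via the scalar replacement suggested earlier in the thread.
    pub fn fake(src: &[f64; 4]) -> [f64; 4] {
        // SAFETY: as above; the intermediate array has the same size as the vector pair.
        unsafe {
            let x: [f64; 4] = core::ptr::read_unaligned(src.as_ptr().cast());
            let v: float64x2x2_t = transmute([x[0], x[2], x[1], x[3]]);
            transmute(v)
        }
    }
}

fn main() {
    let data = [1.0, 2.0, 3.0, 4.0];
    let expected = expected_vld2q_layout(&data);

    // Only checked where NEON is available; skipped on other architectures.
    #[cfg(target_arch = "aarch64")]
    {
        assert_eq!(neon_check::real(&data), expected);
        assert_eq!(neon_check::fake(&data), expected);
    }

    println!("expected lane order: {:?}", expected);
}
```

A check like this only shows that both versions return the same values; it says nothing about the compiler crash itself, which happened during code generation rather than at run time.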

It may be related to a recent issue. The latest nightly has upgraded to LLVM 15.0.4.

@HEnquist
Author

I just got back to this after a small break :)
Things are working just fine on recent rustc versions.
