64-bit Array Pointer Misalignment on riscv32i #86693
Comments
To add more information, I've simplified the program to this:
#![cfg_attr(target_os = "none", no_std)]
#![cfg_attr(target_os = "none", no_main)]
#[derive(Copy, Clone)]
pub struct Scalar {
    pub bytes: [u8; 32],
}
impl Scalar {
    pub fn non_adjacent_form(&self, w: usize) -> [i8; 256] {
        let mut naf = [0i8; 256];
        let mut x_u64 = [0u64; 5];
        let x_u64_ptr_first = x_u64.as_ptr() as usize;
        println!("x_u64: {:x?}", x_u64);
        println!("&x_u64: 0x{:08x}", x_u64_ptr_first);
        println!(
            "Address is aligned? {} (remainder: {})",
            x_u64_ptr_first & 7 == 0,
            x_u64_ptr_first & 7
        );
        let x_u64_ptr_second = x_u64.as_ptr() as usize;
        assert_eq!(0, (x_u64.as_ptr() as usize) & 7);
        println!(
            "Passed assertion that 0 == {} {:08x} (or {} {:08x})",
            x_u64_ptr_first & 7,
            x_u64_ptr_first,
            x_u64_ptr_second & 7,
            x_u64_ptr_second
        );
        naf
    }
}
fn main() {
    let scalar = Scalar {
        bytes: [
            189, 59, 214, 8, 77, 86, 240, 50, 111, 170, 86, 37, 124, 154, 209, 79, 102, 72, 93, 53,
            130, 157, 102, 200, 60, 240, 215, 104, 246, 58, 214, 13,
        ],
    };
    let a_naf = scalar.non_adjacent_form(5);
}
When I run this on desktop Rust, it works just fine. However, when building for an embedded RISC-V target, running this program produces the following output:
There is no unsafe code here, yet the pointer appears to be changing.
What should the alignment be of a I think the strangest thing here is that
Updated the bug title to reflect the further root cause found. In summary:
Whether this is a bug in rustc or in llvm, I do not know, but hopefully this is a compact summary that can help those more knowledgeable than us route it to the correct maintainer to request a fix. It also occurred to me that it may somehow be the responsibility of the OS to intervene with stack alignments, but I can't think of a mechanism for guaranteeing that (it should be handled at the linker/compiler level). FWIW, this is all running on our own home-rolled OS, which is basically Rust on raw iron, and I can't think of a mechanism we could invoke to help guarantee such 64-bit alignments. Everything starts aligned to 4k pages, anyway...
As mentioned by @bunnie , the root issue appears to be that
Here's a minimum reproduction:
pub fn alignment_check() {
    println!("Alignment of u64: {}", core::mem::align_of::<u64>());
    let single_u64 = 443u64;
    println!("Address of single_u64: {:08x}", &single_u64 as *const u64 as usize);
}
On
So it should be aligned, and some optimizations are turned on assuming that it is aligned, but it isn't actually aligned.
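The check being described can be sketched as a small helper that compares a value's address against `core::mem::align_of` for its type. This is only an illustration: the name `is_aligned` is hypothetical and not part of the original reproduction, and on a target whose runtime honours the ABI's stack alignment it would be expected to report that the value is aligned.

```rust
use core::mem::align_of;

/// Returns true if the reference's address is a multiple of the alignment
/// the compiler reports for `T`. `is_aligned` is a hypothetical helper name.
fn is_aligned<T>(value: &T) -> bool {
    (value as *const T as usize) % align_of::<T>() == 0
}

fn main() {
    let single_u64 = 443u64;
    println!("Alignment of u64: {}", align_of::<u64>());
    println!(
        "Address of single_u64: {:08x}",
        &single_u64 as *const u64 as usize
    );
    println!("single_u64 aligned? {}", is_aligned(&single_u64));
}
```

Running the same helper on the affected target would make the mismatch between the reported alignment and the actual address visible in a single line of output.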
Summarized as a haiku: alignment is 8
Note that a workaround is to use a newtype. For example:
#[repr(align(32))]
struct AlignedU64Slice<const N: usize>([u64; N]);
Then you can refer to it using something like:
let my_slice = AlignedU64Slice([0u64; 5]);
I'm not clear why the align needs to be so large, but anything else causes it to not be properly aligned. It could be a misunderstanding of what
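Put together, the newtype workaround might look like the following self-contained sketch. The struct and the `my_slice` binding come from the comment above; the printed diagnostics are an added assumption for illustration, not something from the thread.

```rust
use core::mem::{align_of, size_of};

/// Newtype from the workaround: requests 32-byte alignment for the array.
#[repr(align(32))]
struct AlignedU64Slice<const N: usize>([u64; N]);

fn main() {
    let my_slice = AlignedU64Slice([0u64; 5]);
    // Access the wrapped array through field `.0`.
    let addr = my_slice.0.as_ptr() as usize;
    println!("align_of: {}", align_of::<AlignedU64Slice<5>>());
    println!("size_of: {}", size_of::<AlignedU64Slice<5>>());
    println!("address: 0x{:08x} (remainder mod 32: {})", addr, addr % 32);
}
```

Note that `repr(align(32))` also rounds the size of the type up to a multiple of 32, so the wrapper occupies more stack space than the bare `[u64; 5]`.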
thanks @xobs for the work-around!
For 32-bit targets on RISC-V, we have encountered a problem in the non_adjacent_form() routine inside Scalar. As best we can tell, there is a compiler bug which is causing 64-bit types on 32-bit platforms to not be properly aligned (see issue rust-lang/rust#86693). This bug results in the lower 32 bits of a 64-bit number being replicated into the top 32 bits, as a result of pointer arithmetic optimizations that rely on correct alignment of the data types. So far, we have only seen this problem inside one function that affects ed25519 verifications, but it could be in other libraries we haven't used yet. Thanks to @xobs we have a work-around, which is to wrap the `[u64; 5]` type inside an `AlignedU64Slice` `newtype` which is annotated with a `#[repr(align(32))]`. In our tests this causes the ed25519 signature tests to pass on our platform. Unfortunately, we are unable to get the bug to express on x86 using a u32 backend; the bug seems to be specific to RISC-V. However, I think this patch is light-fingered enough that it might be worth considering absorbing into the crate. Based on the age of similar bugs filed against the Rust project, it may take some months or even years before this gets properly addressed...
Update: the issue was due to The current ABI assumes the stack is aligned to 16 bytes. Unfortunately on the target system, the stack was only 4-byte aligned. As a result, some optimizations were triggered that were causing this unfortunate miscompilation. Aligning
turns out we just needed to align $sp to 16-byte boundaries
This reverts commit 279bbc5.
Yep, looks like the issue is due to a mis-alignment of https://github.com/riscv/riscv-eabi-spec/blob/master/EABI.adoc#4-eabi-stack-alignment Thanks for tracing this down @xobs! I'll close this for now, seems not to be an issue with Rust/llvm and actually an issue with our runtime.
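For illustration, the arithmetic a runtime could use to bring an initial stack pointer onto a 16-byte boundary can be sketched as below. This is not the actual fix applied to the runtime in question, which the thread does not show; the function name `align_stack_top` and the example address are hypothetical.

```rust
/// Stack alignment the thread says the ABI assumes, in bytes.
const STACK_ALIGN: usize = 16;

/// Round a candidate stack-top address down to the nearest 16-byte boundary.
/// The stack grows downward on RISC-V, so rounding down keeps the pointer
/// inside the reserved region. `align_stack_top` is a hypothetical name.
fn align_stack_top(addr: usize) -> usize {
    addr & !(STACK_ALIGN - 1)
}

fn main() {
    // Hypothetical stack top that is 4-byte aligned but not 16-byte aligned.
    let unaligned = 0x4000_0ffc;
    let aligned = align_stack_top(unaligned);
    println!("0x{:08x} -> 0x{:08x}", unaligned, aligned);
    println!("remainder mod 16: {}", aligned % STACK_ALIGN);
}
```

With the stack pointer on a 16-byte boundary at entry, the compiler's assumptions about the alignment of 64-bit locals hold without the newtype workaround.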
I think this is a bug, which is exhibited on a riscv32imac target, but not on an x86 target.
I tried this code (see it on godbolt / playground); the version below has the `log` statements in there so you can line up the code against log data I provide later in this bug:
I expected to see this happen:
This is a snippet from the `curve25519-dalek` crate's `scalar.rs` module, which is used in `ed25519` signatures. Fortunately, there are well-known test vectors for curve25519, so I'm able to give you expected inputs and outputs.
Instead, this happened:
Basically, on the first iteration of the loop, `bit_buf` should be loaded with the value of `x_u64[0] >> bit_idx`, and because `bit_idx` is 0, `bit_buf` should be exactly `x_u64[0]`.
Instead, I find that the value of `x_u64[0]` is 32f0564d_08d63bbd, and the value of `bit_buf` is 8d63bbd_08d63bbd. The key symptom is that instead of representing a whole 64-bit value, `bit_buf` seems to be just the lower 32 bits of the 64-bit value, replicated up to the upper 32 bits.
This problem continues in later iterations of the loop, with the upper 32 bits taking a copy of the lower 32 bits.
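The symptom can be reproduced arithmetically on any host. The sketch below assumes that `x_u64[0]` is the little-endian read of the first eight bytes of the scalar, which is consistent with the reported value; the helper `replicate_low32` is hypothetical and only models the corrupted result, not the miscompiled code itself.

```rust
/// Models the observed corruption: the lower 32 bits copied into the upper
/// 32 bits. `replicate_low32` is a hypothetical helper for illustration.
fn replicate_low32(x: u64) -> u64 {
    let low = x & 0xffff_ffff;
    (low << 32) | low
}

fn main() {
    // First eight bytes of the scalar from the reproduction program.
    let first_bytes: [u8; 8] = [189, 59, 214, 8, 77, 86, 240, 50];
    // Assumption: x_u64[0] is these bytes read as a little-endian u64.
    let x0 = u64::from_le_bytes(first_bytes);
    let bit_idx = 0;
    let expected_bit_buf = x0 >> bit_idx;
    let observed_bit_buf = replicate_low32(x0);
    println!("x_u64[0]:         {:016x}", x0);
    println!("expected bit_buf: {:016x}", expected_bit_buf);
    println!("observed bit_buf: {:016x}", observed_bit_buf);
}
```

The `{:016x}` format zero-pads to 16 hex digits, so the observed value prints with a leading `0` that the log excerpt above omits.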
Meta
`rustc --version --verbose`:
I feel like there should be some mention of the RISC-V target toolchain version as well, but the bug template does not include an instruction on how to extract that. To wit, I was able to reproduce the bug using a Linux-based toolchain running Renode, as well as using a Windows-based toolchain running on live hardware (using the VexRISCV implementation on an FPGA).
Unfortunately, because our embedded target does not have a functional BACKTRACE facility, I'm unable to include a backtrace, but hopefully the print-logs above are helpful enough.
I'll continue to poke at this some and if I can find a simpler test case, I'll add a follow-up note here.