Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash when using SpiDevice::write (Because of a hidden alignment requirement) #295

Closed
mlthlschr opened this issue Dec 10, 2022 · 10 comments · Fixed by #395
Closed

Crash when using SpiDevice::write (Because of a hidden alignment requirement) #295

mlthlschr opened this issue Dec 10, 2022 · 10 comments · Fixed by #395
Labels
bug Something isn't working

Comments

@mlthlschr
Copy link

mlthlschr commented Dec 10, 2022

Hi, I am trying to write a small driver for an SPI device with registers. For that, I have the following function to write the instruction and then read the data:

pub fn read_registers<Spi>(
    spi: &mut Spi,
    first_register: u8,
    data: &mut [u8],
) -> Result<(), Spi::Error>
where
    Spi: SpiDevice,
    Spi::Bus: SpiBus,
{
    let cmd = 0x80 | first_register;
    spi.transaction(|b| {
       // this call crashes
        b.write(&[cmd])?;
        b.read(&mut data[..])?;
        Ok(())
    })?;

    Ok(())
}

However, it crashes on an ESP32-C3 with

PanicInfo {
    payload: Any { .. },
    message: Some(
        unsafe precondition(s) violated: ptr::copy_nonoverlapping requires that both pointer arguments are aligned and non-null and the specified memory ranges do not overlap,
    ),
    location: Location {
        file: "/home/malte/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panicking.rs",
        line: 90,
        col: 58,
    },
    can_unwind: false,
}
 
Backtrace:
 
0x420040d8
0x420040d8 - core::intrinsics::copy_nonoverlapping::runtime
    at /home/malte/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/intrinsics.rs:2280
0x42005178
0x42005178 - esp_hal_common::spi::Instance::write_bytes
    at /home/malte/.cargo/registry/src/github.com-1ecc6299db9ec823/esp-hal-common-0.3.0/src/spi.rs:1421
0x420080d8
0x420080d8 - esp_hal_common::spi::ehal1::<impl embedded_hal::spi::SpiBusWrite for esp_hal_common::spi::Spi<T>>::write
    at /home/malte/.cargo/registry/src/github.com-1ecc6299db9ec823/esp-hal-common-0.3.0/src/spi.rs:731
0x42006a1a
0x42006a1a - hal_driver_util::spi::read_registers::{{closure}}
    at /home/malte/code/hardware/ESP-rs/hal-driver-util/src/spi.rs:27
0x420049fa
0x420049fa - <esp_hal_common::spi::ehal1::SpiBusDevice<I,CS> as embedded_hal::spi::SpiDevice>::transaction::{{closure}}
    at /home/malte/.cargo/registry/src/github.com-1ecc6299db9ec823/esp-hal-common-0.3.0/src/spi.rs:907
0x42007c8a
0x42007c8a - critical_section::with
    at /home/malte/.cargo/registry/src/github.com-1ecc6299db9ec823/critical-section-1.1.1/src/lib.rs:226
0x42004934
0x42004934 - <esp_hal_common::spi::ehal1::SpiBusDevice<I,CS> as embedded_hal::spi::SpiDevice>::transaction
    at /home/malte/.cargo/registry/src/github.com-1ecc6299db9ec823/esp-hal-common-0.3.0/src/spi.rs:900
0x42006984
0x42006984 - hal_driver_util::spi::read_registers
    at /home/malte/code/hardware/ESP-rs/hal-driver-util/src/spi.rs:21

Do you know what is happening here? I am using not a line of unsafe code (in the code written by me) before calling this method.

I tried to read the register at a different place in the lib using spi.transfer() and there it works without problems.

Another observation: when I change the code inside the closure to

let cmd = [cmd;2];
b.write(&cmd);

it does not crash.
Using let cmd = [cmd;1]; however also crashes.

Update 1: this observation is not always true. Sometimes it does not work...

Update 2: this also happens when using transfer_in_place()

@mlthlschr mlthlschr changed the title Crash when using SpiDevice::read within a SpiDevice::transfer Crash when using SpiDevice::read within a SpiDevice::transaction Dec 10, 2022
@mlthlschr
Copy link
Author

Quite interesting problem, I learned something about alignment. Seems like datas alignment is not compatible with the one required for u32s (to which the slice's pointer is casted in esp_hal_common::spi::Instance::write_bytes).

For now I did a workaround with a helper struct #[repr(align(32))] struct AlignedBuffer([u8;64]) which aligns correctly. For the meantime it is OK, but would be cool to not depend on that.

@bjoernQ
Copy link
Contributor

bjoernQ commented Dec 12, 2022

Thanks for bringing this up! I also think this hidden alignment requirement is very unfortunate and should get addressed

@bjoernQ bjoernQ added the bug Something isn't working label Dec 12, 2022
@bjoernQ bjoernQ changed the title Crash when using SpiDevice::read within a SpiDevice::transaction Crash when using SpiDevice::read within a SpiDevice::transaction (Because of a hidden alignment requirement) Dec 12, 2022
@jessebraham jessebraham moved this to Todo in esp-rs Dec 12, 2022
@mlthlschr mlthlschr changed the title Crash when using SpiDevice::read within a SpiDevice::transaction (Because of a hidden alignment requirement) Crash when using SpiDevice::write (Because of a hidden alignment requirement) Dec 12, 2022
har7an added a commit to har7an/esp-hal that referenced this issue Dec 31, 2022
when transferring data on esp32c3 via SPI. The alignment error
originates in the call to `core::ptr::copy_nonoverlapping`, which
"requires that both pointer arguments are aligned and non-null and the
specified memory ranges do not overlap".

Closes esp-rs#295
@i404788
Copy link
Contributor

i404788 commented Jan 11, 2023

TL;DR: core::intrinsics::volatile_copy_nonoverlapping_memory::<u32> (and all it's variants) are broken when len<16 or if any parameter is not aligned (I think). *mut u32::volatile_write does work (afaik) but requires additional handling since you are always writing a full u32.


Hey was just reading through the code to figure out if QuadSPI/OctalSPI was already supported and found this issue through

// FIXME: Using any other transfer length **does not work**. I don't understand

I encountered the same issue when implementing SHA, there I implemented the AlignmentHelper

impl AlignmentHelper {
which is effectively just a u32 write buffer. I tried many workarounds (copy,volatile_copy) but the only thing that worked reliably was to use a *mut u32::volative_write in a loop like in
for (i, v) in to_write.chunks_exact(ALIGN_SIZE).enumerate() {
dst.add(i)
.write_volatile(U32_FROM_BYTES(v.try_into().unwrap()).to_be());
}
. What it seemed like from the assembly is that the small copies are also optimized out (len<16 => no memcpy) and then break.

@bjoernQ
Copy link
Contributor

bjoernQ commented Jan 30, 2023

I think that loop was what we initially had in the SPI driver - only downside IMHO is that in opt-level 0 it might be a bit slow but with opt-level >0 it's totally fine

@MabezDev @jessebraham any thoughts on this? I'd say we should replace volatile_copy_nonoverlapping_memory in all those places

@jessebraham
Copy link
Member

I think it's probably a good idea to make those changes, yeah. The only downside you listed is, IMO, not even really much of an issue in practice.

@MabezDev
Copy link
Member

Not really expressing much of an opinion, but in esp-idf, they return an error/panic if the buffer is not correctly aligned and doesn't have a length that is a multiple of a word. This pushes the requirements to the user of the API in a clear, albeit slightly annoying way (runtime error).

@dimpolo
Copy link
Contributor

dimpolo commented Jan 31, 2023

I actually encountered a miscompilation relating to this.

for byte in frame {
    let b = &[byte.reverse_bits()];
    spi.write(b)?;
    // core::hint::black_box(b);
}

Without the black box, only 0xFFs get sent.
After adding the black box the correct bytes get sent.

As far as I understand, there's actually three problems with the current code:

  • the mentioned alignment requirement
  • not using volatile
  • reading past the end of chunk

let fifo_ptr = reg_block.w0.as_ptr();
unsafe {
// It seems that `copy_nonoverlapping` is significantly faster than regular
// `copy`, by about 20%... ?
core::ptr::copy_nonoverlapping::<u32>(
chunk.as_ptr() as *const u32,
fifo_ptr as *mut u32,
// FIXME: Using any other transfer length **does not work**. I don't understand
// why.
FIFO_SIZE / 4,
);
}

@bjoernQ
Copy link
Contributor

bjoernQ commented Feb 1, 2023

Thanks for the insights. This definitely needs to get fixed (for all the reasons you mentioned)

Regarding the mis-compilation: on which targets do you see this?

@bjoernQ
Copy link
Contributor

bjoernQ commented Feb 1, 2023

Not really expressing much of an opinion, but in esp-idf, they return an error/panic if the buffer is not correctly aligned and doesn't have a length that is a multiple of a word. This pushes the requirements to the user of the API in a clear, albeit slightly annoying way (runtime error).

I think panicking / returning an error wouldn't be a really good option since those requirements are not requirement by the EH traits and would potentially prevent users from using existing drivers 🤷‍♂️

For our own APIs however, that might be okay if we document that clearly

@dimpolo
Copy link
Contributor

dimpolo commented Feb 1, 2023

Regarding the mis-compilation: on which targets do you see this?

.cargo/config.toml:

[build]
rustflags = [
  "-C", "link-arg=-nostartfiles",
  "-C", "link-arg=-Wl,-Tlinkall.x",
]
target = "xtensa-esp32s3-none-elf"

[unstable]
build-std = ["core"]

In my case the miscompilation went away after changing the code to:

for i in 0..FIFO_SIZE / 4 {
    unsafe {
        core::ptr::write_volatile(
            (fifo_ptr as *mut u32).add(i),
            *(chunk.as_ptr() as *const u32).add(i),
        );
    }
}

but this does not address the alignment or out of bounds read problem

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

6 participants