GlobalDlmalloc reentrancy on panics #43

Open
hanna-kruppe opened this issue Oct 1, 2024 · 2 comments

@hanna-kruppe (Contributor)

While looking through the global allocator implementations, I've noticed that the code doesn't explicitly guard against reentrancy when acquiring the lock. If the allocator code panics for any reason (most likely a debug assertion), then the #[panic_handler] or panic hook will run and potentially allocate recursively. This would be problematic because:

  1. It creates a second &mut Dlmalloc to the global DLMALLOC instance while another mutable reference to it exists in another stack frame, though you could argue that this is fine because execution will be aborted, so the first reference is no longer live.
  2. More importantly, the allocator logic most likely isn't written to support reentrancy, and even if it were, the panic happened because some invariant was broken, so further allocations can't be serviced reliably.

Looking through the existing platform-specific implementations of the global lock:

  • On Unix, a normal ("fast") pthread mutex is used. According to the man page, this will deadlock reliably if the same thread tries to acquire the mutex again, which is not ideal but safe (for contrast, see the error-checking mutex sketch after this list).
  • On Windows, an SRWLOCK is used in exclusive mode. It doesn't support recursive acquisition, but the documentation implies that a recursive exclusive acquisition will reliably deadlock (mixing in shared mode is a crapshoot). Again, not great but safe.
  • Xous doesn't implement this lock because it doesn't have global allocator support.
  • On wasm, there's no lock at all.
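For contrast with the Unix behavior above, here's a hypothetical sketch, using the libc crate, of how an error-checking mutex surfaces recursive acquisition as an error instead of deadlocking. This isn't the crate's code, just an illustration.

fn main() {
    unsafe {
        // Build an attribute object selecting the error-checking mutex type.
        let mut attr: libc::pthread_mutexattr_t = std::mem::zeroed();
        assert_eq!(libc::pthread_mutexattr_init(&mut attr), 0);
        assert_eq!(
            libc::pthread_mutexattr_settype(&mut attr, libc::PTHREAD_MUTEX_ERRORCHECK),
            0
        );
        let mut mutex: libc::pthread_mutex_t = std::mem::zeroed();
        assert_eq!(libc::pthread_mutex_init(&mut mutex, &attr), 0);
        assert_eq!(libc::pthread_mutex_lock(&mut mutex), 0);
        // A second acquisition from the same thread returns EDEADLK here;
        // the default ("fast") type would deadlock instead.
        assert_eq!(libc::pthread_mutex_lock(&mut mutex), libc::EDEADLK);
    }
}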

The following program, compiled to wasm32-unknown-unknown (Rust 1.81, debug mode, crate-type cdylib), demonstrates that reentrancy really happens. The bogus dealloc call causes an assertion failure and then the use of format! in the panic hook allocates a new string.

use std::alloc::{self, Layout};

extern "C" {
    fn exfiltrate_str(data: *const u8, len: usize);
}

#[no_mangle]
pub extern "C" fn entry_point() {
    std::panic::set_hook(Box::new(|info| {
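        // format! allocates; if the panic originated inside the allocator,
        // this call re-enters it.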
        let msg = format!("panic at {:?}", info.location());
        unsafe { exfiltrate_str(msg.as_ptr(), msg.len()); }
    }));
    let layout = Layout::from_size_align(32, 8).unwrap();
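    // Deallocating a pointer that was never allocated trips a debug
    // assertion inside dlmalloc, which panics.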
    unsafe {
        alloc::dealloc(64 as *mut u8, layout);
    }
}

I cobbled together a wasmtime embedder that provides a definition for exfiltrate_str and the output is: panic at Some(Location { file: "/rust/deps/dlmalloc-0.2.6/src/dlmalloc.rs", line: 1192, col: 9 }).
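For reference, a minimal embedder along those lines might look like the sketch below. This is a reconstruction rather than the exact embedder I used; the "demo.wasm" module path and the "env" import module name are assumptions.

use wasmtime::*;

fn main() -> anyhow::Result<()> {
    let engine = Engine::default();
    let module = Module::from_file(&engine, "demo.wasm")?;
    let mut linker = Linker::new(&engine);
    // Host implementation of `exfiltrate_str`: copy `len` bytes out of the
    // guest's linear memory at offset `data` and print them.
    linker.func_wrap(
        "env",
        "exfiltrate_str",
        |mut caller: Caller<'_, ()>, data: u32, len: u32| {
            let memory = caller
                .get_export("memory")
                .and_then(|e| e.into_memory())
                .expect("guest exports linear memory");
            let mut buf = vec![0u8; len as usize];
            memory.read(&caller, data as usize, &mut buf).unwrap();
            println!("{}", String::from_utf8_lossy(&buf));
        },
    )?;
    let mut store = Store::new(&engine, ());
    let instance = linker.instantiate(&mut store, &module)?;
    let entry = instance.get_typed_func::<(), ()>(&mut store, "entry_point")?;
    // The call traps after the panic hook runs (the panic becomes an abort),
    // so an Err here is expected.
    let _ = entry.call(&mut store, ());
    Ok(())
}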

How important is this? Probably not very, at least as long as it's confined to wasm. Triggering an assertion in the allocator almost certainly means you already have UB somewhere. If the reentrancy causes a crash during execution of the panic hook/handler instead of right afterwards, at worst it makes debugging a bit harder. But since I already dug into it, I figured I might as well report it.


Finally, after writing this up, I realized that you can cause similar issues on non-wasm targets by compiling with panic=unwind and catching the panic. While this could cause more damage, properly supporting other targets is explicitly not a goal of the crate, so it's less interesting.

@alexcrichton (Owner)

Thanks for writing this up! In lieu of a larger rewrite to base everything on &self instead of &mut self, which in retrospect is probably a better idea, would you agree that a small reentrancy protection might make sense? For example, an AtomicBool tracking whether the allocator has been entered, checked unconditionally on all platforms so that wasm, for example, aborts on reentrancy?
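For what it's worth, a minimal sketch of that idea might look like the following. All names here are hypothetical, not actual code from the crate; it uses std::process::abort for simplicity even though the crate is no_std, and on the platforms that have a lock the check would presumably run while the lock is held.

use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical sketch of the proposed guard; not the crate's actual code.
static ENTERED: AtomicBool = AtomicBool::new(false);

struct ReentrancyGuard;

impl ReentrancyGuard {
    fn enter() -> ReentrancyGuard {
        if ENTERED.swap(true, Ordering::Acquire) {
            // Re-entered the allocator (e.g. from a panic hook that
            // allocates): abort instead of handing out a second
            // &mut Dlmalloc.
            std::process::abort();
        }
        ReentrancyGuard
    }
}

impl Drop for ReentrancyGuard {
    fn drop(&mut self) {
        // Clear the flag once the allocator call completes normally.
        ENTERED.store(false, Ordering::Release);
    }
}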

@hanna-kruppe (Contributor, Author)

Sure, that would fix the problem on wasm and could provide a nicer error message on other platforms. I only briefly thought about possible remedies while writing this up, but that was also the first thing that came to my mind.
