refactor to avoid UB references #1044
Conversation
cc @davidbarsky this is turning out to be a little more extensive than anticipated 😅
Taking a step back and scanning the reference and pointer aliasing rules, I don't THINK it's UB, but I can add a much simpler UnsafeCell wrapper around AlignedBuf to feel less weird about it. The original concern, that going from & to &mut is always UB, does not mean it's UB to take a raw pointer held by an immutable object and turn it into a mutable slice, so long as the rules are followed and a mutable reference is never allowed to exist concurrently with any other reference to an overlapping region. That guarantee is enforced by the way log reservation allocates non-overlapping sections of the IO buffer to specific threads, which then use something similar to Arc semantics to release them back to the log writer once all writers are complete.

Here's an extracted snippet which passes miri with no issues:

use std::{
    alloc::{alloc, dealloc, Layout},
    sync::Arc,
};

struct AlignedBuf(*mut u8, usize);

impl AlignedBuf {
    fn new(len: usize) -> AlignedBuf {
        let layout = Layout::from_size_align(len, 8192).unwrap();
        let ptr = unsafe { alloc(layout) };
        assert!(!ptr.is_null(), "failed to allocate critical IO buffer");
        AlignedBuf(ptr, len)
    }
}

impl Drop for AlignedBuf {
    fn drop(&mut self) {
        let layout = Layout::from_size_align(self.1, 8192).unwrap();
        unsafe {
            dealloc(self.0, layout);
        }
    }
}

pub(crate) struct IoBuf {
    buf: Arc<AlignedBuf>,
}

impl IoBuf {
    pub(crate) fn get_mut_range(
        &self,
        at: usize,
        len: usize,
    ) -> &'static mut [u8] {
        assert!(self.buf.1 >= at + len);
        unsafe { std::slice::from_raw_parts_mut(self.buf.0.add(at), len) }
    }
}

fn main() {
    let iobuf = IoBuf {
        buf: Arc::new(AlignedBuf::new(16)),
    };

    let mut_range = iobuf.get_mut_range(4, 2);

    // If you comment out either of the below
    // initializing writes, miri will error due
    // to use of uninitialized memory in the
    // println below.
    mut_range[0] = 8;
    mut_range[1] = 9;

    println!("hello {:?} :)", mut_range);
}
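The key claim above is that mutable slices handed out from the buffer never overlap, so no two &mut references ever alias. Here is a minimal, self-contained sketch of that idea (illustrative only, not sled's actual code): two threads each receive a mutable slice over a disjoint region of the same allocation, write to it, and the main thread reads everything back after both have finished.

use std::alloc::{alloc, dealloc, Layout};
use std::thread;

fn main() {
    let layout = Layout::from_size_align(16, 8192).unwrap();
    let ptr = unsafe { alloc(layout) };
    assert!(!ptr.is_null(), "failed to allocate buffer");

    // Hand out two non-overlapping mutable "reservations" derived from the
    // same raw pointer. Because the regions are disjoint, neither reborrow
    // invalidates the other.
    let a: &'static mut [u8] =
        unsafe { std::slice::from_raw_parts_mut(ptr, 8) };
    let b: &'static mut [u8] =
        unsafe { std::slice::from_raw_parts_mut(ptr.add(8), 8) };

    // Each thread owns exactly one disjoint region, mirroring how log
    // reservations assign sections of the IO buffer to writers.
    let ta = thread::spawn(move || a.fill(1));
    let tb = thread::spawn(move || b.fill(2));
    ta.join().unwrap();
    tb.join().unwrap();

    // Both halves are initialized now, so reading the whole buffer is fine.
    let all = unsafe { std::slice::from_raw_parts(ptr, 16) };
    println!("{:?}", all);

    unsafe { dealloc(ptr, layout) };
}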
Thanks for the update! I hope that I'm wrong with what I wrote in my report, and I'm glad Miri seems to agree. With that said, do you think it makes sense to set up some stateful/model-based tests for this code? Two caveats:

Anyways, thanks for digging into this and I'm glad I was wrong!
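For what it's worth, here is a rough sketch of the shape such a stateful/model-based test could take, using only std (a real harness would more likely use quickcheck or proptest and drive sled's actual APIs): a stream of pseudo-random writes is applied both to a raw aligned buffer through get_mut_range-style reborrows (the system under test) and to a plain Vec<u8> (the model), and the two are compared at the end. All names here are illustrative.

use std::alloc::{alloc, dealloc, Layout};

fn main() {
    const LEN: usize = 64;
    let layout = Layout::from_size_align(LEN, 8192).unwrap();
    let ptr = unsafe { alloc(layout) };
    assert!(!ptr.is_null(), "failed to allocate buffer");
    // The allocation starts uninitialized; zero it so it matches the model.
    unsafe { std::ptr::write_bytes(ptr, 0, LEN) };

    // The model: a plain, obviously-correct byte vector.
    let mut model = vec![0u8; LEN];

    // Tiny deterministic xorshift PRNG so failures are reproducible.
    let mut state: u64 = 0x9e37_79b9_7f4a_7c15;
    let mut next = move || {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        state
    };

    for _ in 0..1_000 {
        let at = next() as usize % LEN;
        let len = 1 + next() as usize % (LEN - at);
        let byte = next() as u8;

        // System under test: a get_mut_range-style mutable reborrow of the
        // raw buffer, used for this operation only.
        let range =
            unsafe { std::slice::from_raw_parts_mut(ptr.add(at), len) };
        range.fill(byte);

        // The same operation against the model.
        model[at..at + len].fill(byte);
    }

    // After the whole operation sequence, the buffer must match the model.
    let sut = unsafe { std::slice::from_raw_parts(ptr, LEN) };
    assert_eq!(sut, &model[..], "buffer diverged from the model");
    println!("model-based check passed");

    unsafe { dealloc(ptr, layout) };
}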
@davidbarsky I'm glad you brought it up! I'm also going to better document that method, which seemed suspicious to you, with a reference to this issue. In general, please feel free to loudly flag anything that feels like it may be broken. sled relies on a bit of unsafe code due to the design constraints around non-blocking reads and writes, and it's important that we are responsible about testing and checking our assumptions to reduce the risks inherent in such a design.

We already run address sanitizer, thread sanitizer, and leak sanitizer on every PR with a concurrent workload, and this has been quite helpful in finding many would-be memory bugs in the past, but it does not catch the use of uninitialized memory or possible unsoundness issues like what we're discussing here. @divergentdave's efforts with #937 have already caught a bug where we were sometimes using uninitialized memory (#1040), and I'm really excited about all of the work that he's been doing around this.

sled is still a beta database, but I am hopeful that over time our continued efforts around safety and reliability will make sled a great choice for use in correctness-critical systems. My (fairly conservative) expected date for sled 1.0 is January 2021. The final stretch to get there involves more formally specifying our correctness requirements and the causal hierarchy of control that must be applied to enforce our emergent safety properties, and applying rigorous model checking that exercises each of these control points. I'm calling this effort SLEDSAFE, and I'm really excited that sled is at this point after over 4 years of engineering effort. Please yell loudly if you ever suspect that sled might not be living up to these operational requirements:
Thanks again for bringing this up :)
I have another issue with this function: it returns a &'static mut [u8] with an unbounded lifetime, which can outlive the buffer it points into. Here's a snippet demonstrating the resulting use-after-free:

use std::{
    alloc::{alloc, dealloc, Layout},
    sync::Arc,
    cell::UnsafeCell,
};

struct AlignedBuf(*mut u8, usize);

impl AlignedBuf {
    fn new(len: usize) -> AlignedBuf {
        let layout = Layout::from_size_align(len, 8192).unwrap();
        let ptr = unsafe { alloc(layout) };
        assert!(!ptr.is_null(), "failed to allocate critical IO buffer");
        AlignedBuf(ptr, len)
    }
}

impl Drop for AlignedBuf {
    fn drop(&mut self) {
        let layout = Layout::from_size_align(self.1, 8192).unwrap();
        unsafe {
            dealloc(self.0, layout);
        }
    }
}

pub(crate) struct IoBuf {
    buf: Arc<UnsafeCell<AlignedBuf>>,
    base: usize,
}

impl IoBuf {
    pub(crate) fn get_mut_range(
        &self,
        at: usize,
        len: usize,
    ) -> &'static mut [u8] {
        let buf_ptr = self.buf.get();
        unsafe {
            assert!((*buf_ptr).1 >= at + len);
            std::slice::from_raw_parts_mut(
                (*buf_ptr).0.add(self.base + at),
                len,
            )
        }
    }
}

fn main() {
    let mut mut_range;
    {
        let iobuf = IoBuf {
            buf: Arc::new(UnsafeCell::new(AlignedBuf::new(16))),
            base: 0,
        };

        mut_range = iobuf.get_mut_range(4, 2);
        mut_range[0] = 8;
        mut_range[1] = 9;
    }

    // Use after free
    println!("hello {:?} :)", mut_range);
}
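One way to make that use-after-free unreachable, sketched below (this is an illustration only, not sled's actual fix, and MutRange is a made-up name): instead of returning a &'static mut [u8], return a guard that keeps a clone of the Arc alive for as long as the slice can be reached, so the allocation cannot be freed out from under it. Callers still have to guarantee that reserved ranges never overlap.

use std::{
    alloc::{alloc, dealloc, Layout},
    cell::UnsafeCell,
    ops::{Deref, DerefMut},
    sync::Arc,
};

struct AlignedBuf(*mut u8, usize);

impl AlignedBuf {
    fn new(len: usize) -> AlignedBuf {
        let layout = Layout::from_size_align(len, 8192).unwrap();
        let ptr = unsafe { alloc(layout) };
        assert!(!ptr.is_null(), "failed to allocate critical IO buffer");
        AlignedBuf(ptr, len)
    }
}

impl Drop for AlignedBuf {
    fn drop(&mut self) {
        let layout = Layout::from_size_align(self.1, 8192).unwrap();
        unsafe { dealloc(self.0, layout) }
    }
}

pub(crate) struct IoBuf {
    buf: Arc<UnsafeCell<AlignedBuf>>,
    base: usize,
}

// Guard over a reserved sub-range. Holding a clone of the Arc keeps the
// allocation alive for as long as the slice is reachable.
pub(crate) struct MutRange {
    _keepalive: Arc<UnsafeCell<AlignedBuf>>,
    ptr: *mut u8,
    len: usize,
}

impl Deref for MutRange {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }
}

impl DerefMut for MutRange {
    fn deref_mut(&mut self) -> &mut [u8] {
        unsafe { std::slice::from_raw_parts_mut(self.ptr, self.len) }
    }
}

impl IoBuf {
    pub(crate) fn get_mut_range(&self, at: usize, len: usize) -> MutRange {
        let buf_ptr = self.buf.get();
        unsafe {
            assert!((*buf_ptr).1 >= at + len);
            MutRange {
                _keepalive: self.buf.clone(),
                ptr: (*buf_ptr).0.add(self.base + at),
                len,
            }
        }
    }
}

fn main() {
    let mut mut_range;
    {
        let iobuf = IoBuf {
            buf: Arc::new(UnsafeCell::new(AlignedBuf::new(16))),
            base: 0,
        };
        mut_range = iobuf.get_mut_range(4, 2);
    }

    // The IoBuf is gone, but the Arc held by the guard keeps the allocation
    // alive, so these writes are no longer a use-after-free.
    mut_range[0] = 8;
    mut_range[1] = 9;
    println!("hello {:?} :)", &mut_range[..]);
}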