
Crazy idea: Raw(Box/Vec/Deque)Impl<T> traits #63

Open
sollyucko opened this issue Jul 2, 2020 · 3 comments

sollyucko commented Jul 2, 2020

Rationale

  1. The current allocator APIs and proposals require using much more unsafe code than necessary.
  2. The current APIs do not allow for type-specific allocators (e.g. slab allocators).
  3. The current APIs for re-allocation always copy the entire block of data as-is.
  4. The current APIs use a fixed set of metadata associated with the pointer (the pointer itself + its alignment).

Core traits

These traits do not currently include a generic way to create instances; this is expected to be done through methods on allocators. Additionally, these traits do not control zeroing; this is expected to depend on the type of the allocator. (Are there any cases where creation/resizing is sometimes non-zeroing and sometimes zeroing, for the same value?)

Instances of these traits should free their memory on drop.

/// # Safety
/// - When converted into raw pointers, the references returned by Deref(Mut) must be valid until:
///   - `Drop::drop` is called
pub unsafe trait RawBoxImpl<T: ?Sized>: DerefMut<Target = T> {}

pub unsafe trait RawBoxSizedImpl<T>: RawBoxImpl<T> + Sized {
    type OfMaybeUninit: RawBoxMaybeUninitImpl<T, Init=Self>;
}

pub unsafe trait RawBoxMaybeUninitImpl<T>: RawBoxImpl<MaybeUninit<T>> + Sized {
    type Init: RawBoxSizedImpl<T, OfMaybeUninit=Self>;
    
    fn initialize(self, value: T) -> Self::Init;
}

/// # Safety
/// - When converted into raw pointers, the references returned by Deref(Mut) must be valid until:
///   - `Drop::drop` is called
///   - **`grow` is called**
///   - **`shrink` is called**
pub unsafe trait RawVecImpl<T>: RawBoxImpl<[MaybeUninit<T>]> {
    fn grow(
        &mut self,
        min_new_capacity: usize,
        target_capacity: usize,
        copier: impl FnOnce(&Cell<Self::Target>, &Cell<Self::Target>), // old/src, new/dest; caller must catch and re-throw panics
    ) -> Option<()>;
    fn grow_in_place(
        &mut self,
        min_new_capacity: usize,
        target_capacity: usize,
        copier: impl FnOnce(&Cell<Self::Target>, &Cell<Self::Target>),
    ) -> Option<()>;
    fn shrink(
        &mut self,
        target_capacity: usize,
        copier: impl FnOnce(&Cell<Self::Target>, &Cell<Self::Target>),
    ) -> Option<()>;
    fn shrink_in_place(
        &mut self,
        target_capacity: usize,
        copier: impl FnOnce(&Cell<Self::Target>, &Cell<Self::Target>),
    ) -> Option<()>;
}

/// This has identical methods to RawVecImpl. The difference is in the contract:
/// calls to `*_in_place` may grow/shrink on both ends.
///
/// # Safety
/// - When converted into raw pointers, the references returned by Deref(Mut) must be valid until:
///   - `Drop::drop` is called
///   - `grow` is called
///   - `shrink` is called
///   - **`grow_in_place` is called**
///   - **`shrink_in_place` is called**
pub unsafe trait RawVecDequeImpl<T>: RawBoxImpl<[MaybeUninit<T>]> {
    fn grow(
        &mut self,
        min_new_capacity: usize,
        target_capacity: usize,
        copier: impl FnOnce(&Cell<Self::Target>, &Cell<Self::Target>),
    ) -> Option<()>;
    fn grow_in_place(
        &mut self,
        min_new_capacity: usize,
        target_capacity: usize,
        copier: impl FnOnce(&Cell<Self::Target>, &Cell<Self::Target>),
    ) -> Option<()>;
    fn shrink(
        &mut self,
        target_capacity: usize,
        copier: impl FnOnce(&Cell<Self::Target>, &Cell<Self::Target>),
    ) -> Option<()>;
    fn shrink_in_place(
        &mut self,
        target_capacity: usize,
        copier: impl FnOnce(&Cell<Self::Target>, &Cell<Self::Target>),
    ) -> Option<()>;
}

Additional trait

This is used by types like Rc and Arc.

/// # Safety
/// - `Foo::unfreeze(foo.freeze())` must be equivalent to `foo`.
pub unsafe trait Freeze: Sized {
    type Frozen: Copy + Sized;

    fn freeze(self) -> Self::Frozen;

    /// # Safety
    /// After calling unfreeze, copies of `frozen` must not be used in any way.
    unsafe fn unfreeze(frozen: Self::Frozen) -> Self;
}

#[repr(transparent)]
pub struct FrozenCell<T>(UnsafeCell<T>);

impl<T> FrozenCell<T> {
    pub fn from_mut(t: &mut T) -> &Self {
        // SAFETY:
        // `FrozenCell<T>` is a `repr(transparent)` wrapper around `UnsafeCell<T>`,
        // and `UnsafeCell<T>` is a `repr(transparent)` wrapper around `T`,
        // so the cast preserves layout.
        unsafe { &*(t as *mut T as *const FrozenCell<T>) }
    }

    /// # Safety
    /// - When calling this function, references resulting from calling `deref`
    ///   on any copy of `self` must not exist.
    /// - After calling this function, `deref` must not be called on any copy
    ///   of `self`.
    #[allow(clippy::mut_from_ref)]
    pub unsafe fn assert_unique(&self) -> &mut T {
        // SAFETY:
        // The contract of this method disallows aliasing.
        &mut *self.0.get()
    }
}

impl<T> Deref for FrozenCell<T> {
    type Target = T;

    fn deref(&self) -> &T {
        // SAFETY:
        // Any number of shared references are okay. `assert_unique` is the only
        // method that produces a `&mut T`, and its contract disallows aliasing.
        unsafe { &*self.0.get() }
    }
}

Usage examples

pub mod boxed {
    use crate::alloc::RawBoxImpl;
    use std::marker::PhantomData;
    use std::mem;
    use std::ops::{Deref, DerefMut};

    pub struct Box<T, Raw>(Raw, PhantomData<T>);

    impl<T, Raw: RawBoxImpl<T>> Box<T, Raw> {
        pub fn new_from_raw(mut raw: Raw, value: T) -> Self {
            *raw = value;
            Self(raw, PhantomData)
        }

        pub fn into_raw(mut b: Self) -> *mut T {
            let result = b.0.deref_mut() as *mut T;
            mem::forget(b);
            result
        }

        pub fn leak<'a>(b: Self) -> &'a mut T
        where
            T: 'a,
        {
            let ptr = Box::into_raw(b);
            // SAFETY:
            // Since `mem::forget` (via `Box::into_raw`) has been called on `b`, and therefore effectively `b.0`,
            // no safe code can ever cause `Drop::drop` to be called on it.
            unsafe { &mut *ptr }
        }
    }

    impl<T, Raw: RawBoxImpl<T>> Deref for Box<T, Raw> {
        type Target = T;

        fn deref(&self) -> &T {
            self.0.deref()
        }
    }

    impl<T, Raw: RawBoxImpl<T>> DerefMut for Box<T, Raw> {
        fn deref_mut(&mut self) -> &mut T {
            self.0.deref_mut()
        }
    }
}

pub mod rc {
    use crate::alloc::{RawBoxMaybeUninitImpl, RawBoxSizedImpl};
    use crate::freeze::Freeze;
    use std::cell::Cell;
    use std::intrinsics::abort;
    use std::marker::PhantomData;
    use std::ops::Deref;
    use std::ptr;

    mod internal {
        // Allow public-in-private due to type bounds
        use std::cell::Cell;

        pub struct RcBox<T: ?Sized> {
            pub(in crate::rc) strong: Cell<usize>,
            pub(in crate::rc) weak: Cell<usize>,
            pub(in crate::rc) value: T,
        }
    }
    use self::internal::RcBox;

    pub struct Rc<T, Raw: Freeze>(Raw::Frozen, PhantomData<T>)
    where
        Raw::Frozen: Deref<Target = RcBox<T>>;

    impl<T, Raw: Freeze + RawBoxSizedImpl<RcBox<T>>> Rc<T, Raw>
    where
        Raw::Frozen: Deref<Target = RcBox<T>>,
    {
        pub fn new_init_from_raw(raw_maybe_uninit: Raw::OfMaybeUninit, value: T) -> Rc<T, Raw> {
            let raw = raw_maybe_uninit.initialize(RcBox {
                strong: Cell::new(1),
                weak: Cell::new(1),
                value,
            });
            Self(raw.freeze(), PhantomData)
        }
    }

    impl<T, Raw: Freeze> Rc<T, Raw>
    where
        Raw::Frozen: Deref<Target = RcBox<T>>,
    {
        #[inline]
        pub fn strong(&self) -> usize {
            self.0.strong.get()
        }

        #[inline]
        fn inc_strong(&self) {
            let strong = self.strong();
            if strong == 0 || strong == usize::MAX {
                abort();
            }
            self.0.strong.set(strong + 1);
        }

        #[inline]
        fn dec_strong(&self) {
            self.0.strong.set(self.strong() - 1);
        }

        #[inline]
        pub fn weak(&self) -> usize {
            self.0.weak.get()
        }

        #[allow(dead_code)]
        #[inline]
        fn inc_weak(&self) {
            let weak = self.weak();
            if weak == 0 || weak == usize::MAX {
                abort();
            }
            self.0.weak.set(weak + 1);
        }

        #[inline]
        fn dec_weak(&self) {
            self.0.weak.set(self.weak() - 1);
        }
    }

    impl<T, Raw: Freeze> Clone for Rc<T, Raw>
    where
        Raw::Frozen: Deref<Target = RcBox<T>>,
    {
        fn clone(&self) -> Self {
            self.inc_strong();
            Self(self.0, PhantomData)
        }
    }

    impl<T, Raw: Freeze> Drop for Rc<T, Raw>
    where
        Raw::Frozen: Deref<Target = RcBox<T>>,
    {
        fn drop(&mut self) {
            self.dec_strong();
            if self.strong() == 0 {
                // SAFETY:
                // Since the *strong* reference count is now zero, we know that there must not
                // be any other references to the *value*
                unsafe {
                    ptr::drop_in_place(&self.0.deref().value as *const T as *mut T);
                }
                self.dec_weak();
                if self.weak() == 0 {
                    // SAFETY:
                    // Since the *weak* reference count is now zero, we know that there must not
                    // be any other references to the *allocation*
                    unsafe { drop(Raw::unfreeze(self.0)) }
                }
            }
        }
    }
}

pub mod vec {
    use crate::alloc::RawVecImpl;
    use std::marker::PhantomData;
    use std::mem::MaybeUninit;
    use std::ptr;

    pub struct Vec<T, Raw: RawVecImpl<T>> {
        buf: Raw,
        len: usize,
        phantom: PhantomData<T>,
    }

    impl<T, Raw: RawVecImpl<T>> Vec<T, Raw> {
        pub fn len(&self) -> usize {
            self.len
        }

        pub fn is_empty(&self) -> bool {
            self.len == 0
        }

        pub fn reserve(&mut self, additional: usize) {
            let len = self.len;
            self.buf
                .grow(self.len + additional, self.len * 2, |old_cell, new_cell| {
                    let old = old_cell.as_slice_of_cells();
                    let new = new_cell.as_slice_of_cells();
                    debug_assert!(old.len() >= len);
                    debug_assert!(new.len() >= len + additional);
                    if old.is_empty() {
                        return;
                    }
                    let old_ptr = old_cell.as_ptr() as *mut T as *const T;
                    let new_ptr = new_cell.as_ptr() as *mut T;
                    // SAFETY:
                    // - Both pointers were derived from references.
                    // - The length copied is at most the length of the slice.
                    // - `old` will no longer be used after this closure returns.
                    unsafe {
                        ptr::copy(old_ptr, new_ptr, len);
                    }
                });
        }
    }
}

See also

sollyucko commented Jul 11, 2020

I was able to implement this API for a typestate-style safe upward bump allocator:

pub struct UpAlloc<'a, T>(&'a mut [MaybeUninit<T>]);

impl<'a, T> UpAlloc<'a, T> {
    pub fn new(buf: &'a mut [MaybeUninit<T>]) -> Self {
        UpAlloc(buf)
    }

    pub fn alloc_uninit(&'a mut self) -> Option<(UpBox<'a, MaybeUninit<T>>, UpAlloc<'a, T>)> {
        let (start, remainder) = self.0.split_first_mut()?;
        Some((UpBox(start), UpAlloc(remainder)))
    }

    pub fn alloc_init(&'a mut self, value: T) -> Option<(UpBox<'a, T>, UpAlloc<'a, T>)> {
        let (start, remainder) = self.0.split_first_mut()?;
        let start = start.write(value);
        Some((UpBox(start), UpAlloc(remainder)))
    }

    pub fn alloc_slice_uninit(
        &'a mut self,
        min_len: usize,
        target_len: usize,
    ) -> Option<(UpBox<'a, [MaybeUninit<T>]>, UpAlloc<'a, T>)> {
        if min_len > self.0.len() {
            None
        } else if target_len > self.0.len() {
            Some((UpBox(self.0), UpAlloc(&mut [])))
        } else {
            let (start, remainder) = self.0.split_at_mut(target_len);
            Some((UpBox(start), UpAlloc(remainder)))
        }
    }
}

pub struct UpBox<'a, T: ?Sized>(&'a mut T);

impl<'a, T: ?Sized> Deref for UpBox<'a, T> {
    type Target = T;

    fn deref(&self) -> &T {
        self.0
    }
}

impl<'a, T: ?Sized> DerefMut for UpBox<'a, T> {
    fn deref_mut(&mut self) -> &mut T {
        self.0
    }
}

// SAFETY:
// - `self.0` is a reference, so it stays valid until the `UpBox` is dropped
// - `self.0` is never reassigned, so `Deref(Mut)` always returns the same referent
unsafe impl<'a, T> RawBoxImpl<T> for UpBox<'a, T> {}

unsafe impl<'a, T> RawBoxSizedImpl<T> for UpBox<'a, T> {
    type OfMaybeUninit = UpBox<'a, MaybeUninit<T>>;
}

unsafe impl<'a, T> RawBoxMaybeUninitImpl<T> for UpBox<'a, MaybeUninit<T>> {
    type Init = UpBox<'a, T>;
    
    fn initialize(self, value: T) -> UpBox<'a, T> {
        UpBox(self.0.write(value))
    }
}

unsafe impl<'a, T> Freeze for UpBox<'a, T> {
    type Frozen = &'a T;

    fn freeze(self) -> &'a T {
        self.0
    }
    
    unsafe fn unfreeze(frozen: &'a T) -> Self {
        Self(&mut *(frozen as *const T as *mut T))
    }
}

#[cfg(test)]
mod tests {
    #[test]
    fn test_rc() {
        use super::{UpAlloc, UpBox};
        use crate::rc::Rc;
        use std::mem::MaybeUninit;
        
        let mut buf: [_; 1024] = MaybeUninit::uninit_array();
        let mut alloc = UpAlloc(&mut buf);
        let (x, _alloc) = alloc.alloc_uninit().unwrap();
        
        Rc::<i32, UpBox<_>>::new_init_from_raw(x, 123);
    }
}

Playground: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=28dcb72cf6b64421b7302b1ed0183448

Wodann commented Aug 11, 2020

I've been waiting for some other people in the WG to respond to this issue, because I didn't fully understand the need for a solution to the described problem.

As no one has responded yet, I'll give my two cents. As far as I understand, the goal of the AllocRef API is to provide a low-level allocator API that operates on bytes. External crates are free to implement type-based APIs, but it is unnecessary for it to be part of std.
