More docs for ZeroMap and add Vec<u8> as supported type #1057

Merged
merged 2 commits into from
Sep 22, 2021
14 changes: 8 additions & 6 deletions utils/zerovec/README.md
@@ -5,14 +5,16 @@ Zero-copy vector abstractions over byte arrays.
`zerovec` enables vectors of multibyte types to be backed by a byte array, abstracting away
issues including memory alignment and endianness.

This crate has two main types:
This crate has three main types:

- `ZeroVec<T>` for fixed-width types like `u32`
- `VarZeroVec<T>` for variable-width types like `str`
- [`ZeroVec<T>`](ZeroVec) for fixed-width types like `u32`
- [`VarZeroVec<T>`](VarZeroVec) for variable-width types like `str`
- [`ZeroMap<K, V>`](ZeroMap) to map from `K` to `V`

Both are intended as drop-in replacements for `Vec<T>` in Serde structs serialized with a
format supporting a borrowed byte buffer, like Bincode. Clients upgrading from Vec to ZeroVec
or VarZeroVec benefit from zero heap allocations when deserializing read-only data.
The first two are intended as drop-in replacements for `Vec<T>` in Serde structs serialized
with a format supporting a borrowed byte buffer, like Bincode. The third is intended as a
replacement for `HashMap` or `LiteMap`. Clients upgrading to `ZeroVec`, `VarZeroVec`, or
`ZeroMap` benefit from zero heap allocations when deserializing read-only data.

This crate has two optional features: `serde` and `yoke`. `serde` allows serializing and deserializing
`zerovec`'s abstractions via [`serde`](https://docs.rs/serde), and `yoke` enables implementations of `Yokeable`
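The README's claim that a byte array can back a vector of multibyte values, independent of alignment and endianness, can be illustrated with a minimal sketch. It uses only the standard library and is not `zerovec`'s implementation; the helper name `read_u32_le` and the choice of little-endian encoding are assumptions made for this example.

```rust
/// Hypothetical helper (not part of `zerovec`): reads the `index`-th `u32`
/// out of a byte buffer that stores values as consecutive 4-byte
/// little-endian chunks.
///
/// The bytes are copied into a stack array one element at a time, so the
/// buffer needs no particular alignment and nothing is allocated on the heap.
pub fn read_u32_le(bytes: &[u8], index: usize) -> Option<u32> {
    // Compute the byte range of the requested element, guarding overflow.
    let start = index.checked_mul(4)?;
    let end = start.checked_add(4)?;
    // `get` returns None if the range runs past the end of the buffer.
    let chunk = bytes.get(start..end)?;
    // Decode explicitly as little-endian, regardless of the host's endianness.
    Some(u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
}
```

Under these assumptions, `read_u32_le(&[1, 0, 0, 0, 0, 1, 0, 0], 1)` decodes the second element from the buffer in place, and an out-of-range index yields `None` rather than panicking.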
14 changes: 8 additions & 6 deletions utils/zerovec/src/lib.rs
@@ -7,14 +7,16 @@
//! `zerovec` enables vectors of multibyte types to be backed by a byte array, abstracting away
//! issues including memory alignment and endianness.
//!
//! This crate has two main types:
//! This crate has three main types:
//!
//! - `ZeroVec<T>` for fixed-width types like `u32`
//! - `VarZeroVec<T>` for variable-width types like `str`
//! - [`ZeroVec<T>`](ZeroVec) for fixed-width types like `u32`
//! - [`VarZeroVec<T>`](VarZeroVec) for variable-width types like `str`
//! - [`ZeroMap<K, V>`](ZeroMap) to map from `K` to `V`
//!
//! Both are intended as drop-in replacements for `Vec<T>` in Serde structs serialized with a
//! format supporting a borrowed byte buffer, like Bincode. Clients upgrading from Vec to ZeroVec
//! or VarZeroVec benefit from zero heap allocations when deserializing read-only data.
//! The first two are intended as drop-in replacements for `Vec<T>` in Serde structs serialized
//! with a format supporting a borrowed byte buffer, like Bincode. The third is intended as a
//! replacement for `HashMap` or `LiteMap`. Clients upgrading to `ZeroVec`, `VarZeroVec`, or
//! `ZeroMap` benefit from zero heap allocations when deserializing read-only data.
//!
//! This crate has two optional features: `serde` and `yoke`. `serde` allows serializing and deserializing
//! `zerovec`'s abstractions via [`serde`](https://docs.rs/serde), and `yoke` enables implementations of `Yokeable`
17 changes: 17 additions & 0 deletions utils/zerovec/src/map/kv.rs
@@ -89,3 +89,20 @@ impl<'a> ZeroMapKV<'a> for String {
        f(g)
    }
}

impl<'a> ZeroMapKV<'a> for Vec<u8> {
Member

thought: this can be on Vec<T: ULE>, yes?

Member

(also ZeroVec<'static, T: AsULE>)

Member Author

Since ZV is in flux, I'm going to merge this, with an action item to track this comment: #676 (comment)

    type Container = VarZeroVec<'a, Vec<u8>>;
    type NeedleType = [u8];
    type GetType = [u8];
    type SerializeType = [u8];
    fn as_needle(&self) -> &[u8] {
        self
    }
    fn cmp_get(&self, g: &[u8]) -> Ordering {
        (&**self).cmp(g)
    }

    fn with_ser<R>(g: &[u8], f: impl FnOnce(&[u8]) -> R) -> R {
        f(g)
    }
}
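The impl above compares an owned `Vec<u8>` key against a borrowed `[u8]`, so a lookup never has to allocate a new key. A minimal standard-library sketch of that idea follows. The free functions `cmp_get` and `find_key` are hypothetical stand-ins written for this note, not the crate's trait methods, and a plain `&[Vec<u8>]` stands in for the `VarZeroVec` container.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the trait's `cmp_get`: orders an owned key
/// against a borrowed byte slice using lexicographic byte comparison.
pub fn cmp_get(key: &Vec<u8>, g: &[u8]) -> Ordering {
    key.as_slice().cmp(g)
}

/// Hypothetical lookup: finds the position of `needle` in a list of keys
/// that is already sorted in ascending order. The needle is only borrowed,
/// so the search performs no allocation.
pub fn find_key(sorted_keys: &[Vec<u8>], needle: &[u8]) -> Option<usize> {
    sorted_keys
        .binary_search_by(|probe| cmp_get(probe, needle))
        .ok()
}
```

With keys `[1]`, `[1, 2]`, `[2]` (already in lexicographic order), searching for the borrowed slice `&[1, 2]` locates index 1 without constructing a `Vec<u8>`.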
27 changes: 25 additions & 2 deletions utils/zerovec/src/map/mod.rs
@@ -17,11 +17,34 @@ pub use kv::ZeroMapKV;
pub use vecs::ZeroVecLike;

/// A zero-copy map data structure, built on sorted binary-searchable [`ZeroVec`]
/// and [`VarZeroVec`](crate::VarZeroVec).
/// and [`VarZeroVec`].
///
/// This type, like [`ZeroVec`] and [`VarZeroVec`](crate::VarZeroVec), is able to zero-copy
/// This type, like [`ZeroVec`] and [`VarZeroVec`], is able to zero-copy
/// deserialize from appropriately formatted byte buffers. It is internally copy-on-write, so it can be mutated
/// afterwards as necessary.
///
/// Internally, a `ZeroMap` is a zero-copy vector for keys paired with a zero-copy vector for
/// values, sorted by the keys. Therefore, all types used in `ZeroMap` need to work with either
/// [`ZeroVec`] or [`VarZeroVec`].
///
/// # Examples
///
/// ```
/// use zerovec::ZeroMap;
///
/// // Example byte buffer representing the map { 1: "one" }
/// let BINCODE_BYTES: &[u8; 31] = &[
///     4, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 11, 0, 0, 0, 0, 0, 0, 0,
///     1, 0, 0, 0, 0, 0, 0, 0, 111, 110, 101
/// ];
///
/// // Deserializing to ZeroMap requires no heap allocations.
/// let zero_map: ZeroMap<u32, String> = bincode::deserialize(BINCODE_BYTES)
///     .expect("Should deserialize successfully");
/// assert_eq!(zero_map.get(&1), Some("one"));
/// ```
///
/// [`VarZeroVec`]: crate::VarZeroVec
pub struct ZeroMap<'a, K, V>
where
    K: ZeroMapKV<'a>,
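The doc comment describes `ZeroMap` as a key vector paired with a value vector, sorted by key. The sketch below shows that layout with borrowed standard-library slices in place of `ZeroVec` and `VarZeroVec`. The type `SortedPairMap` and its methods are hypothetical names introduced for this note and do not reflect the crate's API.

```rust
/// Hypothetical illustration of a map stored as two parallel borrowed
/// slices: `keys[i]` maps to `values[i]`, and `keys` is strictly ascending.
pub struct SortedPairMap<'a> {
    keys: &'a [u32],
    values: &'a [&'a str],
}

impl<'a> SortedPairMap<'a> {
    /// Wraps existing slices without copying them. Returns `None` if the
    /// lengths differ or the keys are not strictly ascending, since lookups
    /// rely on binary search.
    pub fn new(keys: &'a [u32], values: &'a [&'a str]) -> Option<Self> {
        let strictly_ascending = keys.windows(2).all(|pair| pair[0] < pair[1]);
        if strictly_ascending && keys.len() == values.len() {
            Some(Self { keys, values })
        } else {
            None
        }
    }

    /// Binary-searches the key slice, then reads the value at the same index.
    pub fn get(&self, key: u32) -> Option<&'a str> {
        let index = self.keys.binary_search(&key).ok()?;
        Some(self.values[index])
    }
}
```

Because the structure only borrows its two slices, constructing it allocates nothing, which is the property the doc comment attributes to deserializing a `ZeroMap`.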
36 changes: 33 additions & 3 deletions utils/zerovec/src/ule/plain.rs
@@ -78,20 +78,50 @@ macro_rules! impl_byte_slice_type {
    };
}

impl_byte_slice_size!(1);
impl_byte_slice_size!(2);
impl_byte_slice_size!(4);
impl_byte_slice_size!(8);
impl_byte_slice_size!(16);

impl_byte_slice_type!(u8, 1);
impl_byte_slice_type!(u16, 2);
impl_byte_slice_type!(u32, 4);
impl_byte_slice_type!(u64, 8);
impl_byte_slice_type!(u128, 16);

impl_byte_slice_type!(i8, 1);
impl_byte_slice_type!(i16, 2);
impl_byte_slice_type!(i32, 4);
impl_byte_slice_type!(i64, 8);
impl_byte_slice_type!(i128, 16);

// This is safe to implement because from_byte_slice_unchecked returns
// the same value as parse_byte_slice
unsafe impl ULE for u8 {
    type Error = std::convert::Infallible;
    #[inline]
    fn parse_byte_slice(bytes: &[u8]) -> Result<&[Self], Self::Error> {
        Ok(bytes)
    }
    #[inline]
    unsafe fn from_byte_slice_unchecked(bytes: &[u8]) -> &[Self] {
        bytes
    }
    #[inline]
    fn as_byte_slice(slice: &[Self]) -> &[u8] {
        slice
    }
}

impl AsULE for u8 {
    type ULE = Self;
    #[inline]
    fn as_unaligned(&self) -> Self::ULE {
        *self
    }
    #[inline]
    fn from_unaligned(unaligned: &Self::ULE) -> Self {
        *unaligned
    }
}

// EqULE is true because u8 is its own ULE.
unsafe impl EqULE for u8 {}