Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: caching LookupSet implementation #654

Merged
merged 2 commits into from
Dec 9, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
- Add consts for near, yocto, and tgas. [PR 640](https://github.com/near/near-sdk-rs/pull/640).
- `near_sdk::ONE_NEAR`, `near_sdk::ONE_YOCTO`, `near_sdk::Gas::ONE_TERA`
- Update SDK dependencies for `nearcore` crates used for mocking (`0.10`) and `borsh` (`0.9`)
- store: Implement caching `LookupSet` type. This is the new iteration of the previous version of `near_sdk::collections::LookupSet` that has an updated API, and is located at `near_sdk::store::LookupSet`. [PR 654](https://github.com/near/near-sdk-rs/pull/654).

## `4.0.0-pre.4` [10-15-2021]
- Unpin `syn` dependency in macros from `=1.0.57` to be more composable with other crates. [PR 605](https://github.com/near/near-sdk-rs/pull/605)
Expand Down
1 change: 0 additions & 1 deletion near-sdk/src/store/lookup_map/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,6 @@ where
hasher: PhantomData<H>,
}

// #[derive(Default)]
itegulov marked this conversation as resolved.
Show resolved Hide resolved
struct EntryAndHash<V> {
value: OnceCell<CacheEntry<V>>,
hash: OnceCell<[u8; 32]>,
Expand Down
16 changes: 16 additions & 0 deletions near-sdk/src/store/lookup_set/impls.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
use super::LookupSet;
use crate::crypto_hash::CryptoHasher;
use borsh::BorshSerialize;

impl<T, H> Extend<T> for LookupSet<T, H>
where
T: BorshSerialize + Ord,
H: CryptoHasher<Digest = [u8; 32]>,
{
fn extend<I>(&mut self, iter: I)
where
I: IntoIterator<Item = T>,
{
self.map.extend(iter.into_iter().map(|k| (k, ())))
}
}
288 changes: 288 additions & 0 deletions near-sdk/src/store/lookup_set/mod.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,288 @@
mod impls;

use crate::crypto_hash::{CryptoHasher, Sha256};
use crate::store::LookupMap;
use crate::IntoStorageKey;
use borsh::{BorshDeserialize, BorshSerialize};
use std::borrow::Borrow;
use std::fmt;

#[derive(BorshSerialize, BorshDeserialize)]
pub struct LookupSet<T, H = Sha256>
where
T: BorshSerialize + Ord,
H: CryptoHasher<Digest = [u8; 32]>,
{
map: LookupMap<T, (), H>,
}

impl<T, H> Drop for LookupSet<T, H>
where
T: BorshSerialize + Ord,
H: CryptoHasher<Digest = [u8; 32]>,
{
fn drop(&mut self) {
self.flush()
}
}

impl<T, H> fmt::Debug for LookupSet<T, H>
where
T: BorshSerialize + Ord,
H: CryptoHasher<Digest = [u8; 32]>,
{
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.debug_struct("LookupSet").field("map", &self.map).finish()
}
}

impl<T> LookupSet<T, Sha256>
where
T: BorshSerialize + Ord,
{
#[inline]
pub fn new<S>(prefix: S) -> Self
where
S: IntoStorageKey,
{
Self::with_hasher(prefix)
}
}

impl<T, H> LookupSet<T, H>
where
T: BorshSerialize + Ord,
H: CryptoHasher<Digest = [u8; 32]>,
{
/// Initialize a [`LookupSet`] with a custom hash function.
///
/// # Example
/// ```
/// use near_sdk::crypto_hash::Keccak256;
/// use near_sdk::store::LookupSet;
///
/// let map = LookupSet::<String, Keccak256>::with_hasher(b"m");
/// ```
pub fn with_hasher<S>(prefix: S) -> Self
where
S: IntoStorageKey,
{
Self { map: LookupMap::with_hasher(prefix) }
}

/// Returns `true` if the set contains the specified value.
///
/// The value may be any borrowed form of the set's value type, but
/// [`BorshSerialize`], [`ToOwned<Owned = T>`](ToOwned) and [`Ord`] on the borrowed form *must*
/// match those for the value type.
pub fn contains<Q: ?Sized>(&self, value: &Q) -> bool
where
T: Borrow<Q>,
Q: BorshSerialize + ToOwned<Owned = T> + Ord,
{
self.map.contains_key(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting that one unintentional property that it will only cache the result if the map does not contain the key. The reason the map can't cache the fact that the key does exist, is because it does not actually read the value from storage.

}

/// Adds a value to the set.
///
/// If the set did not have this value present, true is returned.
///
/// If the set did have this value present, false is returned.
pub fn insert(&mut self, value: T) -> bool
where
T: Clone,
{
self.map.insert(value, ()).is_none()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will note that for things like this, it will incur a larger storage cost. If you look at https://github.com/near/nearcore/blob/d0087f6c5006c1bf581e2eb30ab430e89cb1bc74/nearcore/res/mainnet_genesis.json#L157 you can see storage_read_base (and per-key byte) is slightly larger than storage_has_key_base. When doing this remove, it is doing a read of the bytes rather than just checking if it exists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the link, I will use it as a reference in the future

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, maybe I am misunderstanding something here, but I am confused what do storage_read_base and storage_has_key_base have to do with LookupMap::insert. From what I see it just uses env::storage_write (that happens in flush()) and nothing else. I don't see a more efficient way of writing a key to the trie other than writing the key itself and 0 bytes as the value (from what I understand writing charges a fixed base cost + per byte rate for both key and value data).

}

/// Removes a value from the set. Returns whether the value was present in the set.
///
/// The value may be any borrowed form of the set's value type, but
/// [`BorshSerialize`], [`ToOwned<Owned = K>`](ToOwned) and [`Ord`] on the borrowed form *must*
/// match those for the value type.
pub fn remove<Q: ?Sized>(&mut self, value: &Q) -> bool
where
T: Borrow<Q>,
Q: BorshSerialize + ToOwned<Owned = T> + Ord,
{
self.map.remove(value).is_some()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as before

}
}

impl<T, H> LookupSet<T, H>
where
T: BorshSerialize + Ord,
H: CryptoHasher<Digest = [u8; 32]>,
{
/// Flushes the intermediate values of the set before this is called when the structure is
/// [`Drop`]ed. This will write all modified values to storage but keep all cached values
/// in memory.
pub fn flush(&mut self) {
self.map.flush()
}
}

#[cfg(not(target_arch = "wasm32"))]
#[cfg(test)]
mod tests {
use super::LookupSet;
use crate::crypto_hash::{Keccak256, Sha256};
use crate::test_utils::test_env::setup_free;
use arbitrary::{Arbitrary, Unstructured};
use rand::seq::SliceRandom;
use rand::RngCore;
use rand::{Rng, SeedableRng};
use std::collections::HashSet;

#[test]
fn test_insert_contains() {
let mut set = LookupSet::new(b"m");
let mut rng = rand_xorshift::XorShiftRng::seed_from_u64(0);
let mut baseline = HashSet::new();
for _ in 0..100 {
let value = rng.gen::<u64>();
set.insert(value);
baseline.insert(value);
}
// Non existing
for _ in 0..100 {
let value = rng.gen::<u64>();
assert_eq!(set.contains(&value), baseline.contains(&value));
}
// Existing
for value in baseline.iter() {
assert!(set.contains(value));
}
}

#[test]
fn test_insert_remove() {
let mut set = LookupSet::new(b"m");
let mut rng = rand_xorshift::XorShiftRng::seed_from_u64(1);
let mut values = vec![];
for _ in 0..100 {
let value = rng.gen::<u64>();
values.push(value);
set.insert(value);
}
values.shuffle(&mut rng);
for value in values {
assert!(set.remove(&value));
}
}

#[test]
fn test_remove_last_reinsert() {
let mut set = LookupSet::new(b"m");
let value1 = 2u64;
set.insert(value1);
let value2 = 4u64;
set.insert(value2);

assert!(set.remove(&value2));
assert!(set.insert(value2));
}

#[test]
fn test_extend() {
let mut set = LookupSet::new(b"m");
let mut rng = rand_xorshift::XorShiftRng::seed_from_u64(4);
let mut values = vec![];
for _ in 0..100 {
let value = rng.gen::<u64>();
values.push(value);
set.insert(value);
}
for _ in 0..10 {
let mut tmp = vec![];
for _ in 0..=(rng.gen::<u64>() % 20 + 1) {
let value = rng.gen::<u64>();
tmp.push(value);
}
values.extend(tmp.iter().cloned());
set.extend(tmp.iter().cloned());
}

for value in values {
assert!(set.contains(&value));
}
}

#[test]
fn test_debug() {
let set = LookupSet::<u8, Sha256>::new(b"m");

assert_eq!(format!("{:?}", set), "LookupSet { map: LookupMap { prefix: [109] } }")
}

#[test]
fn test_flush_on_drop() {
let mut set = LookupSet::<_, Keccak256>::with_hasher(b"m");

// Set a value, which does not write to storage yet
set.insert(5u8);
assert!(set.contains(&5u8));

// Drop the set which should flush all data
drop(set);

// Create a duplicate set which references same data
let dup_set = LookupSet::<u8, Keccak256>::with_hasher(b"m");

// New map can now load the value
assert!(dup_set.contains(&5u8));
}

#[derive(Arbitrary, Debug)]
enum Op {
Insert(u8),
Remove(u8),
Flush,
Restore,
Contains(u8),
}

#[test]
fn test_arbitrary() {
setup_free();

let mut rng = rand_xorshift::XorShiftRng::seed_from_u64(0);
let mut buf = vec![0; 4096];
for _ in 0..512 {
// Clear storage in-between runs
crate::mock::with_mocked_blockchain(|b| b.take_storage());
rng.fill_bytes(&mut buf);

let mut ls = LookupSet::new(b"l");
let mut hs = HashSet::new();
let u = Unstructured::new(&buf);
if let Ok(ops) = Vec::<Op>::arbitrary_take_rest(u) {
for op in ops {
match op {
Op::Insert(v) => {
let r1 = ls.insert(v);
let r2 = hs.insert(v);
assert_eq!(r1, r2)
}
Op::Remove(v) => {
let r1 = ls.remove(&v);
let r2 = hs.remove(&v);
assert_eq!(r1, r2)
}
Op::Flush => {
ls.flush();
}
Op::Restore => {
ls = LookupSet::new(b"l");
}
Op::Contains(v) => {
let r1 = ls.contains(&v);
let r2 = hs.contains(&v);
assert_eq!(r1, r2)
}
}
}
}
}
}
}
3 changes: 3 additions & 0 deletions near-sdk/src/store/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,9 @@ pub use vec::Vector;
pub mod lookup_map;
pub use self::lookup_map::LookupMap;

mod lookup_set;
pub use self::lookup_set::LookupSet;

pub mod unordered_map;
pub use self::unordered_map::UnorderedMap;

Expand Down