Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(iterators): allow key-only iteration #2092

Merged
merged 8 commits into from
Aug 19, 2024
Merged

Conversation

rymnc
Copy link
Member

@rymnc rymnc commented Aug 16, 2024

Linked Issues/PRs

Description

Implements iter_store_keys and iter_keys appropriately for in memory stores and rocksdb. The previous PR made them call it, and this one changes the underlying implementation to avoid value allocations.

Checklist

  • Breaking changes are clearly marked as such in the PR description and changelog (no breaking changes)
  • New behavior is reflected in tests (tests which test this behaviour under the hood still pass)
  • The specification matches the implemented behavior (no changes to spec)

Before requesting review

  • I have reviewed the code myself
  • I have created follow-up issues caused by this PR and linked them here

After merging, notify other teams

[Add or remove entries as needed]

@rymnc rymnc force-pushed the feat/iter-keys-trait-method branch 2 times, most recently from 9117e66 to f0c9c37 Compare August 16, 2024 07:34
@@ -706,6 +785,89 @@ where
}
}
}

fn iter_store_keys(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taken mostly as-is from the iter_store fn defn, but only uses keys and uses _iter_all_keys under the hood.

I have an implementation which abstracts the functionality of using key prefixes and start index, but it seems verbose and might not provide much value.

Comment on lines 13 to 17
pub enum OwnedIteratorMode {
Start,
End,
From(Vec<u8>, rocksdb::Direction),
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Favouring "owning" the IteratorMode because our keys are relatively small, and to avoid all the lifetime annotations in places which use this to cut cognitive load.

@rymnc rymnc force-pushed the feat/iter-keys-trait-method branch from f0c9c37 to baec764 Compare August 16, 2024 08:00
@rymnc rymnc self-assigned this Aug 16, 2024
Comment on lines 170 to 175
// We cannot define iter_store_keys appropriately for the `ChangesIterator`,
// because we have to filter out the keys that were removed, which are
// marked as `WriteOperation::Remove` in the value
self.iter_store(column, prefix, start, direction)
.map(|item| item.map(|(key, _)| key))
.into_boxed()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could copy/paste everything from the function above and just remove value.clone()=)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, addressed in 06bd0e9
thanks

start: Option<&[u8]>,
direction: IterDirection,
) -> BoxedIter<fuel_core_storage::kv_store::KeyItem> {
self.iter_all(column, prefix, start, direction)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] We could avoid value.clone() if we implemented that in some another way=) Maybe we could use generics to minimize the code duplciation

Copy link
Member Author

@rymnc rymnc Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, maybe we should change the fn signature of iter_all_keys in IteratorOverTable to include the column, prefix, start key. it is currently -

    fn iter_all_keys<M>(
        &self,
        direction: Option<IterDirection>,
    ) -> BoxedIter<super::Result<M::OwnedKey>>
    where
        M: Mappable,
        Self: IterableTable<M>,
    {
        self.iter_all_filtered_keys::<M, [u8; 0]>(None, None, direction)
    }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nevermind, fixed in 1665b86

thanks!

@@ -499,6 +505,55 @@ where
}
}

fn reverse_prefix_iter_keys(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reverse_prefix_iter function is pretty complex on its own, and having two complex functions will be hard to support=D Maybe we could use the generic, which will define what type to return and defines the closure to get that type?

Copy link
Member Author

@rymnc rymnc Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

totally agree, however, since the types we want to implement for, namely KVItem and KeyItem are a tuple and a distinct type, writing a generic for this becomes more involved. We will have to define a trait for both KVItem and KeyItem which provide the following helper fn -

trait StorageItem {
  type Key = Vec<u8>;

  fn key() -> Self::Key;
  fn key_starts_with(prefix: Self::Key) -> bool; 
}

which makes it easier to write the function, in the following way -

fn reverse_prefix_iter_generic<T, F>(
        &self,
        prefix: &[u8],
        column: Description::Column,
        iter_store_fn: F,
        iter_all_fn: impl Fn(
            &Self,
            Description::Column,
            ReadOptions,
            OwnedIteratorMode,
        ) -> BoxedIter<StorageResult<T>>,
    ) -> BoxedIter<StorageResult<T>>
    where
        T: 'static + StorageItem,
        F: Fn(
            &Self,
            Description::Column,
            Option<&[u8]>,
            Option<&[u8]>,
            IterDirection,
        ) -> BoxedIter<'static, StorageResult<T>>,
    {
        let maybe_next_item = next_prefix(prefix.to_vec())
            .and_then(|next_prefix| {
                iter_store_fn(
                    self,
                    column,
                    Some(next_prefix.as_slice()),
                    None,
                    IterDirection::Forward,
                )
                .next()
            })
            .and_then(|res| res.ok());

        let prefix_vec = prefix.to_vec(); 
        
        match maybe_next_item {
            Some(key_or_item) => {   
                let next_start_key_vec = key_or_item.key().to_vec();

                let iter_mode = OwnedIteratorMode::From(
                    next_start_key_vec.to_vec(),
                    rocksdb::Direction::Reverse,
                );

                iter_all_fn(self, column, self.read_options(), iter_mode)
                    .skip(1)
                    .take_while(move |item| match item {
                        Ok(key_or_item) => key_or_item.key_starts_with(&prefix_vec), 
                        Err(_) => true,
                    })
                    .into_boxed()
            }
            None => {
                iter_all_fn(self, column, self.read_options(), OwnedIteratorMode::End)
                    .take_while(move |item| match item {
                        Ok(key_or_item) => key_or_item.key_starts_with(&prefix_vec), 
                        Err(_) => true,
                    })
                    .into_boxed()
            }
        }
    }

the important line of code in the above fn being let next_start_key_vec = key_or_item.key().to_vec();

to avoid destructuring the tuple since the generic T can either be a KVItem (a tuple) or a KeyItem (Vec)

perhaps this can be addressed in a follow up PR to avoid a large diff :)

what are your thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xgreenx check out the diff in e449766, I added it there and if we want to include as a follow up pr, i can create it

if self.done {
None
} else if let Some(key) = self.raw.key() {
let key = key.to_vec();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious. Is it possible to return a slice instead of the vector? In most cases we decode key and don't use the vector itself

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not without adding additional lifetime constraints :/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, we can try to play with it in follow up PR=)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rymnc rymnc marked this pull request as ready for review August 17, 2024 09:38
@rymnc rymnc requested a review from a team August 17, 2024 13:10
@rymnc rymnc force-pushed the feat/iter-keys-trait-method branch from 1665b86 to fd1ef3c Compare August 17, 2024 13:15
@rymnc rymnc force-pushed the feat/iter-keys-trait-method branch from fd1ef3c to fcf025a Compare August 19, 2024 08:07
@Dentosal Dentosal requested a review from xgreenx August 19, 2024 09:35
CHANGELOG.md Outdated
@@ -31,6 +31,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/).

### Added
- [2061](https://github.com/FuelLabs/fuel-core/pull/2061): Allow querying filled transaction body from the status.
- [2092](https://github.com/FuelLabs/fuel-core/pull/2092): Allow iterating by keys in rocksdb, and other storages.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After merge with master, this is in the wrong section

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in ad29d89

@rymnc rymnc force-pushed the feat/iter-keys-trait-method branch from 1c8f306 to ad29d89 Compare August 19, 2024 09:41
@rymnc rymnc requested a review from Dentosal August 19, 2024 12:14
})
.and_then(|res| res.ok());

let prefix_vec = prefix.to_vec(); // Convert `prefix` to an owned `Vec<u8>`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably doesn't need to be to_vec you can just use the slice below.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it will be painful trying to get the lifetimes to align here. It's probable that we can just use a constant sized array in the future, rather than a vec, in which case this shouldn't be a problem.

Not really a problem now either, but worth optimizing for sure.

xgreenx
xgreenx previously approved these changes Aug 19, 2024
Dentosal
Dentosal previously approved these changes Aug 19, 2024
@rymnc rymnc dismissed stale reviews from Dentosal and xgreenx via 3a9fedc August 19, 2024 15:26
@rymnc rymnc enabled auto-merge (squash) August 19, 2024 15:27
@rymnc rymnc merged commit b1f2bf4 into master Aug 19, 2024
26 of 27 checks passed
@rymnc rymnc deleted the feat/iter-keys-trait-method branch August 19, 2024 16:28
@xgreenx xgreenx mentioned this pull request Aug 20, 2024
xgreenx added a commit that referenced this pull request Aug 20, 2024
## Version v0.34.0

### Added
- [2051](#2051): Add support
for AWS KMS signing for the PoA consensus module. The new key can be
specified with `--consensus-aws-kms AWS_KEY_ARN`.
- [2092](#2092): Allow
iterating by keys in rocksdb, and other storages.
- [2096](#2096): GraphQL
endpoint to fetch blob byte code by its blob ID.

### Changed
- [2106](#2106): Remove
deadline clock in POA and replace with tokio time functions.

#### Breaking
- [2051](#2051): Misdocumented
`CONSENSUS_KEY` environ variable has been removed, use
`CONSENSUS_KEY_SECRET` instead. Also raises MSRV to `1.79.0`.

### Fixed

- [2106](#2106): Handle the
case when nodes with overriding start on the fresh network.
- [2105](#2105): Fixed the
rollback functionality to work with empty gas price database.

## What's Changed
* doc: refine the transaction example in the README by @mmyyrroonn in
#2072
* AWS KMS block signing support and Rust 1.79 by @Dentosal in
#2051
* feat(iterators): allow key-only iteration by @rymnc in
#2092
* feat: graphql endpoint to fetch the blob byte code by its blob ID by
@netrome in #2096
* Small improvements for tests to make them more stable by @xgreenx in
#2103
* Fixed the rollback functionality to work with empty gas price database
by @xgreenx in #2105
* Bump wasmtime version by @Dentosal in
#2089
* Handle the case when nodes with overriding start on the fresh network
by @xgreenx in #2106
* Remove deadline clock in POA and replace with tokio time functions. by
@AurelienFT in #2109

## New Contributors
* @mmyyrroonn made their first contribution in
#2072

**Full Changelog**:
v0.33.0...v0.34.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants