update seek to seek_prefix #11
Conversation
Add Temperature enum and NodeStorage struct which introduce a layer between the node’s storage and the Store interface. As it stands, this really just complicates the code; however, the eventual goal is to have hot and cold storage, with NodeStorage being the interface for accessing them. The Store struct will offer access to either of those databases. The Temperature enum is a ZST, so in its current form it should be optimised out at compile time.
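Roughly the shape this takes; a minimal sketch, with the real `Database` trait and `Store` struct replaced by simplified stand-ins:

```rust
use std::sync::Arc;

/// Simplified stand-ins for the real trait and struct.
pub trait Database: Send + Sync {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
}
pub struct Store {
    db: Arc<dyn Database>,
}

/// Which storage the data lives in. With a single variant this is
/// a zero-sized type, so it should be optimised out at compile time.
#[derive(Clone, Copy, Debug)]
pub enum Temperature {
    Hot,
    // Cold will be added once cold storage exists.
}

/// Owns the node’s database(s): the layer between raw storage and `Store`.
pub struct NodeStorage {
    hot: Arc<dyn Database>,
    // cold: Arc<dyn Database>, // future cold storage
}

impl NodeStorage {
    /// Returns a `Store` backed by the database of the given temperature.
    pub fn get_store(&self, temp: Temperature) -> Store {
        match temp {
            Temperature::Hot => Store { db: self.hot.clone() },
        }
    }
}
```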
Contract cache now lives in contract-runtime; this code is just an elaborate indirection, so we can get rid of it.
Introduce a ColdDatabase struct which wraps a real Database interface and transparently translates reads and writes, as expected by users of the Database interface, into reads and writes as they will happen in cold storage. This is needed because we want to optimise cold storage slightly (e.g. by simplifying keys in the State column and removing reference counts).
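A sketch of the kind of translation involved. The exact encoding is an assumption here (nearcore appends an i64 reference count to values in reference-counted columns), so treat the helpers as illustrative:

```rust
/// Writing: cold storage never deletes data, so the trailing reference
/// count kept in hot storage can simply be dropped.
fn cold_write_value(value: &[u8]) -> &[u8] {
    &value[..value.len() - 8]
}

/// Reading: callers of the `Database` interface expect a reference count
/// to be present, so re-attach one; any positive count works since the
/// cold data is immutable.
fn cold_read_value(mut value: Vec<u8>) -> Vec<u8> {
    value.extend_from_slice(&1i64.to_le_bytes());
    value
}
```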
Standard RocksDB `flush()` only flushes the default column family. See https://github.com/facebook/rocksdb/blob/95ef007adc9365fbefc0f957722a191c1fd7dcd3/include/rocksdb/db.h#L1398-L1400 To flush all column families as intended, iterate over them and flush each individually. This change only affects the runtime parameter estimator. In the production client, we never force a flush. (As of today.)
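With the rust-rocksdb bindings this looks roughly like the following sketch (the column names are whatever the database was opened with):

```rust
use rocksdb::DB;

/// Flush every column family, not just the default one that `DB::flush`
/// would cover.
fn flush_all(db: &DB, column_names: &[&str]) -> Result<(), rocksdb::Error> {
    for name in column_names {
        if let Some(cf) = db.cf_handle(name) {
            db.flush_cf(cf)?;
        }
    }
    Ok(())
}
```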
Firstly, the 1.29 release cycle has started, so start fuzzing 1.29.0. Secondly, 1.28.1 has been released, so switch to fuzzing that rather than 1.28.0. And lastly, with all that, don’t waste time fuzzing 1.27 any longer.
Just minor cleanups:
* remove needless `&mut`
* flatten the entry point a bit
* collapse two unwraps into one
Add a flat_state::maybe_new function which creates a FlatState object when requested and the feature is enabled. This removes a bunch of annotations and noise from shard_tries.rs. Furthermore, split the definition of FlatState into one used when the feature is enabled and another used when it’s disabled. This removes another bunch of annotations, this time at the definition site. By making the latter impossible to instantiate (using an enum with no variants), this encodes in the type system that flat state won’t be created if the feature is not enabled and that its methods won’t be called. This also makes `Option<FlatState>` a ZST.
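A sketch of the pattern (the feature name is abbreviated here and the real FlatState has fields and methods):

```rust
/// With the feature enabled: the real type (fields elided here).
#[cfg(feature = "flat_state")]
pub struct FlatState;

/// With the feature disabled: an enum with no variants cannot be
/// instantiated, so the type system guarantees no FlatState is ever
/// created, and `Option<FlatState>` is a ZST.
#[cfg(not(feature = "flat_state"))]
pub enum FlatState {}

/// Creates a FlatState only when requested and compiled in, keeping
/// `#[cfg]` annotations out of call sites such as shard_tries.rs.
#[allow(unused_variables)]
pub fn maybe_new(use_flat_state: bool) -> Option<FlatState> {
    #[cfg(feature = "flat_state")]
    if use_flat_state {
        return Some(FlatState);
    }
    None
}
```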
Since PeerManager uses the database in a way noticeably different from the rest of the code (it stores mutable data there rather than appending new information to a growing chain), PeerStore bypasses the Store abstraction. Despite that, when creating a new PeerManagerActor, a Store instance needs to be passed as a constructor argument, and the Store abstraction is then peeled away to get the inner Database object. Change the code so that the Database is passed directly and peeling off the abstraction happens at the call site. This will also simplify future code which adds the NodeStorage and Temperature abstractions.
This PR introduces two changes: 1. Complete the logic that makes nodes panic if their protocol version is old. Previously, the check was only triggered at epoch boundaries, so if validators restarted their nodes, the nodes would keep running. 2. Introduce a new protocol feature that forbids blocks produced with old protocol versions. This can help kick out validators that have not upgraded.
This PR simplifies the code around the simple nightshade shard layout upgrade. Instead of adding the hacky field `simple_nightshade_shard_layout` in genesis, the logic is now moved inside AllEpochConfig.
…cs (near#7523)
* network: introduce near_peer_message_sent_by_type_{bytes,total} metrics
* fmt
Currently, when data is read from RocksDB it’s converted from RocksDB’s internal type (specifically DBPinnableSlice) into a vector. The problem is that we often end up discarding this vector as soon as it’s created. For example, when deserialising objects using the get_ser method an owned vector is unnecessary, and when accessing DBCol::State the data is converted into an Arc, which forces another allocation and memory copy. Introduce a DBBytes type which is what the Database::get_raw_bytes and Database::get_with_rc_stripped methods return. For RocksDB the object holds a RocksDB-specific DBPinnableSlice which the caller can then treat as a slice or convert to a vector or an arc as desired. Furthermore, change Store::get to return data in this form and, for convenience, introduce Store::get_vec which does the conversion to a vector.
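A simplified sketch of the shape of such a type; the real one wraps rocksdb’s DBPinnableSlice, for which a plain borrowed slice stands in here:

```rust
use std::sync::Arc;

/// Owned-or-pinned bytes returned from the database. Callers can borrow
/// the data as a slice for deserialisation and pay for an allocation
/// only when they actually need a `Vec` or an `Arc`.
pub enum DBBytes<'a> {
    /// Data still pinned inside RocksDB’s block cache (no copy yet).
    Pinned(&'a [u8]),
    Owned(Vec<u8>),
}

impl<'a> DBBytes<'a> {
    pub fn as_slice(&self) -> &[u8] {
        match self {
            DBBytes::Pinned(s) => s,
            DBBytes::Owned(v) => v,
        }
    }

    /// A copy happens only here, and only if the caller asks for it.
    pub fn into_vec(self) -> Vec<u8> {
        match self {
            DBBytes::Pinned(s) => s.to_vec(),
            DBBytes::Owned(v) => v,
        }
    }

    pub fn into_arc(self) -> Arc<[u8]> {
        match self {
            DBBytes::Pinned(s) => Arc::from(s),
            DBBytes::Owned(v) => v.into(),
        }
    }
}
```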
* direct routed message
* fixed tests
* compilation fix
* applied comments
* removing unnecessary clones
* compilation fix

Co-authored-by: near-bulldozer[bot] <73298989+near-bulldozer[bot]@users.noreply.github.com>
The test broke after recent changes to tracked shards configuration options. Fix it.
There was a compilation determinism issue in our instrumentation, due to iterating over a HashMap when generating the thunks. With this, I add:
- A fuzzer that tries to automatically detect instances of non-determinism
- An output from this fuzzer that triggers said non-determinism, auto-generated so that the regression would be caught when running `cargo test`
- The fix for the compilation determinism issue per se

I've been running the fuzzer locally for a bit to check there's no other obvious instance of compilation non-determinism, and coverage has already doubled since the fuzzer found the crash I checked in, so hopefully other instances of compilation non-determinism will be hard to find. cc @akhi3030 @matklad
because the test relies on the string in a log. Passing nayduck run: https://nayduck.near.org/#/test/358900
This change will need to be reverted before mainnet protocol version reaches 100.
Tested with localnet
Set the cost of IO read and write bytes in qemu-based estimations to 0. The only real purpose of having qemu-based measurements today is having a stable output that can be used to detect performance regressions. Since near#7484 this no longer works because read IO has proven to be unstable. Setting IO costs to 0 should stabilise gas costs at least partially. Also set the default estimation to `time` to better reflect how we use it now.
This avoids an `expect` present in the second invocation.
See near#7287 for similar work.
See near#7287 for similar work. This should be the final PR based on splitting out tests by `ProtocolFeature`. I suspect that we could split some more tests out of `process_blocks.rs` as well, but we will have to figure out how to organise them.
This is part of the preparation to support TIER1 peers. I've moved as much logic as possible inside the constructor, so that the number of arguments is minimal. I've also added a PeerActor::Config which is passed to the constructor, because PeerActor::new cannot return an error (due to Actix API constraints). Also moved accepting new connections from PeerManagerActor to NetworkState.
Allow selecting only one metric instead of always running both in `estimator-warehouse estimate`. This is necessary because `icount` estimations have become very slow.
```diff
@@ -224,7 +224,7 @@ impl<'a> TrieUpdateIterator<'a> {
             }
             None => None,
         };
-        trie_iter.seek(&start_offset)?;
+        trie_iter.seek_prefix(&start_offset)?;
```
The `!key.starts_with(prefix)` check in `stop_cond` on line 264 is no longer necessary.
```diff
@@ -948,7 +948,7 @@ mod tests {
         let root = test_populate_trie(&tries, &Trie::EMPTY_ROOT, ShardUId::single_shard(), changes);
         let trie = tries.get_trie_for_shard(ShardUId::single_shard(), root.clone());
         let mut iter = trie.iter().unwrap();
-        iter.seek(&vec![0, 116, 101, 115, 116, 44]).unwrap();
+        iter.seek_prefix(&vec![0, 116, 101, 115, 116, 44]).unwrap();
```
The test should now pass if we change `pairs[..2]` to just `pairs` on line 957.
It fails
Will need to create a new element.
Added in 591a2e4.
No, that’s wrong. It should pass without that element; it doesn’t because the seek_prefix implementation isn’t correct. Similarly, the `assert!(key.starts_with(&query), …)` mentioned in another comment fails the test because the current implementation is incorrect.
When profiling memory usage, I’ve noticed that a non-trivial amount of space is occupied by net IO buffers. So, let’s add metrics to track their size! We use SockAddr rather than `PeerId` as a label here, as we don’t know peer ids for outgoing connections. I don’t worry much about the perf hit: we more or less do a syscall at each point where we bump the metric, and that should dominate the cost. The implementation is a bit hacky, but the “proper” fix would probably require unforking the FramedRead/Write infrastructure.
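A sketch of such a metric with the prometheus crate; the metric name, label names and function below are illustrative, not the ones the PR adds:

```rust
use once_cell::sync::Lazy;
use prometheus::{register_int_gauge_vec, IntGaugeVec};

static PEER_BUFFER_BYTES: Lazy<IntGaugeVec> = Lazy::new(|| {
    register_int_gauge_vec!(
        "near_peer_io_buffer_bytes",
        "Size of the network read/write buffers per connection",
        &["addr", "direction"]
    )
    .unwrap()
});

/// Called from the read/write paths; since each of those already makes
/// a syscall, bumping a gauge is negligible in comparison.
fn record_buffer_size(addr: std::net::SocketAddr, direction: &str, bytes: usize) {
    PEER_BUFFER_BYTES.with_label_values(&[&addr.to_string(), direction]).set(bytes as i64);
}
```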
The Debug page now has a section for catchup information. (Screenshot: Debug page showing the catchup section.) Catchup info will also be printed as part of stats in the log:
```
2022-09-02T15:58:30.730076Z INFO stats: Catchups Sync block `CZL3BCD15QpZ4FLqWu7vAJ97JsVHJ1ngtZQupfhiVaL6`@61 Shard sync status: Shard 0 done, Shard 1 split applying total parts 3 done 1, Shard 2 split applying not started, Shard 3 split applying not started Next blocks to catch up: `CZL3BCD15QpZ4FLqWu7vAJ97JsVHJ1ngtZQupfhiVaL6`@61
```
…ease (near#7538)
* add feature RejectBlocksWithOutdatedProtocolVersions to the next shardnet release
* address comments
Deprecate the near_peer_message_received_total Prometheus metric in favour of aggregating the near_peer_message_received_by_type_total metric.
Taking a slice as argument is more flexible than taking a reference to a vector, since the caller does not have to own the data. In addition, slices also work if the caller owns a boxed slice, which would otherwise need to be converted into a vector if the function required a reference to a vector. Slices also make it easy to pass sub-slices. The only remaining place is Store::push_component in the near-network crate; this is needed because near-network’s schema StoreUpdate::set doesn’t work with borrowed types, and more generics magic is needed to make it work.
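A trivial illustration of why `&[u8]` is the more flexible parameter type:

```rust
fn set(key: &[u8]) {
    let _ = key;
}

fn demo() {
    let vec: Vec<u8> = vec![1, 2, 3];
    let boxed: Box<[u8]> = vec![4, 5, 6].into_boxed_slice();
    set(&vec);      // deref coercion, no copy
    set(&boxed);    // no conversion back into a Vec needed
    set(&vec[..2]); // sub-slices come for free
}
```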
A separate crate for test-only logging utils doesn't really pull its weight; let's just add a `testonly` module to the general o11y crate, especially given that there already were some prod dependencies on this crate.
…ear#7544) With cold storage approaching, it’s going to be necessary to be able to open a RocksDB instance with only a subset of the defined columns. Change how RocksDB deals with column family handles: specifically, get_cf_handles now accepts an iterator over DBCol values as argument, and cf_handle returns an error when asked for a handle to a column that wasn’t set up.
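The idea, sketched with simplified types; `CfHandle` and the `usize` column index stand in for rocksdb’s column family handle and the real DBCol enum:

```rust
/// Stand-in for rocksdb’s real column family handle type.
#[derive(Clone, Copy)]
struct CfHandle(usize);

struct RocksDB {
    /// One slot per column; `None` for columns not set up at open time.
    cf_handles: Vec<Option<CfHandle>>,
}

impl RocksDB {
    /// Creates handles only for the requested subset of columns.
    fn get_cf_handles(columns: impl Iterator<Item = usize>, num_cols: usize) -> Vec<Option<CfHandle>> {
        let mut handles = vec![None; num_cols];
        for col in columns {
            handles[col] = Some(CfHandle(col));
        }
        handles
    }

    /// Errors out instead of panicking when a column wasn’t configured.
    fn cf_handle(&self, col: usize) -> Result<CfHandle, String> {
        self.cf_handles
            .get(col)
            .copied()
            .flatten()
            .ok_or_else(|| format!("column {col} was not set up"))
    }
}
```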
Hey @mina86, could you review the changes we've just pushed? It's not 100% correct, but this is what we ended up with while trying to fix it:

```diff
diff --git a/core/store/src/trie/mod.rs b/core/store/src/trie/mod.rs
--- a/core/store/src/trie/mod.rs    (revision 7f44858bf6061e41822d6eb99a786843a964749b)
+++ b/core/store/src/trie/mod.rs    (date 1662386318080)
@@ -896,8 +896,8 @@
         let mut other_iter = trie.iter().unwrap();
         other_iter.seek_prefix(b"r").unwrap();
-        assert_eq!(other_iter.next().unwrap().unwrap().0, b"x".to_vec());
-        assert_eq!(other_iter.next().unwrap().unwrap().0, b"y".to_vec());
+        // assert_eq!(other_iter.next().unwrap().unwrap().0, b"x".to_vec());
+        // assert_eq!(other_iter.next().unwrap().unwrap().0, b"y".to_vec());
         assert_eq!(other_iter.next(), None);
         other_iter.seek_prefix(b"x").unwrap();
         assert_eq!(other_iter.next().unwrap().unwrap().0, b"x".to_vec());
@@ -1059,7 +1059,8 @@
         let mut iterator = trie.iter().unwrap();
         iterator.seek_prefix(&query).unwrap();
         if let Some(Ok((key, _))) = iterator.next() {
-            assert!(key >= query);
+            assert!(key.starts_with(&query), "‘{key:x?}’ does not start with ‘{query:x?}’");
+            // assert!(key.starts_with(&query), "‘{key:x?}’ does not start with ‘{query:x?}’");
         }
     }
 }
```
```diff
diff --git a/core/store/src/trie/iterator.rs b/core/store/src/trie/iterator.rs
--- a/core/store/src/trie/iterator.rs    (revision 7f44858bf6061e41822d6eb99a786843a964749b)
+++ b/core/store/src/trie/iterator.rs    (date 1662386844161)
@@ -107,13 +107,22 @@
             self.descend_into_node(&hash)?;
             let Crumb { status, node, prefix_boundary } = self.trail.last_mut().unwrap();
             prev_prefix_boundary = prefix_boundary;
-            match &node.node {
+            match dbg!(&node.node) {
                 TrieNode::Empty => break,
                 TrieNode::Leaf(leaf_key, _) => {
                     let existing_key = NibbleSlice::from_encoded(leaf_key).0;
-                    if existing_key < key {
-                        self.key_nibbles.extend(existing_key.iter());
-                        *status = CrumbStatus::Exiting;
+                    if is_prefix_seek {
+                        // if existing_key == key {
+                        *prefix_boundary = true;
+                        *status = CrumbStatus::Exiting;
+                        self.key_nibbles.extend(existing_key.iter());
+                        break;
+                        // }
+                    } else {
+                        if existing_key < key {
+                            self.key_nibbles.extend(existing_key.iter());
+                            *status = CrumbStatus::Exiting;
+                        }
                     }
                     break;
                 }
@@ -128,18 +137,29 @@
                             hash = *child.unwrap_hash();
                             key = key.mid(1);
                         } else {
+                            println!("b");
+                            *prefix_boundary = true;
+                            // *prev_prefix_boundary = is_prefix_seek;
                             break;
                         }
                     }
                 }
                 TrieNode::Extension(ext_key, child) => {
                     let existing_key = NibbleSlice::from_encoded(ext_key).0;
+
                     if key.starts_with(&existing_key) {
                         key = key.mid(existing_key.len());
                         hash = *child.unwrap_hash();
                         *status = CrumbStatus::At;
                         self.key_nibbles.extend(existing_key.iter());
                     } else {
+                        if is_prefix_seek {
+                            *prefix_boundary = true;
+                            *status = CrumbStatus::Exiting;
+                            // self.key_nibbles.extend(existing_key.iter());
+                            break;
+                        }
+
                         if existing_key < key {
                             *status = CrumbStatus::Exiting;
                             self.key_nibbles.extend(existing_key.iter());
                         }
```
The check_free_space_interval and free_space_threshold fields of the RocksDB struct always have the same value. Turn them into constants.
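I.e. roughly the following; the actual values aren’t given in the commit message, so the numbers below are placeholders:

```rust
struct RocksDB;

impl RocksDB {
    /// How often to check for free disk space (placeholder value).
    const CHECK_FREE_SPACE_INTERVAL: u16 = 256;
    /// Minimum free space below which we warn (placeholder value).
    const FREE_SPACE_THRESHOLD: u64 = 16 * 1024 * 1024;
}
```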
Extend test_iterator_seek by adding verification of known existing keys.
Metrics *are* part of the o11y package; it's much better organizationally to not split this stuff into two crates.
The database version definitions make more sense inside near-store than near-primitives, especially since the majority of their usage is in near-store and the definitions aren’t used for anything specific to the chain or other operations not involving the storage. While at it, change RocksDB::get_version so that it checks whether the database exists (and returns None if it doesn’t) and opens the database with only the DbVersion column configured (so that it’ll properly read the version from old databases without modifying them).
Co-authored-by: Aleksey Kladov <aleksey.kladov@gmail.com>
By the way, could you make this a PR against the near/nearcore repository and allow edits from maintainers? It’s going to be easier for me to follow up with merges and quick fixes.
Nevermind. I was fetching the wrong code.
I cannot make this PR against near/nearcore because mpt-proof-rework isn't part of that repo. Will allow edit access from maintainers.
Yes, it needs to be rebased with just the changes related to seek_prefix. Turns out this is the difficult part of the proof feature, so it needs to be handled as a separate change.
Turns out this is a bit more complicated than I’d initially imagined, but this should help:

```diff
@@ -62,26 +67,50 @@ impl<'a> TrieIterator<'a> {
     }
 
     /// Position the iterator on the first element with key => `key`.
-    pub fn seek<K: AsRef<[u8]>>(&mut self, key: K) -> Result<(), StorageError> {
-        self.seek_nibble_slice(NibbleSlice::new(key.as_ref())).map(drop)
+    pub fn seek_prefix<K: AsRef<[u8]>>(&mut self, key: K) -> Result<(), StorageError> {
+        self.seek_nibble_slice(NibbleSlice::new(key.as_ref()), true).map(drop)
     }
 
     /// Returns the hash of the last node
     pub(crate) fn seek_nibble_slice(
         &mut self,
         mut key: NibbleSlice<'_>,
+        is_prefix_seek: bool,
     ) -> Result<CryptoHash, StorageError> {
         self.trail.clear();
         self.key_nibbles.clear();
+
+        // Checks if a key in an extension or leaf matches our search query.
+        //
+        // When doing prefix seek, this checks whether `key` is a prefix of
+        // `ext_key`.  When doing regular range seek, this checks whether
+        // `key` is no greater than `ext_key`.  If those conditions aren’t
+        // met, the node with `ext_key` should not match our query.
+        let check_ext_key = |key: &NibbleSlice, ext_key: &NibbleSlice| {
+            if is_prefix_seek {
+                ext_key.starts_with(key)
+            } else {
+                ext_key >= key
+            }
+        };
+
         let mut hash = self.trie.root;
+        let mut prev_prefix_boundary = &mut false;
         loop {
+            *prev_prefix_boundary = is_prefix_seek;
             self.descend_into_node(&hash)?;
-            let Crumb { status, node } = self.trail.last_mut().unwrap();
+            let Crumb { status, node, prefix_boundary } = self.trail.last_mut().unwrap();
+            prev_prefix_boundary = prefix_boundary;
             match &node.node {
                 TrieNode::Empty => break,
                 TrieNode::Leaf(leaf_key, _) => {
                     let existing_key = NibbleSlice::from_encoded(leaf_key).0;
-                    if existing_key < key {
+                    if !check_ext_key(&key, &existing_key) {
                         self.key_nibbles.extend(existing_key.iter());
                         *status = CrumbStatus::Exiting;
                     }
@@ -98,6 +127,7 @@ impl<'a> TrieIterator<'a> {
                             hash = *child.unwrap_hash();
                             key = key.mid(1);
                         } else {
+                            *prefix_boundary = is_prefix_seek;
                             break;
                         }
                     }
@@ -110,9 +139,9 @@ impl<'a> TrieIterator<'a> {
                         *status = CrumbStatus::At;
                         self.key_nibbles.extend(existing_key.iter());
                     } else {
-                        if existing_key < key {
+                        if !check_ext_key(&key, &existing_key) {
                             self.key_nibbles.extend(existing_key.iter());
                             *status = CrumbStatus::Exiting;
                         }
                         break;
                     }
```

When doing prefix seek, we need to check whether the existing key in a leaf or extension node starts with our query (see the check_ext_key closure above) rather than whether it compares greater or equal, as the old range seek did. There are still some test failures in the PR but this appears to fix test_iterator.
Also adjusted the internal API for sending routed messages a bit.
The serde::Serialize and serde::Deserialize implementations for the VMContext type are never used. Get rid of them. This makes base58_format unused so get rid of it as well.
Instead of introducing a new dependency, we can generate the request id using the generate_random_string function. The id doesn’t really matter anyway (apart from the JSON RPC specification requiring it to be present), so any string will work there.
Just a tiny bit of refactoring separating configuration part of the storage and code responsible for opening the storage into separate modules.
The default Debug derive for types is quite verbose, with a lot of vertical spacing. For example, TrieNode branches are printed as a list with each empty branch as a None on its own line. This isn’t very ergonomic when trying to debug trie nodes and printing a lot of nodes to the output. Replace the default Debug implementations for TrieNode, NodeHandle and ValueHandle with custom ones which are more concise and don’t waste space on empty nodes.
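For instance, a branch node’s sixteen mostly-empty children can be rendered as a compact map of only the occupied slots; a sketch of the approach on a toy type:

```rust
use std::fmt;

/// Toy stand-in for a branch node’s sixteen children.
struct Branch([Option<u8>; 16]);

impl fmt::Debug for Branch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Render only the occupied children as `index: value` pairs,
        // instead of sixteen lines most of which are `None`.
        let mut map = f.debug_map();
        for (idx, child) in self.0.iter().enumerate() {
            if let Some(child) = child {
                map.entry(&idx, child);
            }
        }
        map.finish()
    }
}
```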
This will be needed in a future commit where we will need access to the raw bytes of the node. See near#7509.
Could you update the PR to be onto the master branch and remove proof-related changes from it? At this point it’s probably easiest to squash everything and clean it up:

```sh
git remote update
git checkout update-iterator-seek
git branch update-iterator-seek-backup
base=$(git merge-base @ upstream/master)
# ^^^^^^^^ if necessary, adjust ‘upstream’ to whatever the
# name of the near/nearcore remote is in the repository
git reset "$base"
git gui
```

With git gui, stage just the changes you want to keep, then:

```sh
git commit -m seek_prefix
git push origin +@:update-iterator-seek
```
Force-pushed from 7d3e274 to c1fd6de.