-
Notifications
You must be signed in to change notification settings - Fork 993
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixmmr part2 #3666
Fixmmr part2 #3666
Conversation
…tree_postorder_height; add round_up_to_leaf_pos method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, spent a good while looking over changes here as well as coming to terms with them. It's a very large PR but obviously needs to be giving how sweeping a simple change to insertion_to_pmmr_index
is :D - A few general thoughts:
- It's unfortunate that we can't universally replace everything with 0-indexed pmmrs (for obvious reasons,) but at least this pushes a lot of the +/- 1 conversions out towards the extremities where the code is simpler and less likely to cause confusion.
- I really wish we had more tests and data test sets, particularly ones that exercise changes against older versions of chain data. Most of our PMMR-related tests work on trivially small, easy to understand constructed sample data instead of exercising them against live data volumes, which means much more manual testing is required for changes like these than there should be.
- Also really wish the API module had a series of doctests, similar to how the wallet APIs.
- A lot of manual testing thus required before this can go live. However, despite our automated testing deficiencies, there will be much manual testing over the next few months, (particularly with ongoing PIBD work,) so there is plenty of time to shake out any issues this might introduce.
A few comments throughout, but nothing major I could find, with the caveat that there's a lot here to digest so that's no guarantee there aren't any issues..
Thanks for doing all of this.. A lot of fiddly work but it should hopefully make future work less confusing for myself and anyone else who picks up any work on this.
let header_pmmr = self.header_pmmr.read(); | ||
let txhashset = self.txhashset.read(); | ||
txhashset::utxo_view(&header_pmmr, &txhashset, |utxo, _| { | ||
utxo.get_unspent_output_at(pos) | ||
utxo.get_unspent_output_at(pos0) | ||
}) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(There is a lot to review here, so some of this comments will be more about me keeping track of what I've looked at than feedback)
Looked at all uses of get_unspent_output_at
in chain.rs
, seems to be covered.
@@ -1272,7 +1272,7 @@ impl Chain { | |||
let txhashset = self.txhashset.read(); | |||
let last_index = match max_pmmr_index { | |||
Some(i) => i, | |||
None => txhashset.highest_output_insertion_index(), | |||
None => txhashset.output_mmr_size(), | |||
}; | |||
let outputs = txhashset.outputs_by_pmmr_index(start_index, max_count, max_pmmr_index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we change the name of this to outputs_by_pmmr_pos
and change the names of the arguments to be start_pos
, max_pos
. Same with rangeproof_by_pmmr_index
since we settled on terminology, it's even more confusing now when they're mixed up. We should be scrubbing the term 'pmmr index' wherever it appears anywhere (unless it's in an external API)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the pmmr in outputs_by_pmmr_pos is redundant. I personally like leaf index vs node index. And then we can use position as a one word alternative to node index.
@@ -1301,13 +1301,14 @@ impl Chain { | |||
None => self.head_header()?.height, | |||
}; | |||
// Return headers at the given heights |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
block_height_range_to_pmmr_positions
for this fn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd go with either block_height_range_to_node_indices or block_height_range_to_positions.
.output_mmr_size + 1 | ||
}; | ||
let end_mmr_size = self.get_header_by_height(end_block_height)?.output_mmr_size; | ||
Ok((start_mmr_size, end_mmr_size)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checked logic change seems okay
@@ -92,6 +92,7 @@ impl BitmapAccumulator { | |||
let mut idx_iter = idx.into_iter().filter(|&x| x < size).peekable(); | |||
while let Some(x) = idx_iter.peek() { | |||
if *x < chunk_idx * 1024 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tangental, but we should probably use BitmapChunk::LEN_BITS here instead of 1024 for clarity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, will change
@@ -28,7 +28,7 @@ where | |||
B: Backend<T>, | |||
{ | |||
/// The last position in the PMMR | |||
last_pos: u64, | |||
size: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if there's potential for confusion here now.. e.g. the last pos in a zero-indexed mmr with 4 leaves is 6 but its size is 7 (they were the same in a 1-indexed pmmr). Is there an implicit conversion somewhere when this is called from a higher up (1-based) API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like I said above, we now equate size with next available position.
We can't be troubled by implicit conversion since using a size as position requires calling one of the affected methods.
// I don't see how 1 + 0-based index corresponds to a completed size | ||
// E.g. output_mmr_count==3 gives size 5 | ||
block.header.output_mmr_size = 1 + pmmr::insertion_to_pmmr_index(prev.output_mmr_count()); | ||
block.header.kernel_mmr_size = 1 + pmmr::insertion_to_pmmr_index(prev.kernel_mmr_count()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is clearly wrong; (output_mmr_count==3 size should be 7 with last position being 6). probably want to check why this is passing at some stage, possibly the test data is constructed so that outputs being appended only result in odd numbers of leaves? Also feeling like size != position could cause more ambiguity reading the code in places, but that might just be me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current code works correctly if and only if output_mmr_count is even. An even leaf index corresponds to a left-child, so the size of the mmr after its addition is just its node index + 1.
Ok; here's the WEIRDEST thing: fixing the code amount to keeping the original code:-)
Although non-sensical for 1-based positions, that code was actually correct for 0-base positions!
size is position, but it's the position of the next node to be added, rather than the last node added.
this has the advantage that it also works for an empty MMR, where the old rule had an exception.
let elmt_pos = self.last_pos + 1; | ||
let mut current_hash = elmt.hash_with_index(elmt_pos - 1); | ||
pub fn push(&mut self, leaf: &T) -> Result<u64, String> { | ||
let leaf_pos = self.size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, think I'm seeing it now, thinking about sizes in terms of size n containing indices 0..n-1, so when you're adding something to the mmr you're simply using index 'size' (as you would with any array) instead of 'last_pos' + 1 in the 1-indexed world. Maybe there was some thought earlier that 1 indexing meant last_position was always equal to size and therefore easier to reason about? (Just thinking aloud, doesn't matter).
core/src/core/pmmr/pmmr.rs
Outdated
/// is defined by size. | ||
/// NOTE this function and the Path struct is UNUSED and | ||
/// mostly redundant as family_branch has similar functionality | ||
pub fn path(pos0: u64, size: u64) -> impl Iterator<Item = u64> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could probably just remove them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok; will do
pub fn includes(&self, pos: u64) -> bool { | ||
self.bitmap.contains(pos as u32) | ||
pub fn includes(&self, pos0: u64) -> bool { | ||
self.bitmap.contains(1 + pos0 as u32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maddening not being able to change the bitmap indexing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thoughts exactly:-!
We can change the bitmap indexing, but then we need to put in stubs for the bitmap reading and writing functions to do the conversion, at some runtime cost...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, and also meant to add that we should probably also manually test this against a set of chain data created by the last release version of grin.
name: fixmmr_part2
about: Pull Request checklist
title: 'Fixmmr part2
labels: ''
assignees: ''
The MMR (Merkle Mountain Range) was originally designed with 0-based leaf and node indices.
This is reflected in the consensus model since hashes are computed with a 0-based node index prepended to the leaf data or the two child hashes.
0-based indices also simplify conversion of a leaf to a node index, which becomes
2 * leaf_index - count_ones(leaf_index).
The other C++ Grin code base, grinplusplus, uses 0-bases indices exclusively.
Unfortunately, 1-based indices have crept into the Grin code base resulting in a mix of both.
Use of 1-based indices has the downside that we often need to test that an index argument isn't 0.
This PR tries to convert as many uses of 1-based indices as possible to 0-based indices.
A few instances remain in storage related code where bitmaps and leaf sets are read from and written to disk.
The changes are not consensus breaking.
Each changed method is in its own commit, except for some pairs of related methods.
Tested with a regular sync as well as a sync from scratch.