MemForest::prove can panic in certain cases #59

Closed
optout21 opened this issue Jan 12, 2025 · 5 comments · Fixed by #62

Comments

@optout21

optout21 commented Jan 12, 2025

I ran into this issue while driving MemForest with some simulated data.

The code in question is:

    pub fn prove(&self, targets: &[Hash]) -> Result<Proof<Hash>, String> {
        let mut positions = Vec::new();
        for target in targets {
            let node = self.map.get(target).ok_or("Could not find node")?;
            let position = self.get_pos(node);
            positions.push(position);
        }
        let needed = get_proof_positions(&positions, self.leaves, tree_rows(self.leaves));
        let proof = needed
            .iter()
            .map(|pos| self.get_hash(*pos).unwrap()) // line 329
            .collect::<Vec<_>>();

        Ok(Proof::new_with_hash(positions, proof))
    }

With certain input combinations (state), the inner unwrap() on line 329 causes a panic. That is, get_proof_positions() returns a position for which get_hash() returns an Err.

The inner error could be returned as a proper error, but that does not solve the conceptual problem.
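
For illustration, a minimal sketch of propagating that inner error with ? instead of unwrapping (assuming get_hash returns a Result with a String error, as the panic message further down suggests):

    // Sketch only: propagate the inner error instead of unwrap().
    // Assumes get_hash(pos) -> Result<Hash, String>.
    let proof = needed
        .iter()
        .map(|pos| self.get_hash(*pos))
        .collect::<Result<Vec<_>, String>>()?;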

I was not able to find the conceptual root cause, but I can reliably reproduce it. I'm posting cleaned-up Rust sample code that reproduces it. It could probably be reproduced in a simpler way; this program deterministically performs on the order of 100 additions and some deletions (always the same data).
The program produces some incorrect inputs, but that still should not cause a panic.

The attached sample program should be straightforward to run.

It is against the current version ({ git = "https://github.com/mit-dci/rustreexo.git", rev = "1ee240f" }).
I observed similar behavior with v0.3.0 (MemTree is Pollard there).
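
In Cargo.toml form (dependency name assumed):

    [dependencies]
    # Assumed crate name; rev pinned as quoted in the report above.
    rustreexo = { git = "https://github.com/mit-dci/rustreexo.git", rev = "1ee240f" }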

rustreexo-issue.zip

Some relevant output:

...
Pollar: modify, del 1 add 2
  i ccf6da9fa10dc3cafd575567e60fef5ca370e1107409a4c271ce11ed6ccde1c9
  o 63eb737a51d2cefc1649f61f27fb7f6f0d8bb55708362ec883f557c3942b4fe8
  o 63eb737a51d2cefc1649f61f27fb7f6f0d8bb55708362ec883f557c3942b4fe9
MemTree: connect_block, height 31, tx_cnt 2
Pollar: modify, del 0 add 1
  o 364bff7a2329bc5374da8d9328b5d939bf79d4ad214b5fa38f4a173314efee4f
Pollar: prove 1
  i 78335a6f34e81678beff4d91d4c62e97d04dbc5bed7e40ce310b3d0c9dd0a589
    leaves: 128
    positions: l=1   6 
    needed: l=7   7 130 192 225 241 249 253 
    NOT FOUND! 0 7  'node 7 not found'
    NOT FOUND! 1 130  'node 130 not found'
    NOT FOUND! 2 192  'node 192 not found'
    NOT FOUND! 3 225  'node 225 not found'
thread 'main' panicked at /home/optout/.cargo/git/checkouts/rustreexo-4f9d2e36391ebbc8/1ee240f/src/accumulator/mem_forest.rs:329:44:
called `Result::unwrap()` on an `Err` value: "node 7 not found"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
@Davidson-Souza
Collaborator

Thanks for reporting this, @optout21. I'll push a fix soon.

@optout21
Author

I've managed to add an extra check that prevents the unwrap() panic (see below); however, this only converts the panic into a regular error. The deeper issue is still there: it cannot find a hash that was added previously (and has not been removed).

I suspect somehow the internal representation becomes inconsistent.

I'm not sure whether this issue affects the Go utreexo implementation as well; I was not able to check that. Maybe not, as the internal representation of the full accumulator is different.

    pub fn prove(&self, targets: &[Hash]) -> Result<Proof<Hash>, String> {
        let mut positions = Vec::new();
        for target in targets {
            let node = self.map.get(target).ok_or("Could not find node")?;
            let position = self.get_pos(node);

+           // Extra cautious double check, to avoid unwrap() panic below
+           if let Err(err) = self.get_hash(position) {
+               return Err(format!("Could not find returned pos, {}, {}", position, err));
+           }

            positions.push(position);
        }
        let needed = get_proof_positions(&positions, self.leaves, tree_rows(self.leaves));
        let proof = needed
            .iter()
            .map(|pos| self.get_hash(*pos).unwrap()) // line 329
            .collect::<Vec<_>>();

        Ok(Proof::new_with_hash(positions, proof))
    }

@Davidson-Souza
Collaborator

Davidson-Souza commented Jan 15, 2025

After a lot of debugging, the problem seems to originate in get_root_position. If our forest has only one tree and num_leaves is equal to some very specific power of two (128, 512, 1024, 4096...), it seems to return num_leaves instead of the actual root position. It seems the Go implementation has the same problem (though I'd wait for Calvin to confirm with a more reliable test). In get_pos, because the loop isn't inclusive, if the root we are looking for is in the very last row, the loop won't reach it and will proceed with 0 as the row. I'll update the loop and make sure we handle the case where we don't find the appropriate root.
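
For illustration only, a self-contained sketch of the off-by-one class described above (the function names and logic are hypothetical, not the library's actual code):

    // Hypothetical illustration: an exclusive row range skips the top
    // row, where the root of a perfectly balanced tree lives.
    fn find_root_row_buggy(root_row: u8, total_rows: u8) -> u8 {
        for row in 0..total_rows { // exclusive: never visits total_rows
            if row == root_row {
                return row;
            }
        }
        0 // falls through with row = 0, as described above
    }

    // Inclusive range, with an explicit "not found" case instead of 0.
    fn find_root_row_fixed(root_row: u8, total_rows: u8) -> Option<u8> {
        (0..=total_rows).find(|&row| row == root_row)
    }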

Btw, pretty cool way to test it. Would you mind if I reuse some of it to test rustreexo?

@optout21
Author

Thanks for looking into this!

Of course you are free to use the test code!

I did this to test the Utreexo library, by simulating blocks & transactions using reproducible (pseudo)random numbers.
In the first attempt I did it in an inefficient way: for each block, transactions are processed one by one (inputs removed, outputs added). That way I ran into this issue. Later I changed it to process a block in one go: remove all inputs and add all outputs (outputs that are consumed within the same block are filtered out). Interestingly, when processing transactions in bulk, this issue did not appear, though I did not run it for long.
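
For reference, a minimal sketch of that bulk approach; the Tx layout and the forest.modify(&adds, &dels) call are assumptions modeled on the sample program, not the exact rustreexo API:

    use std::collections::HashSet;

    // Assumed: Hash is the accumulator's leaf hash type (Copy + Eq + Hash),
    // and MemForest::modify(&adds, &dels) applies one batched update.
    struct Tx {
        inputs: Vec<Hash>,  // UTXO hashes spent by this tx
        outputs: Vec<Hash>, // UTXO hashes created by this tx
    }

    fn connect_block(forest: &mut MemForest, txs: &[Tx]) -> Result<(), String> {
        let spent: HashSet<Hash> =
            txs.iter().flat_map(|tx| tx.inputs.iter().copied()).collect();
        let created: HashSet<Hash> =
            txs.iter().flat_map(|tx| tx.outputs.iter().copied()).collect();
        // Outputs consumed within the same block never touch the forest,
        // and inputs created within the same block are never deleted.
        let adds: Vec<Hash> =
            created.iter().copied().filter(|o| !spent.contains(o)).collect();
        let dels: Vec<Hash> =
            spent.iter().copied().filter(|i| !created.contains(i)).collect();
        forest.modify(&adds, &dels) // one accumulator update per block
    }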

For reproducing this particular problem, I thought of just recording the hashes added/deleted in the sample code and generating a test that simply replays them. But I didn't get to do that, and it would have been specific to this particular reproduction of the issue.

I am not sure how much usage MemForest gets 'in production', as Floresta uses only the compressed accumulator, and I currently don't know of a Rust-based bridge (full node) Utreexo implementation (utreexod uses the Go library).

@optout21
Author

Fix in get_pos confirmed 😎
