This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

MMR Proof Generation is broken #11984

Closed
2 tasks done
Tracked by #1636
acatangiu opened this issue Aug 5, 2022 · 7 comments

@acatangiu
Contributor

Is there an existing issue?

  • I have searched the existing issues

Experiencing problems? Have you tried our Stack Exchange first?

  • This is not a support question.

Description of bug

Shortly after deploying polkadot:v0.9.27, Rococo fails to generate MMR proofs using the mmr RPC, for example on block 1353786:

curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[1353785, "0xd5d732c6ce0cb177ae45bbeecaa97fc8e09b2d03aac2c2a594d8ee367b34dea1"]}' https://rococo-rpc.polkadot.io | jq .
{
  "jsonrpc": "2.0",
  "error": {
    "code": 8012,
    "message": "Error while generating the proof",
    "data": "Error::GenerateProof"
  },
  "id": 1
}

Steps to reproduce

  1. Open polkadot.js/apps
  2. Connect to Rococo
  3. Go Developer -> RPC Calls -> mmr -> generateProof
  4. Call it with leafIndex =

It produces the error:

8012: Error while generating the proof: Error::GenerateProof
@acatangiu acatangiu added I3-bug The node fails to follow expected behavior. U1-asap No need to stop dead in your tracks, however issue should be addressed as soon as possible. Z3-substantial Can be fixed by an experienced coder with a working knowledge of the codebase. labels Aug 5, 2022
@acatangiu acatangiu added this to BEEFY Aug 5, 2022
@acatangiu
Contributor Author

ckb-merkle-mountain-range-0.3.2 used by pallet-mmr is throwing InconsistentStore error when trying to generate proofs:

2022-08-05 13:50:45.623 ERROR tokio-runtime-worker runtime::mmr: [<wasm:stripped>] MMR error: InconsistentStore

Need to dive deeper for root cause, but likely culprit is recent PR #11594

@Lederstrumpf
Contributor

Need to dive deeper for root cause, but likely culprit is recent PR #11594

Maybe I'm missing something, but I don't think that's the cause since the onchain runtime version on rococo is still 9250, which barely precedes inclusion of that PR.

ckb-merkle-mountain-range-0.3.2 used by pallet-mmr is throwing InconsistentStore error when trying to generate proofs

I've tried this for a couple thousand blocks, and the InconsistentStore error is consistently raised at this line: https://github.com/nervosnetwork/merkle-mountain-range/blob/master/src/mmr.rs#L125

I've attached a tracelog of a call of mmr_generateProof with params [1394039, 0xdf388aada7be7d84bd5f9834b6480d2cd6afa6c298f12e9ff8fe9754a5c8e5e2]

rococo_generateProof_1394039_0xdf388aada7be7d84bd5f9834b6480d2cd6afa6c298f12e9ff8fe9754a5c8e5e2.log

@acatangiu
Contributor Author

Running a node from genesis doesn't exhibit this problem, only nodes started from snapshot have this broken db issue.

Unfortunately we only have snapshots from the last 5 days and they are all broken - all current nodes have been semi-recently upgraded and restarted from a broken snapshot, and so we have no older good snapshot.

@acatangiu
Contributor Author

acatangiu commented Aug 9, 2022

The problem is that the offchainDB is incomplete, it holds no entries for old leaves (such as leaf indexes 0, 1, 3, 1000000); only for leaves pertaining to blocks added after node was started from snapshot.

This is most likely due to the node being started from an incomplete snapshot. The incomplete snapshot was most likely generated on a node without --enable-offchain-indexing=true parameter, and thus the node had no offchain db entries for MMR leaves.

To fix this, it should be enough to restart nodes from a good snapshot - a snapshot generated from an archive node running with --enable-offchain-indexing=true.
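
A quick way to check a node for this condition is to request proofs for a few old leaf indexes and see whether any come back. The sketch below reuses the mmr_generateProof request shape from this issue. RPC_URL, its default value, and the helper names are assumptions for illustration, not part of the fix described here.

```shell
# Sketch only: RPC_URL and both helper names are hypothetical.
RPC_URL="${RPC_URL:-http://localhost:9933}"

# Build the JSON-RPC request body for mmr_generateProof
# (same shape as the curl calls in this issue).
# $1 = leaf index, $2 = block hash
mmr_proof_payload() {
  printf '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[%s, "%s"]}' "$1" "$2"
}

# Return 0 if the node answers with a proof for leaf index $1 at block hash $2,
# non-zero if it answers with an error (such as 8012) or not at all.
can_prove_leaf() {
  curl -s -H "Content-Type: application/json" \
       -d "$(mmr_proof_payload "$1" "$2")" "$RPC_URL" \
    | jq -e '.result.proof' > /dev/null
}

# Example use, with block_hash set to a recent block hash:
#   for leaf in 0 1 3 1000000; do
#     can_prove_leaf "$leaf" "$block_hash" || echo "leaf $leaf: no proof"
#   done
```

A node restored from an incomplete snapshot would be expected to report "no proof" for the old leaves, while a node synced with --enable-offchain-indexing=true from the start should not.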

@acatangiu
Contributor Author

acatangiu commented Aug 9, 2022

The Rococo sync nodes have been restarted with the correct flags and are currently re-syncing.

@acatangiu acatangiu moved this to In Progress 🛠 in BEEFY Aug 9, 2022
@Lederstrumpf
Contributor

Lederstrumpf commented Aug 10, 2022

Running a node from genesis doesn't exhibit this problem, only nodes started from snapshot have this broken db issue.

Can report that it's working on rococo for me too when syncing from scratch with --enable-offchain-indexing=true enabled from the beginning.

The problem is that the offchainDB is incomplete, it holds no entries for old leaves (such as leaf indexes 0, 1, 3, 1000000); only for leaves pertaining to blocks added after node was started from snapshot.

This is most likely due to the node being started from an incomplete snapshot. The incomplete snapshot was most likely generated on a node without --enable-offchain-indexing=true parameter, and thus the node had no offchain db entries for MMR leaves.

In case we run into this again, I've debugged with a local chain to confirm that the offchainDB can actually be incomplete, so long as all leaves required in the proof are present.

Using a local chain where I:

  • enabled indexing until block 35,
  • disabled it until block 70, and then
  • reenabled until block 268,

i.e. with entries for leaves 36-69 missing in the offchainDB, these are the failure/success scenarios I get:


(feel free to skip remainder of this comment - just keeping a detailed record in case the proof generation breaks again)
(here's an archive of the associated chain state: issue-11984-interrupted-indexing-state.tar.gz)


A. leaf_index ≤ 35

For mmr_generateProof called for leaves 0-35, it works as long as the mmr_size is at most 35, since otherwise the leaf's copath to the root contains at least one leaf with index 36-69.

mmr_size ≤ 35 (leaf indexed and full path available)

block_height=35
method="chain_getBlockHash"
curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "'$method'", "params":['$block_height']}' 65.108.96.98:9934 | jq .result
"0x642476aaddbbc65dc589cc2d801d8973dc5a394698c9860e17788531ab77dc36"

leaf_index=14;
block_35_hash="0x642476aaddbbc65dc589cc2d801d8973dc5a394698c9860e17788531ab77dc36"
method="mmr_generateProof"
curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "'$method'", "params":['$leaf_index', "'$block_35_hash'"]}' 65.108.96.98:9934 | jq .result.proof
"0x0e00000000000000230000000000000018b4c9b914a645c24e6d056edf8f905bb6ae80078aadc5b9ea77c2ba4ef45c835afa0360a215717f7b37aae3f96b4d38166bec3ddb412d7e9aa5ae2867783344f1ed33d4c3b7e34902581e2870bb04884b2aabecdc280e73e796aed8480d16391f1d259475eb1383cb48989a575e6ed9c83e5289c9647309217c00b825a2da763a04de585a62641e1ee66573e8e02c6fcad042cc0c9e27e3991ee3b3917cddb5ce239bbe1cc0ab39c0cf00c334a54765faeaeae0740ee531751d432044d7da19d6"

mmr_size > 35 (leaf indexed but full path not available)

block_height=200
method="chain_getBlockHash"
curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "'$method'", "params":['$block_height']}' 65.108.96.98:9934 | jq .result
"0xb85ffec3f302c7b4fc7f6e7b59bdd248afa880da35384e1acf1329bbeb586209"

leaf_index=14;
block_200_hash="0xb85ffec3f302c7b4fc7f6e7b59bdd248afa880da35384e1acf1329bbeb586209"
method="mmr_generateProof"
curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "'$method'", "params":['$leaf_index', "'$block_200_hash'"]}' 65.108.96.98:9934 | jq .error
{
  "code": 8012,
  "message": "Error while generating the proof",
  "data": "Error::GenerateProof"
}

B. leaf_index > 35

35 < leaf_index ≤ 127 (leaf indexed but full path not available)

mmr_generateProof always fails since the leaf's copath to the root contains at least one leaf with index 36-69.

leaf_index=126;
block_200_hash="0xb85ffec3f302c7b4fc7f6e7b59bdd248afa880da35384e1acf1329bbeb586209"
method="mmr_generateProof"
curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "'$method'", "params":['$leaf_index', "'$block_200_hash'"]}' 65.108.96.98:9934 | jq .error
{
  "code": 8012,
  "message": "Error while generating the proof",
  "data": "Error::GenerateProof"
}

leaf_index > 127 (leaf indexed and full path available)

Despite the db being incomplete, mmr_generateProof succeeds: with 200 leaves the leftmost peak now covers 128 leaves, not 32 or 64, so leaves 128-199 sit under later peaks and everything required on their copath can be found.

leaf_index=128;
block_200_hash="0xb85ffec3f302c7b4fc7f6e7b59bdd248afa880da35384e1acf1329bbeb586209"
method="mmr_generateProof"
curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "'$method'", "params":['$leaf_index', "'$block_200_hash'"]}' 65.108.96.98:9934 | jq .
{
  "jsonrpc": "2.0",
  "result": {
    "blockHash": "0xb85ffec3f302c7b4fc7f6e7b59bdd248afa880da35384e1acf1329bbeb586209",
    "leaf": "0xc5010080000000a1425cc134f8019e8d8afcab8d1adc0328a90c5d84c359b7f448aa81d924c7c40f0000000000000002000000697ea2a8fe5b03468548a7a413424a6292ab44a82a6f5cc594c3fa7dda7ce4020000000000000000000000000000000000000000000000000000000000000000",
    "proof": "0x8000000000000000c80000000000000020b68f2eecca814fc9eca34ae08f78824a15413275e3511ac22b7f294a95723e356eb85b6931870b476750cc14799a0ff6c2c8d570f2d9953f2256ac117b7813bec36d9f2084db0961c915c75ac61754cbd7e8959e7666d16ffd21a18819f9024f5a11883fdd7cf413a2b7c532e1923432ce5eff73a717792d47d59e366532fe404f3a59caa48cdd3694b0eb73dc73c168702fd67f11efbf159064ca6b522e54da03b22c78eda892112faa3d0b76479b1d82aff960db06576b1d8fd82eae1b461fa8815288087eb7c93e88c1a94fa0400b7081db364240eba8cdd14f0aed2c4f824477fd54800ffd8e8063fc904fc05db631af30c06f1ff8ad54e1e60570bf2f40"
  },
  "id": 1
}

leaf_index=199;
block_200_hash="0xb85ffec3f302c7b4fc7f6e7b59bdd248afa880da35384e1acf1329bbeb586209"
method="mmr_generateProof"
curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "'$method'", "params":['$leaf_index', "'$block_200_hash'"]}' 65.108.96.98:9934 | jq .
{
  "jsonrpc": "2.0",
  "result": {
    "blockHash": "0xb85ffec3f302c7b4fc7f6e7b59bdd248afa880da35384e1acf1329bbeb586209",
    "leaf": "0xc50100c70000005336e481dc38c2b8e797db169518d9874003cd20a5d27efaa8d28a1d5feb72be160000000000000002000000697ea2a8fe5b03468548a7a413424a6292ab44a82a6f5cc594c3fa7dda7ce4020000000000000000000000000000000000000000000000000000000000000000",
    "proof": "0xc700000000000000c80000000000000014b68f2eecca814fc9eca34ae08f78824a15413275e3511ac22b7f294a95723e352908fa2820c73b598acfc3a27405f24b22434e6f711143d173b7c71e1745ff16aba15c8f91f01c2286946e6a1fe933675d53b33469a104354015b0bca21674fe393d8cdf476e35f19f13136779767cfdc4d66c2f6244706326eb8666f9b233efe93552b2d7c4793d3afe172b3f82a817b15ff535f9af7919572f0346d496bff0"
  },
  "id": 1
}
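
The outcomes in cases A and B fit a simplified model: a proof is expected to succeed when the peak subtree containing the leaf does not overlap the missing leaf range. The sketch below encodes that model only. It is not how pallet-mmr or ckb-merkle-mountain-range decide anything, and the function names are hypothetical.

```shell
# peak_range LEAF_INDEX LEAF_COUNT
# Prints "first last": the leaf indexes covered by the MMR peak that
# contains LEAF_INDEX. Peaks follow the binary decomposition of
# LEAF_COUNT, largest first (200 leaves -> peaks of 128, 64 and 8).
peak_range() (
  leaf=$1; count=$2; start=0; size=1
  # Largest power of two not exceeding the leaf count.
  while [ $(( size * 2 )) -le "$count" ]; do size=$(( size * 2 )); done
  while [ "$size" -ge 1 ]; do
    if [ $(( count & size )) -ne 0 ]; then
      if [ "$leaf" -lt $(( start + size )) ]; then
        echo "$start $(( start + size - 1 ))"
        return 0
      fi
      start=$(( start + size ))
    fi
    size=$(( size / 2 ))
  done
  return 1  # leaf index is outside the MMR
)

# proof_expected LEAF_INDEX LEAF_COUNT MISSING_FIRST MISSING_LAST
# Prints "fail" if the leaf's peak overlaps the missing range, else "ok".
proof_expected() (
  range=$(peak_range "$1" "$2") || return 2
  first=${range% *}
  last=${range#* }
  if [ "$first" -le "$4" ] && [ "$3" -le "$last" ]; then
    echo fail
  else
    echo ok
  fi
)
```

With leaves 36-69 missing, `proof_expected 14 35 36 69` prints ok, `proof_expected 14 200 36 69` and `proof_expected 126 200 36 69` print fail, and `proof_expected 128 200 36 69` and `proof_expected 199 200 36 69` print ok, matching the RPC results recorded in this comment.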

The equivalent of this last case for the broken Rococo snapshot: if we started indexing again from the next power of two after the current block height, i.e. 2^21, mmr_generateProof would work again for all leaves added from that block onward. But we don't have time to test that ;)
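
As a quick check of the 2^21 figure, here is a small sketch that finds the next power of two above a given height. The helper name is hypothetical, and 1394039 (the leaf index from the tracelog earlier in this thread) is used as a stand-in for the Rococo height at the time.

```shell
# next_pow2 N
# Prints "exponent value" for the smallest power of two strictly greater than N.
next_pow2() (
  n=$1; p=1; e=0
  while [ "$p" -le "$n" ]; do
    p=$(( p * 2 ))
    e=$(( e + 1 ))
  done
  echo "$e $p"
)

next_pow2 1394039   # prints "21 2097152"
```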

@acatangiu
Contributor Author

Fixed by redeploying RPC nodes using correct snapshot.

MMR Proof generation RPC on Rococo:

curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[1353785, "0xd5d732c6ce0cb177ae45bbeecaa97fc8e09b2d03aac2c2a594d8ee367b34dea1"]}' https://rococo-rpc.polkadot.io | jq .

{
  "jsonrpc": "2.0",
  "result": {
    "blockHash": "0xd5d732c6ce0cb177ae45bbeecaa97fc8e09b2d03aac2c2a594d8ee367b34dea1",
    "leaf": "0xc5010039a814005f855fb7851475b63af735b0fb3a9eda6c4cbb6f1c482a1c7d90c034cdbb63e6550900000000000044000000cbc76b9030a4afa18b157d3303ce4ccaf877cb371411e0187de90864ecdb3c263a847d778fa9abf061ad8c734feb2e04f549d15926e6490089903541d7cb0bd2",
    "proof": "0x39a81400000000003aa814000000000024fe578692f7cfefa34f7730676e65134832fed08c7df597583f401e37e4a6bbe91162cf5e4744a77c16f18da1872b62db903e9c67b080cec56960f20a39efe475e1ffc65d44f9095b659ba2d097355a04854f1db91f7861c3884322b798d3bbceeff67024d568f564a1b32cbf62b067f70462615ba39aa453866b8f4ea85a40b06fd88ceb18b50b6772200923bd1908ef0b9ab731157bb56f009d8c0fc9308892a05d055616a7262fd5ffd75aa0a1d23caf8e8c9581e2b011adca4c41f307c991a9452f153b0e63b20cee147dc0355ab3d0a22e0d8b888a8a2c7173b2e44cacb8fa41cc0bc98d1032e9860ae4368e6021a5e63bea28097c6fe230732c27ee26b1a0a64d53f135274681e94b17de116df3438489c46f117e3297975cee394e0a48"
  },
  "id": 1
}

Repository owner moved this from In Progress 🛠 to Done ✅ in BEEFY Aug 17, 2022