This repository has been archived by the owner on Feb 27, 2023. It is now read-only.

Implement removal of orphan nodes #37

Merged
4 changes: 2 additions & 2 deletions bulk_test.go
@@ -22,8 +22,8 @@ func TestSparseMerkleTree(t *testing.T) {

 // Test all tree operations in bulk, with specified ratio probabilities of insert, update and delete.
 func bulkOperations(t *testing.T, operations int, insert int, update int, delete int) {
-    sm := NewSimpleMap()
-    smt := NewSparseMerkleTree(sm, sha256.New())
+    smn, smv := NewSimpleMap(), NewSimpleMap()
+    smt := NewSparseMerkleTree(smn, smv, sha256.New())

     max := insert + update + delete
     kv := make(map[string]string)
23 changes: 12 additions & 11 deletions deepsubtree.go
@@ -14,15 +14,10 @@ type DeepSparseMerkleSubTree struct {
 }

 // NewDeepSparseMerkleSubTree creates a new deep Sparse Merkle subtree on an empty MapStore.
-func NewDeepSparseMerkleSubTree(ms MapStore, hasher hash.Hash, root []byte) *DeepSparseMerkleSubTree {
-    smt := &SparseMerkleTree{
-        th: *newTreeHasher(hasher),
-        ms: ms,
+func NewDeepSparseMerkleSubTree(nodes, values MapStore, hasher hash.Hash, root []byte) *DeepSparseMerkleSubTree {
+    return &DeepSparseMerkleSubTree{
+        SparseMerkleTree: ImportSparseMerkleTree(nodes, values, hasher, root),
     }
-
-    smt.SetRoot(root)
-
-    return &DeepSparseMerkleSubTree{SparseMerkleTree: smt}
 }

 // AddBranch adds a branch to the tree.
@@ -32,14 +27,20 @@ func NewDeepSparseMerkleSubTree(ms MapStore, hasher hash.Hash, root []byte) *Dee
 // If the leaf may be updated (e.g. during a state transition fraud proof),
 // an updatable proof should be used. See SparseMerkleTree.ProveUpdatable.
 func (dsmst *DeepSparseMerkleSubTree) AddBranch(proof SparseMerkleProof, key []byte, value []byte) error {
-    result, updates := verifyProofWithUpdates(proof, dsmst.Root(), key, value, dsmst.th.hasher)
+    result, updates, valueHash := verifyProofWithUpdates(proof, dsmst.Root(), key, value, dsmst.th.hasher)
     if !result {
         return ErrBadProof
     }

+    if valueHash != nil {
+        if err := dsmst.values.Set(valueKey(key, valueHash), value); err != nil {
Member

Here's the first use, but conceptually is there a reason to map hash(key, hash(value)) -> value? Why not just hash(key) -> value (i.e. path -> value)? Mapping path -> value allows retrieval of the value knowing only the key, which is necessary for e.g. constant-time reads.

Contributor Author

With just key/path -> value, there is no way to retrieve values for past versions. I considered using root+key as @tzdybal suggested, but then lookup requires knowing the root at the value's last update. Seems we would need some sort of copy-on-write versioned backing store to make this efficient.

note - this is just key+hash(value) -> value, no extra hash.
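
The two indexing schemes under discussion can be contrasted with a minimal sketch. The helpers `pathKey` and `versionedKey` are hypothetical names chosen for illustration, and SHA-256 plus plain concatenation are assumptions; the PR's own `valueKey` helper is not shown here and may differ.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// pathKey (hypothetical) indexes a value by hash(key) only. There is one
// entry per key, so a read needs nothing but the key, and each update
// overwrites the previous value.
func pathKey(key []byte) []byte {
	h := sha256.Sum256(key)
	return h[:]
}

// versionedKey (hypothetical) indexes a value by key || hash(value).
// Every distinct value written under a key gets its own entry, so older
// values stay retrievable, but a read must already know the value hash.
func versionedKey(key, value []byte) []byte {
	h := sha256.Sum256(value)
	out := make([]byte, 0, len(key)+len(h))
	out = append(out, key...)
	return append(out, h[:]...)
}

func main() {
	key := []byte("testKey1")

	byPath := map[string][]byte{}
	byPath[string(pathKey(key))] = []byte("v1")
	byPath[string(pathKey(key))] = []byte("v2") // replaces v1

	byVersion := map[string][]byte{}
	byVersion[string(versionedKey(key, []byte("v1")))] = []byte("v1")
	byVersion[string(versionedKey(key, []byte("v2")))] = []byte("v2") // kept alongside v1

	fmt.Println("path -> value entries:", len(byPath))
	fmt.Println("key+hash(value) -> value entries:", len(byVersion))
}
```

With path -> value the store holds only the latest value per key; with key+hash(value) -> value it grows by one entry per distinct value until something prunes it.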

Member

Ah, that comment is outdated. Snapshotting and retrieving past versions is outside the scope of this library. It can be accomplished at the db level. See https://github.com/cosmos/cosmos-sdk/blob/6a5a2de798d4e61302783ae84ecfe2958c7088d5/docs/architecture/adr-040-storage-and-smt-state-commitments.md.

> Seems we would need some sort of copy-on-write versioned backing store to make this efficient.

Potentially yes, and that's an interesting direction, but as above it's outside the scope of this library. The caller can accomplish this by versioning the mapstores they pass in on SMT initialization.

TL;DR: use path -> value.

Contributor Author

> The caller can accomplish this by versioning the mapstores they pass in on SMT initialization.

This work is actually in support of that ADR - we'll be doing exactly that, so I'm basically indifferent to the indexing used here. I just didn't want to introduce breakage.

If just path is used, supporting past value access will be dropped, but I assume proofs for non-pruned roots should still be supported? Otherwise pruning effectively becomes the only behavior.

Member

> If just path is used, supporting past value access will be dropped, but I assume proofs for non-pruned roots should still be supported? Otherwise pruning effectively becomes the only behavior.

Good point. Proofs against past versions of the tree aren't needed in this library, since that can be accomplished caller-side. So, pruning should be the only behavior (making the prune flag unnecessary).

Member (@musalbas, Jun 21, 2021)

Proofs against old roots is a test case specified in the current version of this library. If this feature is to be removed because it's considered out of scope, we should be sure that this can easily be replicated caller-side with a KV store with snapshot support. What KV store supports this exactly? I checked RocksDB, but its snapshot feature is basically copying the entire DB directory, which clearly isn't efficient.

Or would the idea be that the caller can implement versioning themselves on top of any efficient KV store like this or this? If so, this doesn't seem to be suggested in the ADR, which seems to assume that the underlying KV store natively supports snapshotting.

Member

The ADR states that the state commitment (i.e. the SMT, i.e. this library) isn't responsible for snapshots and historical queries, but rather the state storage is. Note that the state storage won't be a raw KV store, but will be a wrapper over one or more raw KV stores with additional logic.

State Storage requirements:

  • range queries
  • quick (key, value) access
  • creating a snapshot
  • historical versioning
  • pruning (garbage collection)

State Commitment requirements:

  • fast updates
  • tree path should be short
  • pruning (garbage collection)

Member (@musalbas, Jun 21, 2021)

While there has been some benchmarking done on RocksDB etc. snapshots for the ADR, this isn't a widely supported feature among KV stores, and it's yet to be seen how well this will work in practice. RocksDB snapshots seem to be designed with the backup use case in mind; creating a new directory for every block could exhaust the inode limit on filesystems, given a chain with millions of blocks. BadgerDB seems to be more suitable as it's a key-value-version store, but this seems to be the only KV store that supports this.

This library should act as a general-purpose library. If someone wants to use this library for testing or research, for example, they shouldn't have to set up complex extra scaffolding - or be tied to BadgerDB (the only KV store that seems to support efficient versioning) - to get the basic feature of accessing previous roots.

For flexibility, this library should ideally support the following two extremes:

  • Prune everything.
  • Prune nothing.

Anything in between could be implemented by the caller-supplied KV store; in the simple case, this could mean delaying Del operations until e.g. 100 blocks have passed. This would allow us to keep the ProveForRoot etc. operations.

This could be implemented by using a path -> value KV for this PR, and making pruning optional. The path -> value store (for efficient read ops) would only be for the latest version of the tree. However, the SMT would still need to store valueHash -> value in the nodes MapStore, to allow accessing values for old roots when pruning is disabled. When GetForRoot is called, it would get the leaf by traversing the tree directly, rather than using the path -> value store.
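
The "delay Del operations" middle ground mentioned above can be sketched as a caller-side wrapper. Everything here is an assumption made for illustration: the `MapStore` interface is declared locally with a Get/Set/Delete shape, and `delayedDeleteStore`, its `Commit` method, and the block-height bookkeeping are hypothetical names, not part of this library.

```go
package main

import (
	"errors"
	"fmt"
)

// MapStore is the key-value interface assumed for this sketch.
type MapStore interface {
	Get(key []byte) ([]byte, error)
	Set(key, value []byte) error
	Delete(key []byte) error
}

var errKeyNotFound = errors.New("key not found")

// memStore is a trivial in-memory MapStore used only for the demo.
type memStore map[string][]byte

func (m memStore) Get(key []byte) ([]byte, error) {
	v, ok := m[string(key)]
	if !ok {
		return nil, errKeyNotFound
	}
	return v, nil
}
func (m memStore) Set(key, value []byte) error { m[string(key)] = value; return nil }
func (m memStore) Delete(key []byte) error     { delete(m, string(key)); return nil }

type pendingDelete struct {
	key    []byte
	height uint64 // block height at which the delete was requested
}

// delayedDeleteStore (hypothetical) queues deletions and applies them only
// once they are at least `delay` blocks old.
type delayedDeleteStore struct {
	MapStore
	delay   uint64
	height  uint64
	pending []pendingDelete
}

// Delete records the request instead of removing the key immediately.
func (s *delayedDeleteStore) Delete(key []byte) error {
	k := append([]byte(nil), key...) // copy: the caller may reuse the slice
	s.pending = append(s.pending, pendingDelete{key: k, height: s.height})
	return nil
}

// Commit advances the block height and flushes deletions that are old enough.
// The queue is ordered by height, so flushing stops at the first young entry.
func (s *delayedDeleteStore) Commit() error {
	s.height++
	n := 0
	for n < len(s.pending) && s.height-s.pending[n].height >= s.delay {
		if err := s.MapStore.Delete(s.pending[n].key); err != nil {
			s.pending = s.pending[n:]
			return err
		}
		n++
	}
	s.pending = s.pending[n:]
	return nil
}

func main() {
	s := &delayedDeleteStore{MapStore: memStore{}, delay: 2}
	_ = s.Set([]byte("node"), []byte("data"))
	_ = s.Delete([]byte("node")) // what pruning would call
	for i := 0; i < 2; i++ {
		_, err := s.Get([]byte("node"))
		fmt.Printf("height %d: readable=%v\n", s.height, err == nil)
		_ = s.Commit()
	}
	_, err := s.Get([]byte("node"))
	fmt.Printf("height %d: readable=%v\n", s.height, err == nil)
}
```

One limitation of this sketch: if a key is written again while its deletion is still queued, the delayed delete will remove the new entry too, so a real wrapper would need to cancel or reference-count pending deletions.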

Member (@musalbas, Jun 21, 2021)

Alternatively: this library could support "prune everything" only, but keep storing valueHash -> value in the nodes MapStore (until pruned).

It would then be trivial for the caller to implement a "prune nothing" with MapStore scaffolding that simply does nothing when Del is called. GetForRoot and ProveForRoot can then still work. However, those functions would only work for callers that have used a MapStore with this hack.

This is probably the better / easiest option.
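
A minimal sketch of the "MapStore that does nothing when Del is called" idea, assuming a Get/Set/Delete interface declared locally for the example. The names `keepAllStore` and `memStore` are hypothetical and not part of this library.

```go
package main

import (
	"errors"
	"fmt"
)

// MapStore is the key-value interface assumed for this sketch.
type MapStore interface {
	Get(key []byte) ([]byte, error)
	Set(key, value []byte) error
	Delete(key []byte) error
}

var errKeyNotFound = errors.New("key not found")

// memStore is a trivial in-memory MapStore used only for the demo.
type memStore map[string][]byte

func (m memStore) Get(key []byte) ([]byte, error) {
	v, ok := m[string(key)]
	if !ok {
		return nil, errKeyNotFound
	}
	return v, nil
}
func (m memStore) Set(key, value []byte) error { m[string(key)] = value; return nil }
func (m memStore) Delete(key []byte) error     { delete(m, string(key)); return nil }

// keepAllStore (hypothetical) wraps any MapStore and ignores deletions, so
// entries the tree tries to prune remain readable for old roots.
type keepAllStore struct{ MapStore }

// Delete is a deliberate no-op; Get and Set are promoted from the wrapped store.
func (keepAllStore) Delete(key []byte) error { return nil }

func main() {
	s := keepAllStore{MapStore: memStore{}}
	_ = s.Set([]byte("oldNode"), []byte("data"))
	_ = s.Delete([]byte("oldNode")) // what pruning would call
	v, err := s.Get([]byte("oldNode"))
	fmt.Println(string(v), err)
}
```

A tree that always prunes, handed a store like this, would behave as "prune nothing" from the caller's point of view, at the cost of unbounded storage growth.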

Member

Hmm, actually I just realised that valueHash -> value can't be stored in either proposal, due to the ambiguity issue when pruning. So we need to think of another way to implement pruning while ideally allowing for faster Get operations, without forcing people to use BadgerDB or implement their own versioning scheme, if they want to disable pruning.
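
One reading of the ambiguity mentioned above is that two different keys holding the same value share a single valueHash -> value entry, so pruning on behalf of one key removes the value for the other. The sketch below illustrates that reading only; `valueStore`, `put`, and `prune` are hypothetical names and SHA-256 is an assumption.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// valueStore (hypothetical) indexes values by hash(value) alone.
type valueStore map[[sha256.Size]byte][]byte

// put stores the value under its hash and returns that hash.
func (s valueStore) put(value []byte) [sha256.Size]byte {
	h := sha256.Sum256(value)
	s[h] = value
	return h
}

// prune removes the entry for a value hash, as an update or delete of the
// owning key would do.
func (s valueStore) prune(h [sha256.Size]byte) { delete(s, h) }

func main() {
	s := valueStore{}
	hA := s.put([]byte("sameValue")) // written for keyA
	hB := s.put([]byte("sameValue")) // written for keyB: same hash, same entry
	fmt.Println("shared entry:", hA == hB, "entries:", len(s))

	// keyA is updated, so its old value is pruned...
	s.prune(hA)

	// ...and keyB's value disappears with it.
	_, ok := s[hB]
	fmt.Println("keyB's value still present:", ok)
}
```

Including the key in the index (as key+hash(value) -> value does) or reference-counting entries are two ways a store could avoid this collision.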

+            return err
+        }
+    }
+
     // Update nodes along branch
     for _, update := range updates {
-        err := dsmst.ms.Set(update[0], update[1])
+        err := dsmst.nodes.Set(update[0], update[1])
         if err != nil {
             return err
         }
@@ -48,7 +49,7 @@ func (dsmst *DeepSparseMerkleSubTree) AddBranch(proof SparseMerkleProof, key []b
     // Update sibling node
     if proof.SiblingData != nil {
         if proof.SideNodes != nil && len(proof.SideNodes) > 0 {
-            err := dsmst.ms.Set(proof.SideNodes[0], proof.SiblingData)
+            err := dsmst.nodes.Set(proof.SideNodes[0], proof.SiblingData)
             if err != nil {
                 return err
             }
16 changes: 8 additions & 8 deletions deepsubtree_test.go
@@ -8,22 +8,22 @@ import (
 )

 func TestDeepSparseMerkleSubTreeBasic(t *testing.T) {
-    smt := NewSparseMerkleTree(NewSimpleMap(), sha256.New())
+    smt := NewSparseMerkleTree(NewSimpleMap(), NewSimpleMap(), sha256.New())

     smt.Update([]byte("testKey1"), []byte("testValue1"))
     smt.Update([]byte("testKey2"), []byte("testValue2"))
     smt.Update([]byte("testKey3"), []byte("testValue3"))
     smt.Update([]byte("testKey4"), []byte("testValue4"))
     smt.Update([]byte("testKey6"), []byte("testValue6"))

-    var originalRoot []byte
+    originalRoot := make([]byte, len(smt.Root()))
     copy(originalRoot, smt.Root())

     proof1, _ := smt.ProveUpdatable([]byte("testKey1"))
     proof2, _ := smt.ProveUpdatable([]byte("testKey2"))
     proof5, _ := smt.ProveUpdatable([]byte("testKey5"))

-    dsmst := NewDeepSparseMerkleSubTree(NewSimpleMap(), sha256.New(), smt.Root())
+    dsmst := NewDeepSparseMerkleSubTree(NewSimpleMap(), NewSimpleMap(), sha256.New(), smt.Root())
     err := dsmst.AddBranch(proof1, []byte("testKey1"), []byte("testValue1"))
     if err != nil {
         t.Errorf("returned error when adding branch to deep subtree: %v", err)
@@ -39,21 +39,21 @@ func TestDeepSparseMerkleSubTreeBasic(t *testing.T) {

     value, err := dsmst.Get([]byte("testKey1"))
     if err != nil {
-        t.Error("returned error when getting value in deep subtree")
+        t.Errorf("returned error when getting value in deep subtree: %v", err)
     }
     if !bytes.Equal(value, []byte("testValue1")) {
         t.Error("did not get correct value in deep subtree")
     }
     value, err = dsmst.Get([]byte("testKey2"))
     if err != nil {
-        t.Error("returned error when getting value in deep subtree")
+        t.Errorf("returned error when getting value in deep subtree: %v", err)
     }
     if !bytes.Equal(value, []byte("testValue2")) {
         t.Error("did not get correct value in deep subtree")
     }
     value, err = dsmst.Get([]byte("testKey5"))
     if err != nil {
-        t.Error("returned error when getting value in deep subtree")
+        t.Errorf("returned error when getting value in deep subtree: %v", err)
     }
     if !bytes.Equal(value, defaultValue) {
         t.Error("did not get correct value in deep subtree")
@@ -120,7 +120,7 @@ func TestDeepSparseMerkleSubTreeBasic(t *testing.T) {
 }

 func TestDeepSparseMerkleSubTreeBadInput(t *testing.T) {
-    smt := NewSparseMerkleTree(NewSimpleMap(), sha256.New())
+    smt := NewSparseMerkleTree(NewSimpleMap(), NewSimpleMap(), sha256.New())

     smt.Update([]byte("testKey1"), []byte("testValue1"))
     smt.Update([]byte("testKey2"), []byte("testValue2"))
@@ -130,7 +130,7 @@ func TestDeepSparseMerkleSubTreeBadInput(t *testing.T) {
     badProof, _ := smt.Prove([]byte("testKey1"))
     badProof.SideNodes[0][0] = byte(0)

-    dsmst := NewDeepSparseMerkleSubTree(NewSimpleMap(), sha256.New(), smt.Root())
+    dsmst := NewDeepSparseMerkleSubTree(NewSimpleMap(), NewSimpleMap(), sha256.New(), smt.Root())
     err := dsmst.AddBranch(badProof, []byte("testKey1"), []byte("testValue1"))
     if !errors.Is(err, ErrBadProof) {
         t.Error("did not return ErrBadProof for bad proof input")
19 changes: 8 additions & 11 deletions proofs.go
@@ -103,19 +103,20 @@ func (proof *SparseCompactMerkleProof) sanityCheck(th *treeHasher) bool {

 // VerifyProof verifies a Merkle proof.
 func VerifyProof(proof SparseMerkleProof, root []byte, key []byte, value []byte, hasher hash.Hash) bool {
-    result, _ := verifyProofWithUpdates(proof, root, key, value, hasher)
+    result, _, _ := verifyProofWithUpdates(proof, root, key, value, hasher)
     return result
 }

-func verifyProofWithUpdates(proof SparseMerkleProof, root []byte, key []byte, value []byte, hasher hash.Hash) (bool, [][][]byte) {
+func verifyProofWithUpdates(proof SparseMerkleProof, root []byte, key []byte, value []byte, hasher hash.Hash) (bool, [][][]byte, []byte) {
     th := newTreeHasher(hasher)
     path := th.path(key)

     if !proof.sanityCheck(th) {
-        return false, nil
+        return false, nil, nil
     }

     var updates [][][]byte
+    var memberValueHash []byte

     // Determine what the leaf hash should be.
     var currentHash, currentData []byte
@@ -126,7 +127,7 @@ func verifyProofWithUpdates(proof SparseMerkleProof, root []byte, key []byte, va
             actualPath, valueHash := th.parseLeaf(proof.NonMembershipLeafData)
             if bytes.Equal(actualPath, path) {
                 // This is not an unrelated leaf; non-membership proof failed.
-                return false, nil
+                return false, nil, nil
             }
             currentHash, currentData = th.digestLeaf(actualPath, valueHash)

@@ -135,13 +136,9 @@ func verifyProofWithUpdates(proof SparseMerkleProof, root []byte, key []byte, va
             updates = append(updates, update)
         }
     } else { // Membership proof.
-        valueHash := th.digest(value)
+        memberValueHash = th.digest(value)
+        currentHash, currentData = th.digestLeaf(path, memberValueHash)
         update := make([][]byte, 2)
-        update[0], update[1] = valueHash, value
-        updates = append(updates, update)
-
-        currentHash, currentData = th.digestLeaf(path, valueHash)
-        update = make([][]byte, 2)
         update[0], update[1] = currentHash, currentData
         updates = append(updates, update)
     }
@@ -162,7 +159,7 @@ func verifyProofWithUpdates(proof SparseMerkleProof, root []byte, key []byte, va
         updates = append(updates, update)
     }

-    return bytes.Equal(currentHash, root), updates
+    return bytes.Equal(currentHash, root), updates, memberValueHash
 }

 // VerifyCompactProof verifies a compacted Merkle proof.
14 changes: 7 additions & 7 deletions proofs_test.go
@@ -10,15 +10,15 @@ import (

 // Test base case Merkle proof operations.
 func TestProofsBasic(t *testing.T) {
-    var sm *SimpleMap
+    var smn, smv *SimpleMap
     var smt *SparseMerkleTree
     var proof SparseMerkleProof
     var result bool
     var root []byte
     var err error

-    sm = NewSimpleMap()
-    smt = NewSparseMerkleTree(sm, sha256.New())
+    smn, smv = NewSimpleMap(), NewSimpleMap()
+    smt = NewSparseMerkleTree(smn, smv, sha256.New())

     // Generate and verify a proof on an empty key.
     proof, err = smt.Prove([]byte("testKey3"))
@@ -123,8 +123,8 @@ func TestProofsBasic(t *testing.T) {

 // Test sanity check cases for non-compact proofs.
 func TestProofsSanityCheck(t *testing.T) {
-    sm := NewSimpleMap()
-    smt := NewSparseMerkleTree(sm, sha256.New())
+    smn, smv := NewSimpleMap(), NewSimpleMap()
+    smt := NewSparseMerkleTree(smn, smv, sha256.New())
     th := &smt.th

     smt.Update([]byte("testKey1"), []byte("testValue1"))
@@ -199,8 +199,8 @@ func TestProofsSanityCheck(t *testing.T) {

 // Test sanity check cases for compact proofs.
 func TestCompactProofsSanityCheck(t *testing.T) {
-    sm := NewSimpleMap()
-    smt := NewSparseMerkleTree(sm, sha256.New())
+    smn, smv := NewSimpleMap(), NewSimpleMap()
+    smt := NewSparseMerkleTree(smn, smv, sha256.New())
     th := &smt.th

     smt.Update([]byte("testKey1"), []byte("testValue1"))