Skip to content
This repository has been archived by the owner on Feb 27, 2023. It is now read-only.

Automatic removal of orphan nodes #36

Closed

Conversation

roysc
Copy link
Contributor

@roysc roysc commented Jun 18, 2021

Completes work from #24 :

  • implement the actual removing of orphans
  • more extensive testing (update/delete operations, tree with multiple entries)

sha256(testKey)  = 0001...
sha256(testKey2) = 1000...
sha256(foo)      = 0010...
no case produces nil sideNodes; nothing modifies the stored hashes
avoids ambiguity when pruning values
@musalbas
Copy link
Member

Thanks for continuing this work and this PR!

Changing the leaf value (341c998) seems to a breaking change. Is this necessary to implementing orphan removal, and why? There already shouldn't be any ambiguity about which key a leaf is for, as it's already prefixed in the leaf's value.

@roysc
Copy link
Contributor Author

roysc commented Jun 18, 2021

@musalbas sorry, my commit message wasn't very clear. The problem only arises when deleting a leaf with a duplicate value - there's no way to know if a value is still referenced by another leaf. Another solution would be some kind of reference count.

(this test case fails when using just hash(value) - https://github.com/vulcanize/smt/blob/341c9985ac9bf4b32561c6f9ba94f71c545789af/smt_test.go#L633)

@adlerjohn
Copy link
Member

adlerjohn commented Jun 19, 2021

@roysc The implementation storing the hash of the value for the leaf value is an implementation detail. It's in no way required. You could instead store values in the separate map hash(key) -> value, as in #14 (comment) (or better yet in the separate map key -> value). Breaking the commitment specification to workaround a failing test is not the proper solution.

I suggest you start by refactoring as per #14, then pruning will follow readily. Alternatively, implement the reference counting scheme.(Actually don't bother implementing the reference counting scheme because it'll be exceedingly wasteful compared to the direct mapping of key -> value which is needed anyways for constant-time reads).

Feel free to re-open this PR or a new one if you can get pruning working without modifying the current test suite (unless there's a bug in them of course).

@adlerjohn adlerjohn closed this Jun 19, 2021
@roysc
Copy link
Contributor Author

roysc commented Jun 19, 2021

@adlerjohn my apologies for not respecting the commitment spec. Using key -> value is a simple change; but if this mapping is stored in the same DB, would there be any concern about malicious users intentionally storing a plain key that collides with an existing tree hash? Or are you suggesting using a separate DB instance for this?

And could you clarify what you mean by modifying the current test suite? Are you referring only to the changes in the commitment spec, or changes in the LazyLedger/Celestia core? I haven't modified any existing tests here.

@adlerjohn
Copy link
Member

adlerjohn commented Jun 19, 2021

Or are you suggesting using a separate DB instance for this?

Yes, the suggestion is a separate KV store just for values. There's no inherent reason to put both values and nodes in the same KV store.

And could you clarify what you mean by modifying the current test suite? Are you referring only to the changes in the commitment spec

Yes, to clarify I mean that the commitment as specified here must be the same before and after any PR dealing with implementation details.

@roysc
Copy link
Contributor Author

roysc commented Jun 20, 2021

I'm not able to reopen this PR, new one: #37

@musalbas musalbas mentioned this pull request Jun 21, 2021
5 tasks
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants