-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add prehash_compare_key
to allow proving nonexistence in sparse trees
#136
feat: Add prehash_compare_key
to allow proving nonexistence in sparse trees
#136
Conversation
prehash_compare_key indicates whether to compare the keys lexicographically according to their _hashed_ values (implied by the hash function given by prehash_key). This is required for nonexistence proofs in proof specs that use prehashing.
prehash_compare_key
to allow proving nonexistence in sparse treesprehash_compare_key
to allow proving nonexistence in sparse trees
Thank you for including the justifications to include this change over #88, all of them make sense to me. Will review today! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with this direction!! But it looks like this PR is incomplete?
My understanding is that the key in ExistenceProof.Key
will still be the preimage, hence why you also hash the rightKey and leftKey before comparison.
However, you are not hashing the passed-in key for existence proofs before comparison?
There is a check on ics23.go that is changed in #88 that seems like it would also need to be changed here.
473d9a5#diff-39a55415fc38a90b85c05989f293fda3b7ee126010cfe63838fd9b8441e47ed1R39
473d9a5#diff-39a55415fc38a90b85c05989f293fda3b7ee126010cfe63838fd9b8441e47ed1R59
I've also suggested a different naming for the field that at least me and Colin found more intuitive
proto/cosmos/ics23/v1/proofs.proto
Outdated
// prehash_compare_key is a flag that indicates whether to use the | ||
// prehash_key specified by LeafOp to compare lexical ordering of keys for | ||
// non-existence proofs. | ||
bool prehash_compare_key = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a clearer name might be helpful here similar to our review of #88.
So effectively the database is providing an interface to the application: store.Set(key, value)
and underneath the hood it is merklizing this pair by storing the hash of the application-provided key as the key in the merkle tree.
Thus, we liked the nomenclature: appKey
and treeKey
.
The appKey
is the key known to the application, the treeKey
is the key stored in the tree.
In SMT, the treeKey is the hash of the appKey, while in iAVL they are the same.
So I think prehash_app_key
would be clearer here as a name. And we can use the docs here to explain there may be a difference between appKey and treeKey for some trees
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit of a bikeshed, but if there's going to be a distinction between app_key
and tree_key
, wouldn't it be even clearer to call the field compare_tree_key
? Then compare_tree_key = false
means it compares the app key, and compare_tree_key = true
means it compares the tree key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok now that I understand more of this field.
Perhaps we can call it: prehash_key_before_comparison
The way I read the current field name at the moment, I expect it to be a HashOp type not a boolean.
cc: @colin-axner
@@ -204,6 +204,14 @@ func (p *ExistenceProof) CheckAgainstSpec(spec *ProofSpec) error { | |||
return nil | |||
} | |||
|
|||
func keyForComparison(spec *ProofSpec, key []byte) []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also will have to use this function on getExistProofForKey
and getNonexistProofForKey
correct?
473d9a5#diff-39a55415fc38a90b85c05989f293fda3b7ee126010cfe63838fd9b8441e47ed1R39
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but only on getNonexistProof
, because the lexical comparison is only relevant for non-existence proofs
Thanks for the careful review @AdityaSripal!
In existence proofs, the only comparison is for equality, not for lexical ordering, which means that comparing either the preimage of the hashed key, or the hash, will work equally well (because we can assume that By contrast, nonexistence proofs require lexical ordering comparison on keys (or their hashes), which this PR implements, for nonexistence proofs only.
Only one of the two links is relevant to the changes we propose here: we do not change anything about existence proofs, so the change noted from #88 to existence proofs is not relevant:
In regards to the second piece, the below line of code adds an extra layer of hashing to the key at the top level, which is not necessary or sufficient to verify nonexistence proofs:
Instead, our implementation exhibits the same prehashing behavior in Go as the original Rust implementation: it changes the lexical ordering comparisons
We're fine with whatever naming works well for you, provided it is well-documented. We are partial to A high-level note on where some of this confusion may originate: there are two notions of "prehashed key comparison", as below:
It's worth keeping in mind that you can still add something like #88 on top of this change, but it's not necessary to do so in order to unblock the ability to use ICS23 nonexistence proofs with sparse merkle trees, just as you already can correctly use ICS23 existence proofs with sparse merkle trees. We suspect that #88 is not as useful once this PR is merged, because we think (but are not certain) that its originating motivation came from an attempt to work around the very bug that this PR resolves. |
proto/cosmos/ics23/v1/proofs.proto
Outdated
// prehash_compare_key is a flag that indicates whether to use the | ||
// prehash_key specified by LeafOp to compare lexical ordering of keys for | ||
// non-existence proofs. | ||
bool prehash_compare_key = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok now that I understand more of this field.
Perhaps we can call it: prehash_key_before_comparison
The way I read the current field name at the moment, I expect it to be a HashOp type not a boolean.
cc: @colin-axner
if !spec.PrehashCompareKey { | ||
return key | ||
} | ||
hash, _ := doHashOrNoop(spec.LeafSpec.PrehashKey, key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is using the PrehashKey defined in the leafspec and the key in the existenceProof is supposed to now be unhashed for SMT proofs, I think the SMT spec needs to be changed to have PrehashKey=SHA_256
instead of NO_HASH
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the SMT should mirror the JMT's proof spec:
https://github.com/penumbra-zone/jmt/blob/main/src/tree/ics23_impl.rs#L219
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right so the SMT proof spec in this repo need to be changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code that generates the proofs will also need to be updated following the spec change
Backwards compatibility nonwithstanding, it seems like applications (if they can) should always set |
@plaidfinch Thanks for this! I can tell you guys have thought this through. I think that this solution should work for us as well. |
I don't see how this is true. If the key is being hashed in order to create the LeafHash, but the key is ordered lexicographically on the key itself (IAVL, Trie, etc), then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack pending test fixes.
Great work to everyone involved!!
Yes, the only reason to have a flag in this case is to maintain strict backwards-compatibility in my opinion. If you have Where the change would be breaking without the flag is in the case where you use |
I've updated this code to include the changes for the |
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## master #136 +/- ##
===========================================
+ Coverage 39.35% 50.54% +11.19%
===========================================
Files 16 23 +7
Lines 6286 8034 +1748
Branches 85 86 +1
===========================================
+ Hits 2474 4061 +1587
- Misses 3456 3616 +160
- Partials 356 357 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 3 files with indirect coverage changes Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK for the concept and the go code changes. I like this solution a lot! Excellent work y'all :)
@AdityaSripal We've fixed the SMT proof spec in this PR, and we are happy to fix the SMT test vectors, but we are not sure how these test vectors are being generated. Could you shed some light on that so we can finish fixing the tests and push this PR over the finish line? As of now, we think we've tracked down how they're being generated to here. Can you confirm that this was the method used? |
i updated the smt test vectors to be correctly generated per this change, and they should verify now. i generated the test vectors using a modified version of the smt in the I believe that with the most recent changes, all the review comments have been addressed |
Hi yes, please make a PR to the SDK to update the proofgen code there |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wahoo! Fantastic work to everyone involved! 🎉
Nice job updating the test vectors! I wasn't sure either how the test data was generated
(approving for go changes)
Would it be possible to eventually have test data vectors from Penumbra's JMT? It appears the referenced code generation has been removed from the SDK and I'm not sure the SMT implementation is being maintained |
We could make some test vectors, yeah. If we match the ad-hoc JSON serialization format used by the SMT generation code, then I think it'd be as simple as checking proofs against them using the JMT spec instead of the SMT spec. Would you want these test vectors included in this PR before merging it, or should we make a separate PR with them? It'd require a bit of implementation work to add the code to the JMT to make it spit out the test vectors, so my personal preference would be to merge this PR and make a separate PR later to swap in the JMT test vectors for the SMT ones. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust version looks good! I just left one question regarding the removal of the derived Eq
instance and a small nit, feel free to ignore.
Let's do a separate pr 👍 I don't see a rush in adding the test vectors, just want to make sure that long term ics23 has an up to date testing framework/tests 😄 |
The failing check appears to be a spurious service error for the Go code coverage tool. Could someone re-run it please? If I understand correctly, this branch is ready to be merged, and a new release cut, at this point. Anything else we can help out with? |
All good on the Rust side! I can do a release of the Rust crate on Tuesday if nobody beats me to it. |
Fantastic! Thanks all for your help bringing this to the finish line! 🎉 |
Hey, just checking in on this -- are we still good to merge this PR and cut a release? |
Support for SMTs is going to go a long way. Thanks to everyone involved here! |
Motivation
There are several important merkle trees that hash keys and sparsely store values at positions indicated by the value of the hash, such as the Cosmos Sparse Merkle Tree and Libra/Diem/Aptos/Penumbra's Jellyfish Merkle Tree. We would like this type of tree to be compatible with ICS23.
For existence proofs, this is already the case; however, currently, when verifying a ICS23 non-existence proof, keys are compared based on the preimage of the hash function used to prehash them, even when
prehash_key
is set in theProofSpec
. This means that when aProofSpec
usesprehash_key
, nonexistence proofs cannot be verified even if a client correctly generates them as per the "spirit" of the ICS23 nonexistence proof format. We believe that this is a bug, but this change is framed backwards-compatibly, as noted below, so that it can be applied universally without friction.See here for other discussion of the impact of this issue:
Summary of changes
This PR introduces one single boolean new parameter to the top-level
ProofSpec
:prehash_compare_key
. When set totrue
, this flag causes keys to be consistently compared lexicographically according to their hashes within nonexistence proof verification, using the same hash function as specified by the already-extantprehash_key
field.This is intended as an alternative to #88
We believe this change is the minimum necessary to unblock nonexistence proofs on key-prehashed structures. #88 also attempts to solve this problem, namely that for nonexistence proofs only, any tree that uses prehashing cannot prove nonexistence because the nonexistence proof verifier compares the keys lexically by their preimage, ignoring the
prehash_key
field of the specification.However, as currently implemented, #88 does not accomplish this goal. There are 3 issues with #88 as a solution to this problem, which this PR addresses:
prehash_compared_value
field, which is not necessary to fix this specific issue, and like itsprehash_compared_key
field, this too is a specification of an arbitrary hash function, which for the same reasons above, we believe is overly general.Backwards-compatibility
This is a backwards-compatible change, as it requires opt-in via setting the
prehash_compare_key
flag totrue
in theProofSpec
. All existingProofSpec
s will continue to behave identically.Contents
This PR includes implementation for Rust, Go, and Typescript. We have tested it against our own implementation of the Jellyfish Merkle Tree and we believe it should be effective at addressing this issue across the ecosystem.
Thanks
We would like to thank everyone who participated in discussion and development of #88 and other work towards solutions to this issue. We feel very confident that this is the right way forward, but we want to ensure that our contribution does not make anyone feel like their work is unappreciated; to the contrary, the discussion and work leading up to this PR, by everyone, has been necessary to clarify our understanding of the issue. Thanks everyone for your help!