Chore(rln-relay): better documentation for rln nim bindings #1016

Merged · 11 commits · Jun 29, 2022
40 changes: 31 additions & 9 deletions waku/v2/protocol/waku_rln_relay/rln.nim
@@ -23,44 +23,66 @@ elif defined(MacOsX):
type Buffer* = object
`ptr`*: ptr uint8
len*: uint

type Auth* = object
secret_buffer*: ptr Buffer
index*: uint
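
For illustration only (not part of this binding file): a minimal helper that wraps a Nim byte sequence in a Buffer might look like the sketch below. The Buffer is a non-owning view, so the wrapped bytes must stay alive while it is in use; the later usage sketches in this thread assume such a helper.

proc toBuffer(data: openArray[byte]): Buffer =
  ## hypothetical helper: non-owning view over `data`; assumes `data` is non-empty
  Buffer(`ptr`: cast[ptr uint8](unsafeAddr data[0]), len: uint(data.len))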

#------------------------------ Merkle Tree operations -----------------------------------------

proc update_next_member*(ctx: RLN[Bn256],
input_buffer: ptr Buffer): bool {.importc: "update_next_member".}
## input_buffer points to the id commitment byte seq
## the return bool value indicates the success or failure of the operation

proc delete_member*(ctx: RLN[Bn256], index: uint): bool {.importc: "delete_member".}
## index is the position of the id commitment key to be deleted from the tree
Contributor:
I would clarify that delete overwrites the leaf with the default zero leaf and that the indexes of the other leaves are kept unchanged.

Contributor Author:
Done in efdfad7

## the deleted id commitment key is replaced with a zero leaf
## the return bool value indicates the success or failure of the operation

proc get_root*(ctx: RLN[Bn256], output_buffer: ptr Buffer): bool {.importc: "get_root".}

## get_root populates the passed pointer output_buffer with the current tree root
## the output_buffer holds the Merkle tree root of size 32 bytes
## the return bool value indicates the success or failure of the operation
##
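
A minimal usage sketch of the Merkle tree operations above, assuming an already created rlnInstance: RLN[Bn256] (see new_circuit_from_params below) and the hypothetical toBuffer helper sketched near the Buffer type; all variable names are illustrative and not part of the bindings.

var idCommitment: array[32, byte]   # the member's id commitment key, filled elsewhere
var leaf = toBuffer(idCommitment)
if not rlnInstance.update_next_member(addr leaf):
  echo "could not insert the member into the tree"

var root: Buffer
if rlnInstance.get_root(addr root):
  # root.`ptr` now points to the 32-byte Merkle root held by the library
  var rootBytes: array[32, byte]
  copyMem(addr rootBytes[0], root.`ptr`, 32)

# delete the member at index 0; its leaf is overwritten with the zero leaf
if not rlnInstance.delete_member(0'u):
  echo "could not delete the member"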
#----------------------------------------------------------------------------------------------
#-------------------------------- zkSNARKs operations -----------------------------------------

proc key_gen*(ctx: RLN[Bn256], keypair_buffer: ptr Buffer): bool {.importc: "key_gen".}
## generates id key and id commitment key serialized inside keypair_buffer as | id_key <32 bytes> | id_commitment_key <32 bytes> |
Contributor:
You may add that id_commitment_key is the poseidon hash of id_key?

Contributor Author:
Sure, done in efdfad7

## id commitment is the poseidon hash of the id key
## the return bool value indicates the success or failure of the operation
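
A hedged sketch of key generation and of splitting the returned buffer according to the layout documented above; variable names are illustrative and rlnInstance is assumed to exist.

var keypair: Buffer
if rlnInstance.key_gen(addr keypair):
  var idKey, idCommitment: array[32, byte]
  copyMem(addr idKey[0], keypair.`ptr`, 32)
  copyMem(addr idCommitment[0],
          cast[ptr uint8](cast[uint](keypair.`ptr`) + 32), 32)
  # idCommitment (the Poseidon hash of idKey) is what gets inserted into the tree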

proc generate_proof*(ctx: RLN[Bn256],
input_buffer: ptr Buffer,
output_buffer: ptr Buffer): bool {.importc: "generate_proof".}
## input_buffer serialized as [ id_key<32> | id_index<8> | epoch<32> | signal_len<8> | signal<var> ]
Contributor:
Why is id_index 8 bytes long? Aren't 4 enough? (32 bytes * 2^(8*4) = 128 GiB of leaves.)
I guess it's based on the typical 8-byte usize on most 64-bit architectures, but on 32-bit architectures such a length won't make sense.

Contributor Author:

uint8 is the smallest among the integer types supported by Rust (and typically by other languages), and it also appeared to be large enough for the leaf index. Of course, something smaller than that would also work, but it requires a custom type definition.

Contributor:

uint8 is 1 byte, while here id_index is 8 bytes (at least in comment). Did you mean something else, e.g. uint32/uint64?

Contributor Author:

Right, I misinterpreted the sizes. The tree depth of 32 seems reasonable and large enough, and perhaps uint32 would also do the job. But considering that this data is not transmitted as part of the proof on the wire, it does not cost bandwidth. Would it be of any help to change it to uint32? If yes, then we can do it.

Contributor (@s1fr0, Jun 29, 2022):

The problem here is not about bandwidth but about the architecture: on 32-bit machines usize is 4 bytes, and you then need a safe conversion from u64 to usize in order to pass this to the RLN API. The other way around, i.e. when serializing the input, keeping the index at 8 bytes requires casting to a non-native type (u64) before serializing it. This extra overhead and these extra checks don't make sense here if 4 bytes out of 8 will always be 0. We can address this in a later PR.

Contributor Author:

I see. If the concern is about interaction with the Rust lib API, the rln lib and the Nim code will run and be compiled on the same machine with the same architecture, right? Then my understanding is that uint will be rendered with an identical size both by the Nim code and by the underlying lib, so no conversion would be needed. Or are you referring to some other issue?

Contributor:

The issue will then be in the deserialization: right now you read a fixed amount of 8 bytes (the common usize byte size on 64-bit architectures); to make what you describe work, you would first need to get the size of usize (4 or 8, but not necessarily) and then read that amount of bytes to deserialize the index. To me it makes sense to just work with u32.

## output_buffer holds the proof data and should be parsed as |proof<256>|root<32>|epoch<32>|share_x<32>|share_y<32>|nullifier<32>|
Contributor:

Unrelated to this PR, but the zk proof can be compressed to 128 bytes (only the x coordinates are kept for G1, G2, G3, plus 1 bit for the sign of y), so this will improve data communication.

Contributor Author (@staheri14, Jun 24, 2022):

Good point! But it comes with a trade-off: it would save us bandwidth while adding computation for the verifying peers.

Contributor:

The overhead is 3 square roots (i.e. 3 exponentiations) in Fr vs. 128 bytes of space per proof per node. While the proof expansion can be done in parallel with other proofs, the increased proof size will affect bandwidth and thus slow message propagation in the network. It's very unclear how the two interplay, though, so it's hard to give an opinion without proper benchmarks. We can just keep this in mind for the moment and do some tests later.

Contributor Author:

Sure, it makes sense to consider this as a potential optimization. Though I am not quite sure whether the extra bytes for the proof will cause any extra delay, the extra computation will certainly add overhead to the verification time and hence delay message propagation.

Contributor:

The delay would be mainly in bandwidth/data transfer if you want it to scale: 1M messages exchanged x 128 extra bytes ~ 128 MB ~ 6 s at 20 MB/s (optimistic or pessimistic?) per node. To be better than that, square roots in Fp would have to take around 200 ns each. We will see with the integration of zerokit whether we can get some benchmarks.

Contributor Author:

Thanks for the clarification. I understand your point, but that depends on the messaging rate of the application, which in your example is 1 million per second and may or may not be the case for all apps. Let's see what benchmarking tells us.

## sizes are in bytes
## integers wrapped in <> indicate value sizes in bytes
## the return bool value indicates the success or failure of the operation
##
proc verify*(ctx: RLN[Bn256],
proof_buffer: ptr Buffer,
result_ptr: ptr uint32): bool {.importc: "verify".}
## proof_buffer [ proof<256>| root<32>| epoch<32>| share_x<32>| share_y<32>| nullifier<32> | signal_len<8> | signal<var> ]
## the return bool value indicates the success or failure of the call to the verify function
## the result of the verification of the zk proof is stored in the value pointed by result_ptr, where 0 indicates success and 1 is failure
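
A hedged end-to-end sketch of proof generation and verification following the buffer layouts documented above. The little-endian encoding of id_index and signal_len, the stew/endians2 import, and all variable names are assumptions made for illustration; rlnInstance, idKey and toBuffer come from the earlier sketches.

import stew/endians2

let
  memberIndex = 0'u64
  signal = @[byte 1, 2, 3]
var epoch: array[32, byte]   # the current epoch, filled elsewhere

# [ id_key<32> | id_index<8> | epoch<32> | signal_len<8> | signal<var> ]
var proofInput: seq[byte]
proofInput.add(idKey)
proofInput.add(toBytesLE(memberIndex))
proofInput.add(epoch)
proofInput.add(toBytesLE(uint64(signal.len)))
proofInput.add(signal)

var
  inputBuf = toBuffer(proofInput)
  proof: Buffer
if not rlnInstance.generate_proof(addr inputBuf, addr proof):
  echo "proof generation failed"

# verify expects [ proof<256> | root<32> | epoch<32> | share_x<32> | share_y<32> | nullifier<32> | signal_len<8> | signal<var> ];
# the first 416 bytes are exactly what generate_proof returned, so append the signal
var verifyInput = newSeq[byte](int(proof.len))
copyMem(addr verifyInput[0], proof.`ptr`, int(proof.len))
verifyInput.add(toBytesLE(uint64(signal.len)))
verifyInput.add(signal)

var
  verifyBuf = toBuffer(verifyInput)
  verified: uint32
if rlnInstance.verify(addr verifyBuf, addr verified):
  if verified == 0:
    echo "valid proof"
  else:
    echo "invalid proof"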


#----------------------------------------------------------------------------------------------
#-------------------------------- Common procedures -------------------------------------------

proc new_circuit_from_params*(merkle_depth: uint,
parameters_buffer: ptr Buffer,
ctx: ptr RLN[Bn256]): bool {.importc: "new_circuit_from_params".}

## creates an instance of rln object as defined by the rln lib https://github.com/kilic/rln/blob/7ac74183f8b69b399e3bc96c1ae8ab61c026dc43/src/public.rs#L48
## merkle_depth represents the depth of the Merkle tree
## parameters_buffer holds prover and verifier keys
Contributor:

To be precise, parameters_buffer here corresponds to the user's proving key. The proving key, in turn, contains a copy of the verifying key.

Contributor Author:

Curious how this distinction would matter? In my view, it generates a CRS that embodies both the proving keys and the verification keys.

Contributor:

It doesn't matter indeed; it is just a clarification: the proving key simply contains the verification key. IMO an RLN-relay node doesn't need to generate proofs but only to verify them, so in some circumstances (as an optimization?) it might make sense to separate the two (proving keys are bigger than verification keys).

Contributor Author:

Proof generation is needed as part of message publishing; an RLN-relay node has both capabilities, i.e., relaying and publishing.

## ctx holds the final created rln object
## the return bool value indicates the success or failure of the operation
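
A hedged sketch of creating the rln object: the parameters file name, the tree depth of 32, and the assumption that RLN[Bn256] (declared elsewhere in this module) can be instantiated as a local variable and passed by address are all illustrative.

let paramsStr = readFile("parameters.key")   # prover parameters (verifier key embedded)
var paramsBytes = newSeq[byte](paramsStr.len)
copyMem(addr paramsBytes[0], unsafeAddr paramsStr[0], paramsStr.len)

var
  paramsBuf = toBuffer(paramsBytes)
  rlnInstance: RLN[Bn256]
if not new_circuit_from_params(32'u, addr paramsBuf, addr rlnInstance):
  echo "could not create the rln instance"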


proc hash*(ctx: RLN[Bn256],
inputs_buffer: ptr Buffer,
output_buffer: ptr Buffer): bool {.importc: "signal_to_field".}
## as explained in https://github.com/kilic/rln/blob/7ac74183f8b69b399e3bc96c1ae8ab61c026dc43/src/public.rs#L135, it hashes (sha256) the plain text supplied in inputs_buffer and then maps it to a field element
## this proc is used to map arbitrary signals to a field element for the sake of proof generation
## inputs_buffer holds the hash input as a byte seq
## the hash output is generated and populated inside output_buffer
## the output_buffer contains 32 bytes hash output
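
A hedged sketch of mapping an arbitrary signal to a field element before proof generation; all names, including the toBuffer helper, are illustrative.

var
  signal = @[byte 1, 2, 3]
  signalBuf = toBuffer(signal)
  hashOut: Buffer
if rlnInstance.hash(addr signalBuf, addr hashOut):
  # hashOut.`ptr` points to the 32-byte field element produced by the library
  var hashedSignal: array[32, byte]
  copyMem(addr hashedSignal[0], hashOut.`ptr`, 32)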

{.pop.}