Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Zerokit RLN module in nwaku #1030

Closed
wants to merge 30 commits into from
Closed

Use Zerokit RLN module in nwaku #1030

wants to merge 30 commits into from

Conversation

s1fr0
Copy link
Contributor

@s1fr0 s1fr0 commented Jun 30, 2022

This PR replaces Kilic's RLN implementation with Zerokit RLN module in nwaku.

Minor changes to the nim wrappers were required, as well us update of constants in tests.

EDIT: we decided to have a compilation flag to switch between the two RLN implementation, at least until RLN-Relay gets extensively tested. Note that Zerokit RLN implementation and the nwaku wrappers/tests contained in this PR implement the application-specific rlnIdentifier as described in 17/WAKU2-RLNRELAY, while current master does not (and such rlnIdentifier logic is not executed when Kilic's RLN module is selected at compilation time).

@s1fr0 s1fr0 requested a review from staheri14 June 30, 2022 16:37
@s1fr0 s1fr0 requested a review from oskarth June 30, 2022 16:39
@s1fr0
Copy link
Contributor Author

s1fr0 commented Jun 30, 2022

Tests failures are affected by zerokit RLN slow circuit loading, bringing total execution time to exceed 60 minutes: this issue is tracked in vacp2p/zerokit#23.

@status-im-auto
Copy link
Collaborator

status-im-auto commented Jun 30, 2022

Jenkins Builds

Click to see older builds (2)
Commit #️⃣ Finished (UTC) Duration Platform Result
765e8c5 #1 2022-06-30 21:59:50 ~15 min windows 📄log
765e8c5 #1 2022-07-01 03:53:10 ~6 min macos 📄log
Commit #️⃣ Finished (UTC) Duration Platform Result
80dddff #2 2022-07-06 21:58:17 ~14 min windows 📄log
80dddff #2 2022-07-07 03:52:15 ~5 min macos 📄log
80dddff #1 2022-07-22 11:57:28 ~7 min linux 📄log
✔️ 32bea0d #3 2022-07-31 04:04:45 ~17 min macos 📦bin

@staheri14
Copy link
Contributor

Thanks @s1fr0 for the PR, that's great to see that zerokit is ready! 👍
May I ask please instead of replacing the current rln-lib, to move the zero-kit part under a compiler flag so that we can switch between the two? The reason is that rln-relay onchain testnet is almost ready for launch, and by having zero-kit integrated it becomes very hard to perform debugging during dogfooding. As soon as the dogfooding finishes and we make sure the onchain rln-relay part is solid, then we can integrate zero-kit and do another test.
cc: @oskarth

@s1fr0
Copy link
Contributor Author

s1fr0 commented Jul 5, 2022

@staheri14 Thank you Sanaz! Adding a compilation flag for zerokit RLN is not a problem, but rather the fact that APIs slightly differ and hence are not fully interchangeable. AFAIK, or either the two waku sources (interfacing, respectively, with Kilic's and zerokit's RLN libs) live in parallel and a compilation flag selects which one will be compiled (but lots of code will be duplicated), or we can have many when defined statements for each group of changes.
If we want to keep zerokit in beta for a little while, probably the first solution is the cleanest (e.g. move sources to a subfolder like zerokit-rln-integration) and easier to maintain with respect to any eventual update to current waku-rln-relay implementation (i.e., based on Kilic's RLN)

Copy link
Contributor

@oskarth oskarth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one!

# and run the following command in the root directory of the cloned project
# cargo run --example export_test_keys
# the file is generated separately and copied here
parameters = readFile("waku/v2/protocol/waku_rln_relay/parameters.key")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come this isn't needed anymore?

@@ -371,7 +357,7 @@ suite "Waku rln relay":
hashSuccess
let outputArr = cast[ptr array[32, byte]](outputBuffer.`ptr`)[]
check:
"efb8ac39dc22eaf377fe85b405b99ba78dbc2f3f32494add4501741df946bd1d" ==
"4c6ea217404bd5f10e243bac29dc4f1ec36bf4a41caba7b4c8075c54abb3321e" ==
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea why these are different? (Not saying they shouldn't be, just wondering if we understand reason)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the main reason is that Kilic's library is internally using Bn256, while in zerokit RLN I use circom's Bn254.

## index is the position of the id commitment key to be deleted from the tree
## the deleted id commitment key is replaced with a zero leaf
## the return bool value indicates the success or failure of the operation

proc get_root*(ctx: RLN[Bn256], output_buffer: ptr Buffer): bool {.importc: "get_root".}
proc get_root*(ctx: ptr RLN, output_buffer: ptr Buffer): bool {.importc: "get_root".}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come this is difference?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current RLN object instantiation is not parametrized by the curve equation (it's hardcoded to Bn254), but this will likely change in the future. Passing the pointer rather than the object is much more efficient and makes more sense IMO since its instantiation and processing should be managed by the RLN library and the object should not be copied around by nim implementation (also the pointer is what is returned by the new_circuit call).

## get_root populates the passed pointer output_buffer with the current tree root
## the output_buffer holds the Merkle tree root of size 32 bytes
## the return bool value indicates the success or failure of the operation
##

proc get_merkle_proof*(ctx: ptr RLN, index: uint, output_buffer: ptr Buffer): bool {.importc: "get_proof".}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably keep this as qualified merkle proof, as "get proof" could be a ZK proof

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is fine like this, or you mean changing the zerokit RLN api name to get_merkle_proof too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_merkle_proof

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This nim wrapper is already named get_merkle_proof and with this name the API is used internally in nwaku. If you mean to change zerokit RLN public API name, this has to be addressed with an issue in vacp2p/zerokit.

input_buffer: ptr Buffer,
output_buffer: ptr Buffer): bool {.importc: "generate_rln_proof".}
## input_buffer has to be serialized as [ id_key<32> | id_index<8> | epoch<32> | signal_len<8> | signal<var> ]
## output_buffer holds the proof data and should be parsed as [ proof<128> | share_y<32> | nullifier<32> | root<32> | epoch<32> | share_x<32> | rln_identifier<32> ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come proof size is different?I find the order of share_/share_y/nullifier a bit odd, any particular reason for changing this order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proof is 128 bytes since it's compressed: 254 bits for each X coordinate of the 3 proof points + 1 bit for the sign of the y coordinate = 32 bytes per coordinate x 4 coordinates (2x1 over G1, 1x2 over G2) = 128 bytes. Compressed proofs are the default returned by ark-circom create_random_proof_with_reduction and verify_proof, so to make the API the same I should have gone through a decompression and re-compression adding useless overhead. There is probably a way to directly process proofs in uncompressed form (to avoid square roots operation to recover the right y coordinate), but here we discussed how we can benchmark if this makes actually sense.

The comment is wrong with respect to the implementation: the current version is
[ proof<128> | root<32> | epoch<32> | share_x<32> | share_y<32> | nullifier<32> | rln_identifier<32> ]

The reason for such order of variables was in order to follow the order of the circuit's variables, cf. https://github.com/privacy-scaling-explorations/rln/blob/616ee9b0b085bdf14e7f39df884496b8e77ddc2f/circuits/rln.circom#L5 and https://github.com/privacy-scaling-explorations/rln/blob/616ee9b0b085bdf14e7f39df884496b8e77ddc2f/circuits/rln-base.circom#L60-L62. For serialization/deserialization purposes, this order makes more sense to me since this is the order parameters should be passed to the verify_proof function. However, ultimately, I didn't want to change too much nwaku code, so I ultimately decided the keep the previously implemented order and add rln_identifier at the end.
Can be easily changed though, since the zerokit witness structure abstracts the fields order: it is just a matter of redefining the serialization/deserialization order.

Comment updated in 80dddff

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay for compressed proofs!

@@ -55,6 +52,8 @@ type RateLimitProof* = object
## nullifier enables linking two messages published during the same epoch
## see details in https://hackmd.io/tMTLMYmTR5eynw2lwK9n1w?view#Nullifiers
nullifier*: Nullifier
## Application specific RLN Identifier
rlnIdentifier*: RlnIdentifier
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, we should probably add this to RLN relay spec @staheri14

@@ -98,6 +97,9 @@ const
ETH_CLIENT* = "ws://localhost:8540/"

const
# The relative folder where the circuit, proving and verification key for RLN can be found
# Note that resources has to be compiled with respect to the above MERKLE_TREE_DEPTH
RLN_RESOURCE_FOLDER* = "vendor/zerokit/rln/resources/tree_height_" & $MERKLE_TREE_DEPTH & "/"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come we need this now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to not have duplicates of the circuit in nwaku, I thought that it is better to point to zerokit resource folder directly. Since nwaku is calling zerokit API from a different path than the latter, the resource folders should be passed to correctly locate the circuit (i.e., this string cannot be hardcoded in zerokit)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I don't know how exactly but I suspect this might fail when we do distribution. In general vendoring and static resources should be kept separate. Perhaps OK for initial integration where we compile stuff ourselves.

cc @jm-clius

@@ -362,6 +368,7 @@ proc encode*(nsp: RateLimitProof): ProtoBuffer =
output.write3(4, nsp.shareX)
output.write3(5, nsp.shareY)
output.write3(6, nsp.nullifier)
output.write3(7, nsp.rlnIdentifier)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requires spec change I believe

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracking issue: vacp2p/rfc#523


return ok(output)

proc serialize(proof: RateLimitProof, data: openArray[byte]): seq[byte] =
## a private proc to convert RateLimitProof and data to a byte seq
## this conversion is used in the proof verification proc
## the order of serialization is based on https://github.com/kilic/rln/blob/7ac74183f8b69b399e3bc96c1ae8ab61c026dc43/src/public.rs#L205
## [ proof<256>| root<32>| epoch<32>| share_x<32>| share_y<32>| nullifier<32> | signal_len<8> | signal<var> ]
## [ proof<128> | root<32> | epoch<32> | share_x<32> | share_y<32> | nullifier<32> | rln_identifier<32> | signal_len<8> | signal<var> ]
let lenPrefMsg = appendLength(@data)
var proofBytes = concat(@(proof.proof),
@(proof.merkleRoot),
@(proof.epoch),
@(proof.shareX),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This order of share x / y / nullifier makes more sense to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oskarth oskarth requested a review from jm-clius July 5, 2022 11:49
@oskarth
Copy link
Contributor

oskarth commented Jul 5, 2022

@jm-clius re gracefully integrating this option

Switching between:

  • Nothing
  • RLN kilic
  • Zerokit RLN seems reasonable short term to me

Once we know Zerokit RLN works correctly and with OK perf we can replace kilic version

if (pbytes.len == 0):
debug "error in parameters.key"
return err("error in parameters.key")
resourcesPathBuffer = RLN_RESOURCE_FOLDER.toOpenArrayByte(0, RLN_RESOURCE_FOLDER.high).toBuffer()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd do error handling on this, assume rln resource folder can't be found / is null

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling is done in zerokit RLN (cf. 1, 2, 3) but we might add an extra error handling layer in nwaku as well if better.

Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, but I assume that, as discussed during offsite, you want to maintain both Kilic and Zerokit implementations until after the testnet? My suggestion would be to simply split the relevant RLN modules into the "old" implementation and the "new" implementation using compiler definitions (when defined...). Should be a simple copy-paste operation - the result will be ugly, but we can mitigate it a bit by clearly separating the two implementations per module using comments/dividers. Hopefully this dual-impl will live only for a short time!

check:
(await completionFut.withTimeout(10.seconds)) == true
(await completionFut.withTimeout(200.seconds)) == true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just double checking that 200 is indeed necessary. :) Any idea what the usual successful loading time is, more or less? I know the circom loading takes a long time, but waiting > 3 mins for the test to fail could be frustrating during regular CI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary anymore with zerokit rln perfomance fixes.

@jm-clius
Copy link
Contributor

@jm-clius re gracefully integrating this option

Switching between:

  • Nothing
  • RLN kilic
  • Zerokit RLN seems reasonable short term to me

Once we know Zerokit RLN works correctly and with OK perf we can replace kilic version

As commented above, I think separating affected modules between "old" and "new" implementations using compiler definitions seem the fastest, easiest-to-maintain option to me. Only if it's short-lived, of course. :D

@s1fr0
Copy link
Contributor Author

s1fr0 commented Jul 26, 2022

@jm-clius @oskarth @staheri14 what would be the default for CI then? If Kilic's RLN, zerokit integration does not get automatically tested; if, instead, by default we set zerokit, then this might create problems in next stages of RLN testing.

For this reason I think the best would be to have a subfolder zerokit in waku/v2/protocol/waku_rln_relay with the appropriate edited sources for zerokit and add a when defined(zerokit) in all_tests_v2.nim to load a separate test_waku_rln_relay_zerokit.nim file.

In this way both implementations can be tested independently.

@s1fr0 s1fr0 added the track:rln RLN Track (Secure Messaging/Applied ZK), e.g. relay and applications label Jul 26, 2022
@oskarth
Copy link
Contributor

oskarth commented Aug 1, 2022

@staheri14 would you able to review this PR? It has been awaiting your review for a while now I believe.

(I requested a review from Ksr for general code integration, assuming Hanno/Lorenzo won't be back soon)

@oskarth oskarth removed request for oskarth and jm-clius August 1, 2022 05:50
Copy link
Contributor

@kaiserd kaiserd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from an integration point of view. I do not have enough background to give feedback / evaluate details.
Some parts are raw, but as the comments suggest, this will be addressed once the move to zerokit has been completed.

contract(MembershipContract):
proc register(pubkey: Uint256) # external payable
proc MemberRegistered(pubkey: Uint256, index: Uint256) {.event.}
# proc registerBatch(pubkeys: seq[Uint256]) # external payable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will these be implemented later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This is a question for @staheri14. I just made test_waku_rln_onchain.nim to load and correctly process zerokit RLN wrappers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but right now, there is no actual use case for them.

@staheri14
Copy link
Contributor

staheri14 commented Aug 2, 2022

@staheri14 would you able to review this PR? It has been awaiting your review for a while now I believe.

(I requested a review from Ksr for general code integration, assuming Hanno/Lorenzo won't be back soon)

Based on the conversation, I was under the impression that the PR is still in progress. Doing review now.

Copy link
Contributor

@staheri14 staheri14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left my comments, the major issue is with all the files duplicated from the original rln-relay module which is unnecessary. The only thing that needs to be duplicated (if any) is the rln.lib which contains the binding for a different c lib.

contract(MembershipContract):
proc register(pubkey: Uint256) # external payable
proc MemberRegistered(pubkey: Uint256, index: Uint256) {.event.}
# proc registerBatch(pubkeys: seq[Uint256]) # external payable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but right now, there is no actual use case for them.

@@ -790,7 +790,6 @@ procSuite "WakuNode":
await node1.publish(rlnRelayPubSubTopic, message)
await sleepAsync(2000.millis)


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this line is removed unintentionally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous extra line doesn't seem to be consistent with the remaining tests.

Comment on lines +13 to +19
## TODO: properly address these import once zerokit rln is merged
when defined(rln) or (not defined(rln) and not defined(rlnzerokit)):
import ../protocol/waku_rln_relay/waku_rln_relay_types

when defined(rlnzerokit):
import ../protocol/waku_rln_relay/rlnzerokit/waku_rln_relay_types

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
## TODO: properly address these import once zerokit rln is merged
when defined(rln) or (not defined(rln) and not defined(rlnzerokit)):
import ../protocol/waku_rln_relay/waku_rln_relay_types
when defined(rlnzerokit):
import ../protocol/waku_rln_relay/rlnzerokit/waku_rln_relay_types
## TODO: properly address these import once zerokit rln is merged
when defined(rln):
import ../protocol/waku_rln_relay/waku_rln_relay_types
else:
when defined(rlnzerokit):
import ../protocol/waku_rln_relay/rlnzerokit/waku_rln_relay_types

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such separation is not needed when the same file is used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work, since for example if neither RLN=true or RLNZEROKIT=true is set, then https://github.com/status-im/nwaku/blob/311fc253b4074b5c17bc2056b29d6e1d8f32a20e/waku/v2/node/config.nim#L126 would not be defined.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we remove duplicate waku-rln-relay-types.nim files, then there is no need for conditional import.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +13 to +18
## TODO: properly address these import once zerokit rln is merged
when defined(rln) or (not defined(rln) and not defined(rlnzerokit)):
import ../protocol/waku_rln_relay/waku_rln_relay_types

when defined(rlnzerokit):
import ../protocol/waku_rln_relay/rlnzerokit/waku_rln_relay_types
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
## TODO: properly address these import once zerokit rln is merged
when defined(rln) or (not defined(rln) and not defined(rlnzerokit)):
import ../protocol/waku_rln_relay/waku_rln_relay_types
when defined(rlnzerokit):
import ../protocol/waku_rln_relay/rlnzerokit/waku_rln_relay_types
## TODO: properly address these import once zerokit rln is merged
when defined(rln):
import ../protocol/waku_rln_relay/waku_rln_relay_types
else:
when defined(rlnzerokit):
import ../protocol/waku_rln_relay/rlnzerokit/waku_rln_relay_types

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such separation is not needed when the same file is used.

Copy link
Contributor Author

@s1fr0 s1fr0 Aug 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work, since for example if neither RLN=true or RLNZEROKIT=true is set, then https://github.com/status-im/nwaku/blob/311fc253b4074b5c17bc2056b29d6e1d8f32a20e/waku/v2/node/wakunode2_types.nim#L44 would not be defined.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we remove duplicate waku-rln-relay-types.nim files, then there is no need for conditional import.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

waku-rln-relay-types.nim and rlnzerokit/waku-rln-relay-types.nim are not the same file

Comment on lines +17 to +23
## TODO: properly address these import once zerokit rln is merged
when defined(rln) or (not defined(rln) and not defined(rlnzerokit)):
import waku_rln_relay/waku_rln_relay_types

when defined(rlnzerokit):
import waku_rln_relay/rlnzerokit/waku_rln_relay_types

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
## TODO: properly address these import once zerokit rln is merged
when defined(rln) or (not defined(rln) and not defined(rlnzerokit)):
import waku_rln_relay/waku_rln_relay_types
when defined(rlnzerokit):
import waku_rln_relay/rlnzerokit/waku_rln_relay_types
```suggestion
## TODO: properly address these import once zerokit rln is merged
when defined(rln):
import ../protocol/waku_rln_relay/waku_rln_relay_types
else:
when defined(rlnzerokit):
import ../protocol/waku_rln_relay/rlnzerokit/waku_rln_relay_types

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such separation is not needed when the same file is used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work, since for example if neither RLN=true or RLNZEROKIT=true is set, then https://github.com/status-im/nwaku/blob/311fc253b4074b5c17bc2056b29d6e1d8f32a20e/waku/v2/protocol/waku_message.nim#L40 would not be defined.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto, Once we remove duplicate waku-rln-relay-types.nim files, then there is no need for conditional import.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -0,0 +1,107 @@
# src: https://github.com/kilic/rlnapp/blob/master/packages/contracts/contracts/RLN.sol
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have a new file for the contract?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not really necessary, but the reasoning behind is described in #1030

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see #1030 (comment)

@@ -0,0 +1,376 @@
when defined(rlnzerokit):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have a new file for the rln_relay_types?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for utils, there are minor differences such us rlnInstance is a pointer to RLN, here there is rlnIdentifier, etc.

Having a separate/standalone folder, even with mostly duplicated code, will make easy to just move/replace these files with current ones in order to replace kilic's RLN library.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -0,0 +1,841 @@
{.push raises: [Defect].}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need for a duplicate of waku_rln_relay_utils, you can use the existing one

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are minor differences, e.g. not using Bn256, rlnInstance is a pointer to RLN, new_circuit doesn't use parameters.key, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RLN construct definition should be moved to the rln.nim module and then all the imports of rln.nim module should be managed under a compiler flag.

Copy link
Contributor

@staheri14 staheri14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned before, there are unnecessary duplicates of the original waku_rln_relay module which should be removed. This can be addressed by moving the RLN construct definition to the rln.nim module and then managing all the imports of rln.nim module under a compiler flag.

@s1fr0
Copy link
Contributor Author

s1fr0 commented Aug 2, 2022

@staheri14 This would not suffice and nonetheless differs from current implementation. Even by moving the new RLN object definition in rln.nim, there are many differences that would require as many when defined here and there to make it work. This translate to lots of effort to remove all such conditions when we finally switch to zerokit. By keeping them in a separate folder, although much code is redundant, it will result in just a move of files out of the rlnzerokit folder and remove of the when defined conditions in header of changed files. Hopefully, both kilic and zerokit module are expected to co-exists in nwaku for a short period only.

@staheri14
Copy link
Contributor

staheri14 commented Aug 2, 2022

@staheri14 This would not suffice and nonetheless differs from current implementation. Even by moving the new RLN object definition in rln.nim, there are many differences that would require as many when defined here and there to make it work. This translate to lots of effort to remove all such conditions when we finally switch to zerokit. By keeping them in a separate folder, although much code is redundant, it will result in just a move of files out of the rlnzerokit folder and remove of the when defined conditions in header of changed files. Hopefully, both kilic and zerokit module are expected to co-exists in nwaku for a short period only.

@s1fr0
This is an anti-pattern that violates the DRY principle (Don't Repeat Yourself).
This change is cloning the full copy of the library for just modifying a few parts. This damages the flexibility, maintainability, and useability of the code over time. This change also compromises the integrity of the code as we need to manually ensure that all future changes are reflected in both the original module as well as its clone. This is exactly why compiler flags are there in the first place to avoid such damage to our codebase. Hence, please stick to the best practice principles to avoid introducing such antipatterns to our code base and persist its flexibility, maintainability, and reusability. Also, nothing guarantees how long this transient period is going to take so it is more important to keep the master branch in compliance with the engineering best practices to avoid any unpredicted damages to the integrity.

@s1fr0
Copy link
Contributor Author

s1fr0 commented Aug 3, 2022

@staheri14 This PR is based on (my understanding of) the discussion we had here and during the offisite on how to address such integration, and on the assumption that it should result as a temporary fix that can later allow easy removal of Kilic's RLN module. Note that the original goal was to make both Kilic and Zerokit RLN modules be tested one after each other by CI, which partially explains why there are duplicate files in separate folders (essentially because if something is defined alternatively with a kilic/zerokit when defined, the two alternative definitions cannot exist simultaneously during tests).

I agree with and I was aware of all the downsides you mentioned. Indeed, I'm the first one to not like this solution: in my opinion, we should either complete the RLN testing phase using Kilic's RLN module and then integrate zerokit RLN (and probably partially repeat some tests), or switch to zerokit RLN (as soon as reaches some agreed level of maturity) and then start with the testnet.

No problem in refactoring this PR so that we don't duplicate files but we have when defined only on parts that differ (I'm not sure it is possible to switch procedure's signatures through compilation flags - a thing that alone would imply a copy/paste of the whole function body and hence mainly duplicate waku_rln_relay_utils.nim). What do you think @oskarth @jm-clius?

@oskarth
Copy link
Contributor

oskarth commented Aug 3, 2022

Based on the conversation, I was under the impression that the PR is still in progress. Doing review now.

From what I can tell the PR has been waiting for reviews since here #1030 (comment)


By keeping them in a separate folder, although much code is redundant, it will result in just a move of files out of the rlnzerokit folder and remove of the when defined conditions in header of changed files. Hopefully, both kilic and zerokit module are expected to co-exists in nwaku for a short period only.

Agree, this was discussed during the offsite and this is a short-lived change. The only reason not to replace it completely in PR was to not delay the testnet with existing kilic implementation as you suggested @staheri14.

This is an anti-pattern that violates the DRY principle (Don't Repeat Yourself).

While this is generally true, there are good exceptions.

  1. If it is temporary it is often cleaner and easier to just copy paste and make existing modifications
  2. If N=2 usually it doesn't make sense to generalize yet, because you don't know the parameter space so you won't know what patterns you are repeating. In this case I hope we aren't going to introduce a third library that implements RLN, but rather reduce to N=1 over next few weeks.

What are the minimal changes, if any, we can make to this PR to meet the following objectives?

  1. Doesn't break existing code and delay testnet
  2. Makes it easy to switch to Zerokit version and matches, roughly, the existing API?

@oskarth
Copy link
Contributor

oskarth commented Aug 3, 2022

We also need an issue toggling on this version of Zerokit by default, and some acceptance criteria such as

  • no regressions (API works, no insane performance diff)
  • dynamic testnet live (? - needs milestone as well)

The sooner we make the shift the better

@staheri14
Copy link
Contributor

staheri14 commented Aug 4, 2022

@oskarth

  • In addition to my previous concerns, by introducing a duplicate module, we will lose the entire commit history of the original waku_rln_relay module and its submodules.
  • I again insist that any bug/issues that will be found during dogfooding need to be fixed in two files, in the original and the copy, and it doubles the work and increases the chances of error.
  • I still think that it is a straightforward change, by 1) finding the difference between the APIs of rlnzerokit/rln.nim and the original rln.nim (Which seem to be only new_circuit/new_circuit_from_params and verify) and 2) finding any caller procs that use those different APIs (which seem to be only createRLNInstance and proofVerify) 3) make those caller procs conditional to the compiler flags. The RLN type definition also should be replaced by the original one i.e. type RLN*[E] = pointer so that the two APIs have identical proc signatures.

Having all this said, if this is the group's decision, although I disagree, I'll commit to it.

@staheri14 staheri14 dismissed their stale review August 4, 2022 01:36

dismissing

@oskarth
Copy link
Contributor

oskarth commented Aug 4, 2022

In addition to my previous concerns, by introducing a duplicate module, we will lose the entire commit history of the original waku_rln_relay module and its submodules.

The commits are still there. It is true it makes it slightly more annoying if you are trying to do git bisect or something like that, but this is true whenever code is re-structured. I really don't see how this is a major problem.

I again insist that any bug/issues that will be found during dogfooding need to be fixed in two files, in the original and the copy, and it doubles the work and increases the chances of error.

That's fair, but I'd expect:

  1. Most major bugs have been dealt with before testnet starts, and that we are fairly confident in that it'll work (albeit maybe be slow, etc) - please correct me if I'm wrong

  2. This not being a big problem, it is just copy pasting small tweaks in two places. If we really think this is a big problem, I think it's OK to fix bugs only in the "live" version for dogfooding critical bugs. Then in the PR where we enable Zerokit by default, we just make sure those bug fixes are incorporated. Porting bug fixes is usually very quick.

I still think that it is a straightforward change, by 1) finding the difference between the APIs of rlnzerokit/rln.nim and the original rln.nim (Which seem to be only new_circuit/new_circuit_from_params and verify) and 2) finding any caller procs that use those different APIs (which seem to be only createRLNInstance and proofVerify) 3) make those caller procs conditional to the compiler flags. The RLN type definition also should be replaced by the original one i.e. type RLN*[E] = pointer so that the two APIs have identical proc signatures.

That's a fair point. I don't have any super strong opinions here and think it is largely an implementation detail, and there are pros and cons for both. When we discussed this during the offsite, I believe the consensus was that a "flatter" compiler flag in one place was less error-prone. This especially true for temporary changes, compared to if we expected the two implementations to run concurrently for an extended period of time, in which case I'd agree with you that the "diff" should be kept to a minimum.

Additionally, I believe @jm-clius was also in favor of this flatter approach in terms of integration risk, and as the main code owner of nwaku and I believe this should have extra weight in terms of reducing integration risk (see two questions points) which is important in software engineering and code change management.


All that said, if after understanding the above reasoning, you @staheri14 and @s1fr0 have a strong preference for the more granular compiler flag then go for it. I'd rather we work as a team towards the same place here rather than dismissing review. The main thing is that @jm-clius or @LNSD (since Hanno is out) is OK with this approach.

@s1fr0
Copy link
Contributor Author

s1fr0 commented Aug 4, 2022

you @staheri14 and @s1fr0 have a strong preference for the more granular compiler flag

Honestly, I never had any strong preference for either methods. I implemented this PR since it appeared to me to be the method on which we found some kind of traversal consensus.

  • In addition to my previous concerns, by introducing a duplicate module, we will lose the entire commit history of the original waku_rln_relay module and its submodules.

This is not necessary and definitely something I would care about: I would copy/paste the zerokit files over the current files in master and resolve conflicts so that changes result as incremental and the commits history remains intact.

  • I again insist that any bug/issues that will be found during dogfooding need to be fixed in two files, in the original and the copy, and it doubles the work and increases the chances of error.

This could happen as well with inline compilation flags. However, as long as we don't reach consensus on this and more commits are merged to master, I need to keep updated this and others PRs (and their double kilic/rln versions): this is not always straightforward since with indents/flags/multiple files is not easy to see the diffs quickly.

  • I still think that it is a straightforward change, by 1) finding the difference between the APIs of rlnzerokit/rln.nim and the original rln.nim (Which seem to be only new_circuit/new_circuit_from_params and verify) and 2) finding any caller procs that use those different APIs (which seem to be only createRLNInstance and proofVerify) 3) make those caller procs conditional to the compiler flags. The RLN type definition also should be replaced by the original one i.e. type RLN*[E] = pointer so that the two APIs have identical proc signatures.

Definitely not straightforward. As I said multiple times, the differences are not only in few proc definitions, but are many. Also, requiring to align completely the zerokit APIs to Kilic's RLN module, would invalidate some design choices we discussed in zerokit (e.g., delegate everything by using pointers and not touch at all the RLN object in nim/C, hardcode for now the Elliptic curve). However I'd like a bit more pragmatic on this: i pushed the PR #1060 (in alternative to this) implementing the inline compilation flags approach.

Having all this said, if this is the group's decision, although I disagree, I'll commit to it.

Now that #1060 is ready, we can decide better which approach is more appropriate for the integration of zerokit RLN. As I said above, to avoid maintaining the PRs' branches for long with the changes in master, let's just try to agree as soon as we can on one between #1030 and #1060.

@LNSD
Copy link
Contributor

LNSD commented Aug 4, 2022

Now that #1060 is ready, we can decide better which approach is more appropriate for the integration of zerokit RLN. As I said above, to avoid maintaining the PRs' branches for long with the changes in master, let's just try to agree as soon as we can on one between #1030 and #1060.

I am ok with both approaches. My concern is that we need to merge these changes as soon as possible to avoid more conflict headaches for @s1fr0.

As I commented in @kgrgpg PR, after this month's release, I would like to allocate some time to review the current RLN code. The current nim file division (utils, types, and main module) is causing some problems (e.g. merge/rebase conflicts, cycling imports, etc.) that could be solved with a better code "layout" 😁

Copy link
Contributor

@staheri14 staheri14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oskarth and @s1fr0 Thanks for your comments and clarification, with your suggested solutions regarding the debugging and commit history, I am convinced that we can merge this PR.

@s1fr0 Thanks for opening the other PR, this is exactly what I meant (almost), merge whichever that you think is easier for you.

@oskarth oskarth mentioned this pull request Aug 5, 2022
4 tasks
@s1fr0
Copy link
Contributor Author

s1fr0 commented Aug 5, 2022

At the end, we decided to merge the alternative implementation with inline compilation flags from #1060. I then close this PR.

@s1fr0 s1fr0 closed this Aug 5, 2022
@s1fr0 s1fr0 deleted the zerokit-rln-integration branch October 17, 2022 10:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
track:rln RLN Track (Secure Messaging/Applied ZK), e.g. relay and applications
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants