chore(rln-relay): add isReady check #1989
Conversation
cc: @alrevuelta I think this PR should be separate from the one that creates an API endpoint, to reduce cognitive load - wdyt?
@rymnc agree, we can have 2 PRs.
Not sure having an isReady variable is a good idea. I would have just a function isReady() that checks the difference between the last processed block in the rln tree and the latest head block in the blockchain (perhaps allowing 1 or 2 blocks of difference).
As it is, you may set g.isReady = true, but at any point in time you can lose sync, and unless I'm missing something you won't set isReady back to false. Imagine for example that the eth node falls behind sync.
tldr: an isReady() function but no stored isReady state; the state is calculated upon request. wdyt?
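A minimal sketch of the stateless check described above, assuming hypothetical uint64 block-number inputs and the 1-2 block tolerance mentioned in the comment; the names are illustrative, not nwaku's actual API:

```nim
const AllowedBlockLag = 2'u64  # tolerance assumed from the comment above

proc isReady(latestProcessedBlock, latestHeadBlock: uint64): bool =
  ## Ready only when the locally processed RLN tree trails the chain
  ## head by at most AllowedBlockLag blocks. Nothing is cached: the
  ## answer is computed from fresh inputs on every call.
  if latestHeadBlock <= latestProcessedBlock:
    return true  # caught up (or locally ahead, e.g. after a reorg)
  return latestHeadBlock - latestProcessedBlock <= AllowedBlockLag

when isMainModule:
  assert isReady(100, 101)      # 1 block behind: still ready
  assert not isReady(100, 110)  # 10 blocks behind: not ready
```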
I think that's overkill - when we are out of sync, we do set isReady to false (check the ethRpc.onDisconnect)
yep, but there are other ways you can get out of sync without ethRpc.onDisconnect. For example, if the eth node gets out of sync, then you will fall behind while staying connected. You assume that once you are in sync for the first time, you stay in sync forever (unless disconnected). What happens also if your node can't process all the membership insertions in real time and falls behind?
hmm, how would it be possible to detect if the eth node itself is out of sync without using multiple providers?
afaik even if the eth node is out of sync, it knows that it's out of sync, meaning that it knows the latest head block known by the blockchain and the latest block known by the node. I think this is the rpc. I guess you will also notice indirectly: if the eth node falls behind sync, it won't give you the latest blocks, so you will fall behind sync, seeing that the latestHead is way beyond your latest localRlnLatestProcessedBlock.
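For reference, eth_syncing is the JSON-RPC method that exposes this self-knowledge. A hedged sketch of calling it raw, using only the Nim standard library; the endpoint URL is a placeholder, and nwaku itself talks to the node through its web3 client rather than raw HTTP:

```nim
import std/[httpclient, json]

proc isEthNodeSyncing(rpcUrl: string): bool =
  ## Calls eth_syncing: the node answers `false` when fully synced, and
  ## an object with currentBlock/highestBlock while still catching up.
  let client = newHttpClient()
  defer: client.close()
  client.headers = newHttpHeaders({"Content-Type": "application/json"})
  let payload = %*{"jsonrpc": "2.0", "method": "eth_syncing",
                   "params": [], "id": 1}
  let resp = parseJson(client.postContent(rpcUrl, $payload))
  # anything other than the boolean `false` means the node is syncing
  result = resp["result"].kind != JBool

when isMainModule:
  echo isEthNodeSyncing("http://localhost:8545")  # placeholder endpoint
```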
if the eth node falls behind sync, would it still emit the newHead events? unsure about this behaviour - I do agree about calculating every time isReady is called though
I have never tried empirically. I would bet that it stops emitting events, and you will fall behind sync without noticing.
yes, so then there are 2 ways we will be out of sync: 1. the eth node itself falls out of sync with the network, and 2. we fall behind processing while still connected - we cannot account for 1 unless we use multiple providers
why? an eth node out of sync will know it's out of sync, so with just one node you should know that it's out of sync, or? afaik we can address both 1 and 2 without multiple providers, unless I'm missing something.
maybe - are you suggesting that we make the rpc call every time isReady is called?
LGTM! Super interesting idea!
In further PRs, I think we will need to add a /healthcheck endpoint as a valid REST path (cc @NagyZoltanPeter)
Yes exactly, perhaps this: https://docs.infura.io/networks/ethereum/json-rpc-methods/eth_syncing#returns. I'm fine with leaving a TODO for this (your point 1), but would like to address 2 here.
tracked here:
I like the idea of having /healthcheck!
LGTM
@alrevuelta I hope cbb266a covers the requirement (1 & 2)
thanks! left a comment about a possible edge case, let me know if I'm missing something.
@@ -73,6 +73,8 @@ type
     # in event of a reorg. we store 5 in the buffer. Maybe need to revisit this,
     # because the average reorg depth is 1 to 2 blocks.
     validRootBuffer*: Deque[MerkleNode]
+    # this variable tracks the last seen head
+    lastSeenBlockHead*: BlockNumber
not sure I understand the need for this variable, and perhaps there is an edge case?
afaiu we sync in batches:
- batch1: fromBlock1..toBlock1
- batch2: fromBlock2..toBlock2
- etc
And lastSeenBlockHead takes the values of toBlock1, toBlock2, etc?
So what happens when we have just processed batch1? At that exact moment latestProcessedBlock == lastSeenBlockHead, so we may think we are in sync. But in reality we have only finished batch1 and batch2 is still left.
So lastSeenBlockHead should be the last head block known by the blockchain.
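To make the edge case concrete, a toy sketch (all names hypothetical, not the actual nwaku sync code):

```nim
var latestProcessedBlock, lastSeenBlockHead: uint64
let chainHead = 1000'u64  # head actually known by the blockchain

for (fromBlock, toBlock) in [(1'u64, 500'u64), (501'u64, 1000'u64)]:
  # ... fetch and insert membership events for [fromBlock, toBlock] ...
  latestProcessedBlock = toBlock
  lastSeenBlockHead = toBlock  # bug: this is the batch bound, not the head
  echo "batch ", fromBlock, "..", toBlock,
       " looks ready: ", latestProcessedBlock >= lastSeenBlockHead

# After batch1 the check already reports ready, although the node is still
# 500 blocks behind chainHead. Taking the head from the chain itself avoids
# the false positive:
lastSeenBlockHead = chainHead
```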
right, I'll just move it to the newHeadCallback, thanks
Addressed in 3c9d4be
So the lastSeenBlockHead should be the last head block known by the blockchain.

I think it's safer to get this from the eth_blockNumber endpoint rather than relying on the newHeadCallback. If the eth node fails to keep you updated with the latest head (but without disconnecting), then your lastSeenBlockHead will be outdated. But if you use eth_blockNumber, since it's req/rep, you ensure that your latest head is correct. If the node fails to reply, then it's not ready.
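A sketch of the req/rep flavor, in the same raw JSON-RPC style as the eth_syncing example above (URL placeholder, helper name illustrative; nwaku would make this call through its web3 client):

```nim
import std/[httpclient, json, strutils]

proc fetchHeadBlock(rpcUrl: string): uint64 =
  ## Asks the node for its current head via eth_blockNumber. If the node
  ## fails to reply, the call raises, which the caller maps to "not ready".
  let client = newHttpClient()
  defer: client.close()
  client.headers = newHttpHeaders({"Content-Type": "application/json"})
  let payload = %*{"jsonrpc": "2.0", "method": "eth_blockNumber",
                   "params": [], "id": 1}
  let resp = parseJson(client.postContent(rpcUrl, $payload))
  # the result is a hex quantity, e.g. "0x10d4f"
  result = fromHex[uint64](resp["result"].getStr())
```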
Addressed in 15bfde4
error "failed to get the syncing status", error = getCurrentExceptionMsg() | ||
return false | ||
|
||
method isReady*(g: OnchainGroupManager): Future[bool] {.async,gcsafe.} = |
perhaps it's cleaner to run the checks for true all together?
if g.ethRpc.isSome() and g.lastSeenBlockHead != 0 and
    g.latestProcessedBlock >= g.lastSeenBlockHead and not (await g.isSyncing()):
  return true
return false
maybe just a personal preference, just a suggestion.
Yeah, I did think of this, imo the way it is right now is slightly more readable
@@ -551,10 +553,12 @@ method isReady*(g: OnchainGroupManager): Future[bool] {.async,gcsafe.} =
   if g.ethRpc.isNone():
     return false

+  if g.lastSeenBlockHead == 0:
+    return false
   let currentBlock = cast[BlockNumber](await g.ethRpc
I guess if this fails, an exception is thrown? Should we catch it and return false if so? I guess if we don't catch it, the node will crash?
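A minimal, self-contained sketch of the suggested guard, using std/asyncdispatch for the example (nwaku uses chronos, where the same pattern applies); fetchCurrentBlock is a stand-in for the real eth_blockNumber call:

```nim
import std/asyncdispatch

proc fetchCurrentBlock(): Future[uint64] {.async.} =
  # stand-in for the real RPC call; simulate a node that does not reply
  raise newException(IOError, "eth node did not reply")

proc isReady(): Future[bool] {.async.} =
  var currentBlock: uint64
  try:
    currentBlock = await fetchCurrentBlock()
  except CatchableError:
    # RPC failure: report "not ready" instead of crashing the node
    return false
  return currentBlock > 0

when isMainModule:
  echo waitFor isReady()  # prints: false
```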
Addressed in b058c97
lgtm!
Description
Adds an isReady proc to waku_node, waku_rln_relay
Changes
Issue
Addresses #1906