Relayer mvp #631
Conversation
I'm thinking we shouldn't check in all of this generated code, this is something that could be handled via a build.rs or other macro imo
Do you mean the bindings json?
Yeah, it seems like this is all generated code from the json: test.rs. Although it looks like there is some actual code near the bottom, making this pretty difficult to review.
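For example, the ethers `abigen!` macro (or an equivalent build.rs step) could expand the bindings at compile time from just the checked-in ABI json. A minimal sketch, assuming only the json stays in the repo; the contract name and path here are made up:

```rust
// Hedged sketch: generate the bindings at compile time instead of committing
// the generated Rust. `FuelContract` and the abi path are placeholders.
use ethers::prelude::abigen;

abigen!(FuelContract, "abi/fuel_contract.json");
```

That way only the ABI json would need to be reviewed, not the generated Rust.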
The change looks very solid=) I definitely like that we work only with messages right now. It looks much cleaner without useless code around=)
Small comments/questions.
And another question: do we also want to clean up the code associated with those tables? Or do we plan to use them later, in which case it is better to keep them? @Voxelot
fuel-relayer/src/relayer/get_logs.rs
Outdated
S: futures::Stream<Item = Result<Vec<Log>, ProviderError>>,
{
    tokio::pin!(logs);
    while let Some(events) = logs.try_next().await? {
If we receive the error on the last page, it will force us to sync all pages again=)
Maybe we need to handle it in the future to avoid double work if we have errors here often.
Also, maybe we want to add a unit test for the case when some page fails (maybe in this PR, maybe in a follow-up).
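A rough sketch of such a test, with a stand-in `consume` loop mirroring the hunk above (the helper and the mocked pages are assumptions, not the actual get_logs code):

```rust
use ethers::providers::ProviderError;
use ethers::types::Log;
use futures::TryStreamExt;

// Stand-in for the paginated download loop shown in the hunk above.
async fn consume<S>(logs: S) -> Result<usize, ProviderError>
where
    S: futures::Stream<Item = Result<Vec<Log>, ProviderError>>,
{
    tokio::pin!(logs);
    let mut pages = 0;
    while let Some(_events) = logs.try_next().await? {
        pages += 1;
    }
    Ok(pages)
}

#[tokio::test]
async fn a_failing_page_aborts_the_sync() {
    // The second page fails; the error should bubble up and stop the stream.
    let pages: Vec<Result<Vec<Log>, ProviderError>> = vec![
        Ok(vec![]),
        Err(ProviderError::CustomError("page 2 failed".into())),
        Ok(vec![]),
    ];
    assert!(consume(futures::stream::iter(pages)).await.is_err());
}
```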
This is a good catch.
While it's not a big deal to fail here and try again when we are mostly caught up with the relayer, this would be a fairly expensive operation to retry during a fresh sync from scratch with ethereum, which takes a long time (e.g. 5+ mins worth of blocks) to paginate through. It would also cause a lot of duplicate work (I/O) inserting messages we've already seen before, since the finalized da height isn't updated.
A simple fix to help mitigate this problem of throwing away 5+mins worth of API requests and synced data would be to update the finalized_da_height of fuel after each page. That way we only risk redoing at most one page worth of API requests and data.
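Roughly something like this; a sketch only, where the `Db` trait and its method names are placeholders rather than the actual storage API:

```rust
use ethers::providers::ProviderError;
use ethers::types::Log;
use futures::TryStreamExt;

/// Placeholder storage interface for the sketch (not the real API).
trait Db {
    fn insert_messages(&mut self, logs: &[Log]) -> anyhow::Result<()>;
    fn set_finalized_da_height(&mut self, height: u64) -> anyhow::Result<()>;
}

async fn write_logs<S, D>(db: &mut D, logs: S) -> anyhow::Result<()>
where
    S: futures::Stream<Item = Result<Vec<Log>, ProviderError>>,
    D: Db,
{
    tokio::pin!(logs);
    while let Some(events) = logs.try_next().await? {
        // Highest da block height seen on this page, if any log carries one.
        let page_height = events.iter().filter_map(|l| l.block_number).max();
        db.insert_messages(&events)?;
        // Checkpoint after every page so a later failure retries at most one page.
        if let Some(height) = page_height {
            db.set_finalized_da_height(height.as_u64())?;
        }
    }
    Ok(())
}
```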
--
Another separate concern is that the way we are inserting messages into our DB could overwrite the spent status if the message was already downloaded on a previous attempt, allowing for a double spend. For example, if the log write fails after partially saving a page and then retries sometime later, a user could have spent one of the messages that made it into the DB (changing its state). When the relayer retries and overwrites the message, the spent status field would be reset, making the message eligible for spending again.
We should check if the message ID exists before saving it to avoid issues with this method not being idempotent. While the previous fix reduces the window of time that this could happen, it's still a risk even if we only have a lag-1 page retry.
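For the existence check, something like this; again only a sketch, where `MessageId`, `Message`, and the store trait are placeholder names:

```rust
/// Placeholder types/trait for illustration; not the real fuel-core storage API.
type MessageId = [u8; 32];

struct Message {
    // The field a retried overwrite would dangerously reset.
    spent: bool,
    // amount, recipient, data, ...
}

trait MessageStore {
    fn get_message(&self, id: &MessageId) -> anyhow::Result<Option<Message>>;
    fn insert_message(&mut self, id: MessageId, msg: Message) -> anyhow::Result<()>;
}

fn insert_if_new<D: MessageStore>(db: &mut D, id: MessageId, msg: Message) -> anyhow::Result<()> {
    // Only write messages we haven't seen before, so a retried page can never
    // overwrite (and thereby un-spend) a message a user has already consumed.
    if db.get_message(&id)?.is_none() {
        db.insert_message(id, msg)?;
    }
    Ok(())
}
```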
In the future, it might be worth tracking the spent status of messages in a separate column to be able to safely prune the data of spent messages from state without risking double-spends in case the relayer da_height is reset for some reason.
Yeah, I actually did this intentionally because I wasn't sure about adding the complexity of tracking pagination. Maybe this would suit a follow-up PR, as it would be nice to get it reviewed properly. I did assume that downloading messages was idempotent, but it sounds like that's not the case.
I'm fine with deferring this, feel free to open new issue(s) for these changes as a resolution to this thread if you'd prefer to address this later.
TODO: Make issue for the second concern.
I've addressed the first concern.
    eth_remote_current: 300,
    eth_remote_finalization_period: 100,
    eth_local_finalized: Some(201),
} => None; "ahead so doesn't need to sync"
Do we want to allow this kind of case? If we increase the finalization_period in the future, then it's better to allow it, but if not, maybe it's better to handle it somehow.
It's sort of a weird case because it means that we connected to an eth node A that was ahead of eth node B and then connected to B. There's not much we can really do at that point other than wait, and we can't prevent it.
    out
}

async fn apply_validator_diffs(
Replying to @xgreenx about old data cleanup:
Yes, we should hunt down the remaining places where this unused data is being managed and clean it up.
TODO: Make issue for this
Ok, I have addressed all PR feedback and now just need to make two follow-on issues.
P: Middleware<Error = ProviderError> + 'static,
{
    match eth_node.syncing().await {
        Ok(s) => Ok(s),
[nit] this match could be replaced by using anyhow context instead: https://docs.rs/anyhow/latest/anyhow/trait.Context.html
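e.g. a sketch, assuming the surrounding code can return anyhow::Result (the function name and error message are made up):

```rust
use anyhow::Context;
use ethers::providers::{Middleware, ProviderError};
use ethers::types::SyncingStatus;

// Same check as the `match` above, but `Context` attaches a message and
// converts the error in one go.
async fn syncing_status<P>(eth_node: &P) -> anyhow::Result<SyncingStatus>
where
    P: Middleware<Error = ProviderError>,
{
    eth_node
        .syncing()
        .await
        .context("failed to fetch the eth node syncing status")
}
```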
Approved - please open an issue to fix the message insertion to check existence first though.
This is a rewrite of the relayer node to simplify it and only download logs. This implements #398.