Add a download set. #1041
Conversation
Force-pushed from f072afc to 1ee16c0
Rebased onto main to pick up the state fixes and the improved message handling work.
This is a good refactor - it should make the sync process easier to understand and maintain.
I'm looking forward to seeing the test results. Let me know when you're ready for a review or testing.
Force-pushed from 1ee16c0 to 923b31c
Rebased onto main again and redid the cancellation handling in the
Approved once the clippy error is fixed
Could this PR fix the sync hang in #1181?
Force-pushed from 34c16c4 to ef82204
The code looks good, but I haven't really had a chance to run it.
This PR doesn't contain any tests; what's the plan for testing it?
PR #1193 adds a limited sync test for the first large checkpoint, but it won't find later stalls or errors.
Looks good to me, but I'll wait on merging per others' feedback.
Force-pushed from ef82204 to 8276c5f
Force-pushed from 8276c5f to acc6985
This includes the changes from #1089, in an effort to get the simplified sync logic done in one change. The sync algorithm as designed had some invariants that the different parts were supposed to maintain together, and I think we ran into problems changing some of the parts one at a time. So, this PR tries to clean up the sync algorithm in a coherent way.

I tested it on mainnet and testnet. On testnet it makes progress but is very slow, on account of the bad condition of the testnet.

There are some ancillary changes made along the way while testing, mainly to the span and tracing handling. There's a new debug line added before queuing downloads; it reveals that a big contributor to sync latency while checkpointing on mainnet is the state service (checking that new hashes are not already part of the state). I think that check could be safely omitted from the `extend_tips` step, but not from `obtain_tips`. (That was part of the original motivation for having those be two separate steps.)
I'm seeing a lot of errors and state issues after this PR, but the main branch and #1198 work fine.
Oct 23 16:42:13.201 INFO sync:obtain_tips:download_and_verify{hash=block::Hash("00000060a9b03c2f920c3e6e0b92c5b8a253d035222dd7e6a379a126a9682135")}:checkpoint: zebra_consensus::checkpoint: verified checkpoint range block_count=4000 current_range=(Excluded(Height(0)), Included(Height(4000)))
Oct 23 16:42:13.601 WARN sync: zebrad::components::sync: e=Checkpoint(Duplicate { height: Height(215) })
Oct 23 16:42:13.601 INFO sync: zebrad::components::sync: waiting to restart sync timeout=45s
Oct 23 16:42:58.603 INFO sync: zebrad::components::sync: starting sync, obtaining new tips
Oct 23 16:42:58.603 INFO sync:obtain_tips: zebra_state::util: created block locator tip_height=Height(2) min_locator_height=0 locators=[Height(2), Height(1), Height(0)]
Note that 4000 blocks were verified, but the block locator starts at height 2.
This problem is in the state, not in the sync code in this PR.
We handle request cancellation in two places: before we transition into the AwaitingResponse state, and while we are in AwaitingResponse. We need both, because otherwise, if we had already started processing a request, we wouldn't notice the cancellation until the timeout elapsed. The first is a check that the oneshot is not already canceled. For the second, we wait on a cancellation, either from a timeout or from the tx channel closing.
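A minimal sketch of that two-stage check, using plain tokio primitives rather than Zebra's actual connection state machine; `handle_request` and the `String` response type are made up for illustration:

```rust
use std::time::Duration;
use tokio::sync::oneshot;
use tokio::time::timeout;

// Hypothetical handler: `tx` is the per-request oneshot; the caller
// cancels the request by dropping the receiving end.
async fn handle_request(mut tx: oneshot::Sender<String>) {
    // 1. Before transitioning into AwaitingResponse: bail out if the
    //    oneshot is already canceled.
    if tx.is_closed() {
        return;
    }

    // Stand-in for actually exchanging messages with a peer.
    let network_response = async {
        tokio::time::sleep(Duration::from_millis(50)).await;
        "block data".to_string()
    };

    // 2. While in AwaitingResponse: wait for the response, a timeout, or
    //    a cancellation (the tx channel closing), whichever comes first.
    let outcome = tokio::select! {
        _ = tx.closed() => None,
        result = timeout(Duration::from_secs(20), network_response) => result.ok(),
    };

    if let Some(rsp) = outcome {
        let _ = tx.send(rsp);
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = oneshot::channel();
    tokio::spawn(handle_request(tx));
    println!("{:?}", rx.await);
}
```

The first check is cheap because `oneshot::Sender::is_closed` doesn't await; the second relies on `tx.closed()` resolving as soon as the caller drops the receiver.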
These messages might be unsolicited, or they might be a response to a request we already canceled. So don't fail the whole connection, just drop the message and move on.
This makes two changes relative to the existing download code:
1. It uses a oneshot to attempt to cancel the download task after it has started.
2. It encapsulates the download creation and cancellation logic into a Downloads struct.
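Roughly, the shape is something like the following sketch; the `Downloads` name comes from this PR, but the fields and methods here are simplified stand-ins, not the actual implementation:

```rust
use std::collections::HashMap;
use tokio::sync::oneshot;

// Hypothetical stand-in for a block hash.
type BlockHash = [u8; 32];

/// Sketch of a `Downloads` struct that owns the cancel handles for every
/// in-flight download task.
#[derive(Default)]
struct Downloads {
    cancel_handles: HashMap<BlockHash, oneshot::Sender<()>>,
}

impl Downloads {
    /// Spawn a download task that can still be canceled after it starts.
    fn download_and_verify(&mut self, hash: BlockHash) {
        let (cancel_tx, cancel_rx) = oneshot::channel::<()>();

        tokio::spawn(async move {
            tokio::select! {
                // Fires when the handle is signalled or dropped.
                _ = cancel_rx => { /* canceled: abandon the download */ }
                _ = fetch_and_verify(hash) => { /* completed normally */ }
            }
        });

        self.cancel_handles.insert(hash, cancel_tx);
    }

    /// Cancel every in-flight download, e.g. when the sync restarts.
    fn cancel_all(&mut self) {
        // Dropping the senders closes each oneshot, waking the matching
        // task's cancellation branch.
        self.cancel_handles.clear();
    }
}

async fn fetch_and_verify(_hash: BlockHash) {
    // Stand-in for the real download-then-verify future.
    tokio::time::sleep(std::time::Duration::from_millis(10)).await;
}

#[tokio::main]
async fn main() {
    let mut downloads = Downloads::default();
    downloads.download_and_verify([0; 32]);
    downloads.cancel_all();
}
```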
Try to use the better cancellation logic to revert to the previous sync algorithm. As designed, the sync algorithm is supposed to proceed by downloading state prospectively and handle errors by flushing the pipeline and starting over. This hasn't worked well, because we didn't previously cancel tasks properly. Now that we can, try to use something in the spirit of the original sync algorithm.
The hedge middleware implements hedged requests, as described in _The Tail At Scale_. The idea is that we auto-tune our retry logic according to the actual network conditions, pre-emptively retrying requests that exceed some latency percentile. This would hopefully solve the problem that no single fixed timeout is well tuned for both mainnet and testnet.
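As a rough illustration of the underlying idea (not the middleware itself), a hedged request can be hand-rolled like this; `fetch` and the hard-coded percentile stand in for real peer requests and the auto-tuned latency estimate:

```rust
use std::time::Duration;
use tokio::time::sleep;

// Stand-in for a network request with variable latency.
async fn fetch(id: u32) -> String {
    sleep(Duration::from_millis(if id % 2 == 0 { 30 } else { 300 })).await;
    format!("response for {id}")
}

/// Issue the request; if it is still pending once it exceeds the observed
/// p95 latency, issue a second copy and take whichever finishes first.
async fn hedged_fetch(id: u32, p95: Duration) -> String {
    let primary = fetch(id);
    tokio::pin!(primary);

    tokio::select! {
        rsp = &mut primary => rsp,
        _ = sleep(p95) => {
            // The primary request is slower than the 95th percentile:
            // pre-emptively retry and race the two copies.
            tokio::select! {
                rsp = &mut primary => rsp,
                rsp = fetch(id) => rsp,
            }
        }
    }
}

#[tokio::main]
async fn main() {
    // In the real middleware, the percentile is continuously re-estimated
    // from recent request latencies rather than hard-coded.
    let rsp = hedged_fetch(1, Duration::from_millis(100)).await;
    println!("{rsp}");
}
```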
This lets us keep the main loop simple and just write `continue 'sync;` to keep going.
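For reference, the loop shape this enables looks roughly like the following; `obtain_tips`, `extend_tips`, and `has_tips` are illustrative stand-ins rather than the real signatures:

```rust
// Illustrative stand-ins for the real sync steps.
fn obtain_tips() -> Result<(), ()> { Ok(()) }
fn extend_tips() -> Result<(), ()> { Err(()) }
fn has_tips() -> bool { true }

/// Shape of the main sync loop: any error just restarts the whole attempt.
pub fn sync() {
    'sync: loop {
        if obtain_tips().is_err() {
            continue 'sync;
        }
        while has_tips() {
            if extend_tips().is_err() {
                // Flush the pipeline and start over from a fresh set of tips.
                continue 'sync;
            }
        }
    }
}
```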
Using the cancel_handles, we can deduplicate requests. This is important to do, because otherwise when we insert the second cancel handle, we'd drop the first one, cancelling an existing task for no reason.
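In sketch form, the dedup guard is just a lookup against the same hypothetical `cancel_handles` map before inserting:

```rust
use std::collections::HashMap;
use tokio::sync::oneshot;

pub type BlockHash = [u8; 32];

/// Only insert a new cancel handle if the hash isn't already in flight;
/// clobbering an existing entry would drop (and therefore cancel) the
/// handle for a task that's still running.
pub fn queue_download(
    cancel_handles: &mut HashMap<BlockHash, oneshot::Sender<()>>,
    hash: BlockHash,
) -> bool {
    if cancel_handles.contains_key(&hash) {
        // Duplicate request: skip it rather than cancel the existing task.
        return false;
    }
    let (cancel_tx, _cancel_rx) = oneshot::channel();
    cancel_handles.insert(hash, cancel_tx);
    // ...spawn the download task, giving it `_cancel_rx`...
    true
}
```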
The timeout behavior in zebra-network is an implementation detail, not a feature of the public API. So it shouldn't be mentioned in the doc comments -- if we want timeout behavior, we have to layer it ourselves.
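For example, a caller that wants a timeout can layer one on with tower's own middleware; this sketch assumes the `timeout` and `util` features of the tower crate, and the duration is illustrative, not a value taken from zebra-network:

```rust
use std::time::Duration;
use tower::{service_fn, ServiceBuilder, ServiceExt};

#[tokio::main]
async fn main() {
    // An inner "peer" service that just echoes its request after a delay.
    let inner = service_fn(|req: String| async move {
        tokio::time::sleep(Duration::from_millis(10)).await;
        Ok::<_, tower::BoxError>(req)
    });

    // The caller layers on whatever timeout it wants; the inner service
    // makes no timeout promises of its own.
    let svc = ServiceBuilder::new()
        .timeout(Duration::from_secs(20))
        .service(inner);

    let rsp = svc.oneshot("ping".to_string()).await;
    println!("{rsp:?}");
}
```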
The previous debug output printed a message that the chain verifier had received a block. But this provides no additional information compared to printing no message in chain::Verifier and a message in whichever verifier the block was sent to, since the resulting spans indicate where the block was dispatched. This commit also removes the "unexpected high block" detection; this was an artefact of the original sync algorithm failing to handle block advertisements, but we don't have that problem any more, so we can simplify the code by eliminating that logic.
This reveals that there may be contention in access to the state, since this check takes a long time.
There's no reason to return a pre-Buffer'd service (there's no need for internal access to the state service, as there is in zebra-network), and wrapping it internally removes control of the buffer size from the caller.
The original sync algorithm split the sync process into two phases: one that obtained prospective chain tips, and another that attempted to extend those chain tips as far as possible until encountering an error (at which point the prospective state is discarded and the process restarts).

Because a previous implementation of this algorithm didn't properly enforce linkage between segments of the chain while extending tips, sometimes it would get confused and fail to discard responses that did not extend a tip. To mitigate this, a check against the state was added. However, this check can cause stalls while checkpointing, because when a checkpoint is reached we may suddenly need to commit thousands of blocks to the state.

Because the sync algorithm now has a `CheckedTip` structure that ensures that a new segment of hashes actually extends an existing one, we don't need to check against the state while extending a tip, because we no longer get confused while interpreting responses. This change results in significantly smoother progress on mainnet.
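A simplified sketch of that linkage check; the `CheckedTip` name is from this PR, but the fields and the exact rule shown here are illustrative:

```rust
// Hypothetical stand-in for a block hash.
pub type BlockHash = [u8; 32];

/// The idea behind `CheckedTip`: remember both the current tip and the
/// hash we expect the next segment to start from, so linkage can be
/// checked without querying the state.
#[derive(Clone, Copy, Debug)]
pub struct CheckedTip {
    pub tip: BlockHash,
    pub expected_next: BlockHash,
}

/// Accept a new segment of hashes only if it actually extends `tip`;
/// anything else is discarded instead of being misinterpreted.
pub fn extend_tip(tip: CheckedTip, segment: &[BlockHash]) -> Option<CheckedTip> {
    match segment {
        [first, .., second_last, last] if *first == tip.expected_next => Some(CheckedTip {
            tip: *second_last,
            expected_next: *last,
        }),
        _ => None,
    }
}
```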
We already use an actor model for the state service, so we get an ordered sequence of state queries by message-passing. Instead of performing reads in the futures we return, this commit performs them synchronously. This means that all sled access is done from the same task, which (1) might reduce contention and (2) allows us to avoid using sled transactions when writing to the state.
Co-authored-by: Jane Lusby <jane@zfnd.org>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
Co-authored-by: teor <teor@riseup.net>
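A toy version of that pattern, with an in-memory map standing in for sled and hypothetical request types:

```rust
use std::collections::HashMap;
use tokio::sync::{mpsc, oneshot};

// Hypothetical request types standing in for the state API.
enum Request {
    AddBlock { height: u32, block: String },
    GetBlock { height: u32, rsp: oneshot::Sender<Option<String>> },
}

// The actor task: because every read and write happens here, on one task,
// operations are naturally ordered and no transactions are needed.
async fn state_service(mut requests: mpsc::Receiver<Request>) {
    let mut blocks: HashMap<u32, String> = HashMap::new(); // stand-in for sled
    while let Some(request) = requests.recv().await {
        match request {
            Request::AddBlock { height, block } => {
                blocks.insert(height, block);
            }
            Request::GetBlock { height, rsp } => {
                // The read happens synchronously inside the actor, not in a
                // future returned to the caller.
                let _ = rsp.send(blocks.get(&height).cloned());
            }
        }
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel(32);
    tokio::spawn(state_service(rx));

    let _ = tx
        .send(Request::AddBlock { height: 1, block: "block one".into() })
        .await;

    let (rsp_tx, rsp_rx) = oneshot::channel();
    let _ = tx.send(Request::GetBlock { height: 1, rsp: rsp_tx }).await;
    println!("{:?}", rsp_rx.await);
}
```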
We should error if we notice that we're attempting to download the same blocks multiple times, because that indicates that peers reported bad information to us, or we got confused trying to interpret their responses.
Force-pushed from 7b8f8f7 to a689832
Rebased again because Dependabot caused a merge conflict. The clippy errors are unrelated to this PR.
WIP, cannot be tested because state changes broke the sync component. Based on top of #1040.