Skip to content

Commit

Permalink
Fix next_retry busy waiting on first retry (#4192)
Browse files Browse the repository at this point in the history
The `next_retry_time` gets populated when a request receives an error
timeout or any other error, after thatn next_retry would check all
requests in the queue returns the smallest one, which then gets used to
move the main loop by creating a Delay
```
futures_timer::Delay::new(instant.saturating_duration_since(Instant::now())).await,
```

However when we retry a task for the first time we still keep it in the
queue an mark it as in flight so its next_retry_time would be the oldest
and it would be small than `now`, so the Delay will always triggers, so
that would make the main loop essentially busy wait untill we received a
response for the retry request.

Fix this by excluding the tasks that are already in-flight.

---------

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
  • Loading branch information
alexggh and sandreim authored Apr 18, 2024
1 parent 9f12d21 commit 0e55289
Show file tree
Hide file tree
Showing 2 changed files with 26 additions and 2 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -288,7 +288,7 @@ impl RequestManager {
/// Returns an instant at which the next request to be retried will be ready.
pub fn next_retry_time(&mut self) -> Option<Instant> {
let mut next = None;
for (_id, request) in &self.requests {
for (_id, request) in self.requests.iter().filter(|(_id, request)| !request.in_flight) {
if let Some(next_retry_time) = request.next_retry_time {
if next.map_or(true, |next| next_retry_time < next) {
next = Some(next_retry_time);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2606,7 +2606,31 @@ fn should_delay_before_retrying_dropped_requests() {
// Sleep for the given amount of time. This should reset the delay for the first candidate.
futures_timer::Delay::new(REQUEST_RETRY_DELAY).await;

// We re-try the first request.
// We re-try the first request the second time drop it again.
assert_matches!(
overseer.recv().await,
AllMessages::NetworkBridgeTx(NetworkBridgeTxMessage::SendRequests(mut requests, IfDisconnected::ImmediateError)) => {
assert_eq!(requests.len(), 1);
assert_matches!(
requests.pop().unwrap(),
Requests::AttestedCandidateV2(outgoing) => {
assert_eq!(outgoing.peer, Recipient::Peer(peer_c));
assert_eq!(outgoing.payload.candidate_hash, candidate_hash_1);
assert_eq!(outgoing.payload.mask, mask);
}
);
}
);

assert_matches!(
overseer_recv_with_timeout(&mut overseer, Duration::from_millis(100)).await,
None
);

// Sleep for the given amount of time. This should reset the delay for the first candidate.
futures_timer::Delay::new(REQUEST_RETRY_DELAY).await;

// We re-try the first request, for the third time, so let's answer to it.
{
let statements = vec![
state
Expand Down

0 comments on commit 0e55289

Please sign in to comment.