feat(transport): retry layer #849

yash-atreya · 2024-06-07T14:14:36Z

Motivation

Port Foundry's retry layer into the transport crate.
We've had quite a few requests from users for a retry layer, and this would also remove a bunch of code from Foundry.

Closes #717
Closes #140

Solution

Port from Foundry, still WIP.

Closes #288

PR Checklist

Added Tests
Added Documentation
Breaking changes

prestwich · 2024-06-07T14:35:31Z

crates/transport/src/layers/retry.rs

+                    return should_retry_json_rpc_error(&resp);
+                }
+
+                // some providers send invalid JSON RPC in the error case (no `id:u64`), but the


how very kind of them

this would be true ONLY of http, right? as in WS we would REQUIRE the ID in order to determine which response got rate limited?

does that imply that we should resolve this in the HTTP transports by writing the ID into the resp if it's missing?

actually i like this idea more now. basically try this as a fallback deser in the http transports, and insert the ID if it works
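A minimal sketch of the fallback idea above, assuming a single (non-batch) HTTP request so the response can only belong to one request. The types and the `normalize_http_response` function are hypothetical stand-ins for illustration, not alloy's actual types, and the JSON deserialization step is left out.

```rust
/// Hypothetical stand-in for an error response as received from a provider.
/// `id` is `None` when the provider omitted it from the error body.
#[derive(Debug, Clone, PartialEq)]
pub struct RawErrorResponse {
    pub id: Option<u64>,
    pub code: i64,
    pub message: String,
}

/// Hypothetical well-formed response: the ID is always present.
#[derive(Debug, Clone, PartialEq)]
pub struct ErrorResponse {
    pub id: u64,
    pub code: i64,
    pub message: String,
}

/// Over HTTP, a single request has exactly one response, so the ID of the
/// request can be written into a response that lacks one. An ID that the
/// provider did send is kept unchanged.
pub fn normalize_http_response(raw: RawErrorResponse, request_id: u64) -> ErrorResponse {
    ErrorResponse {
        id: raw.id.unwrap_or(request_id),
        code: raw.code,
        message: raw.message,
    }
}
```

With this in the HTTP transport, later layers would only ever see responses that carry an ID. The same trick would not work over WS, where the ID is the only way to match a response to its request.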

mattsse

this is what we've been using in foundry

we could rethink some of the decisions here, but overall this has been working well. for example, there's an argument for removing the cups (compute units per second) budgeting entirely and relying solely on rate-limit error responses

there are a few native retry types in tower like https://tower-rs.github.io/tower/tower/retry/budget/index.html#reexport.TpsBudget

but they don't really fit our use case

prestwich · 2024-06-07T14:48:49Z

crates/transport/src/layers/retry.rs

+fn should_retry_json_rpc_error(error: &ErrorPayload) -> bool {
+    let ErrorPayload { code, message, .. } = error;
+    // alchemy throws it this way
+    if *code == 429 {


excuse me this is insane
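For context, a rough sketch of what such a classifier has to deal with. `SimpleErrorPayload` is a hypothetical, simplified type (not alloy's `ErrorPayload`), and the message substrings are assumptions chosen for illustration; only the `429` check comes from the diff above.

```rust
/// Hypothetical, simplified JSON-RPC error payload.
pub struct SimpleErrorPayload {
    pub code: i64,
    pub message: String,
}

/// Returns true if the error looks like a rate limit and is worth retrying.
pub fn should_retry(error: &SimpleErrorPayload) -> bool {
    // Some providers reuse the HTTP status 429 as the JSON-RPC error code.
    if error.code == 429 {
        return true;
    }
    // Assumed fallback: match on common rate-limit wording in the message,
    // since providers do not agree on a single error code.
    let msg = error.message.to_lowercase();
    msg.contains("rate limit") || msg.contains("too many requests")
}
```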

prestwich · 2024-06-07T14:50:27Z

this is a good inclusion. the amount of stupid wrangling makes me sad

prestwich · 2024-06-10T13:54:53Z

crates/transport/src/layers/retry.rs

+                    return should_retry_json_rpc_error(&resp);
+                }
+
+                // some providers send invalid JSON RPC in the error case (no `id:u64`), but the


this would be true ONLY of http, right? as in WS we would REQUIRE the ID in order to determine which response got rate limited?

does that imply that we should resolve this in the HTTP transports by writing the ID into the resp if it's missing?

mattsse

the batch retry logic is very hard to follow, not immediately obvious what this does.

I suggest we don't do any of this at first, just to make some progress on the pr and retry the entire call
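A minimal sketch of "retry the entire call", assuming a synchronous closure in place of the real async transport. `retry_whole_call` and its parameters are hypothetical names, and the linear backoff is only one possible policy.

```rust
use std::time::Duration;

/// Retries `call` as a whole until it succeeds, the error is not retryable,
/// or `max_retries` retries have been used. No per-request bookkeeping.
pub fn retry_whole_call<T, E>(
    max_retries: u32,
    initial_backoff: Duration,
    mut call: impl FnMut() -> Result<T, E>,
    is_retryable: impl Fn(&E) -> bool,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match call() {
            Ok(value) => return Ok(value),
            Err(err) if attempt < max_retries && is_retryable(&err) => {
                attempt += 1;
                // Linear backoff; an async layer would await a timer instead
                // of blocking the thread.
                std::thread::sleep(initial_backoff * attempt);
            }
            // Either not retryable or out of retries: surface the error.
            Err(err) => return Err(err),
        }
    }
}
```

For a batch, this means the whole batch is resent if it fails, which is wasteful but easy to reason about.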

mattsse · 2024-06-21T10:43:11Z

crates/transport/src/error.rs

@@ -110,3 +111,65 @@ impl HttpError {
        false
    }
 }
+
+/// Extension trait to implement methods for [`RpcError<TransportErrorKind, E>`].
+pub trait RpcErrorExt {


why do we need this?

I created this to implement the is_retryable_err and backoff_hint methods specifically on RpcError<TransportErrorKind>, not on the generic RpcError<E, ErrResp> type.

but this is only implemented for RpcError?

could we at least make this pub(crate)?
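A small sketch of the pattern under discussion, assuming the generic error type is defined in a different crate, in which case inherent methods cannot be added to it and an extension trait is the usual workaround. The enums below are simplified, hypothetical stand-ins, and the choice of retryable statuses is an assumption.

```rust
/// Hypothetical stand-in for a generic error type defined in another crate.
pub enum RpcError<E> {
    Transport(E),
    ErrorResp { code: i64 },
}

/// Hypothetical stand-in for the transport-specific error kind.
pub enum TransportErrorKind {
    HttpStatus(u16),
    Custom(String),
}

/// Crate-private extension trait: it is implemented for one concrete
/// instantiation only and is not part of the public API.
pub(crate) trait RpcErrorExt {
    fn is_retryable_err(&self) -> bool;
}

impl RpcErrorExt for RpcError<TransportErrorKind> {
    fn is_retryable_err(&self) -> bool {
        match self {
            // Assumption: only HTTP 429 and 503 are treated as retryable.
            RpcError::Transport(TransportErrorKind::HttpStatus(status)) => {
                *status == 429 || *status == 503
            }
            RpcError::Transport(TransportErrorKind::Custom(_)) => false,
            RpcError::ErrorResp { code } => *code == 429,
        }
    }
}
```

Keeping the trait `pub(crate)` limits it to internal use, which matches the request above.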

mattsse · 2024-06-21T10:46:56Z

crates/transport/src/layers/retry.rs

+                        for r in batch_res {
+                            if r.is_error() {
+                                batch_errs.push(r);
+                            } else {
+                                batch_success.push(r.clone());
+                                // Remove corresponding request from the batch
+                                req.remove_by_id(&r.id);
+
+                                if req.is_empty() {
+                                    let response_packet =
+                                        ResponsePacket::from(batch_success.clone());
+                                    batch_success.clear();
+                                    batch_errs.clear();
+
+                                    return Ok(response_packet);
+                                }
+                            }
+                        }
+
+                        let e = batch_errs.first().unwrap().payload.as_error().unwrap().clone();


what does this do?

batch handling looks very complex.

this also only uses the first batch_err, but there could be many

mattsse · 2024-06-21T10:53:27Z

crates/transport/src/layers/retry.rs

+                        if !batch_success.is_empty() {
+                            // Join batch_success and batch_errs
+                            let mut batch_calls = batch_success;
+                            batch_calls.append(&mut batch_errs);
+                            return Ok(ResponsePacket::from(batch_calls));


doesn't this mess with the request ordering?

yash-atreya · 2024-06-27T08:10:22Z

@mattsse ptal.

I've made some changes to maintain the ordering of the responses corresponding to the incoming requests.

A few highlights:

- We store the initial incoming requests in requets_order. This is used to reference the original order and arrange the responses accordingly.
- We collect all responses (whether success or error) into batch_responses: HashMap<Id, Response>, keyed by the request Id.
- As requests are retried, the new response replaces the old one in the batch_responses HashMap.
- Requests are removed from the next retry batch using req.remove_by_id(&id) when a success response is returned or a non-retryable error is encountered.
- batch_responses is returned as a ResponsePacket when:
  - No more requests are left to be retried, i.e. req.is_empty().
  - The maximum number of retries has been exhausted.
  - A non-retryable error is encountered.
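The bookkeeping described in this comment can be sketched roughly as follows. This is a simplified, synchronous illustration and not the code from the PR: `Outcome`, `retry_batch`, and the `send` closure are hypothetical, IDs are plain `u64`, and a non-retryable error simply drops the request from the pending set rather than ending the loop early.

```rust
use std::collections::HashMap;

/// Hypothetical per-request result.
#[derive(Debug, Clone, PartialEq)]
pub enum Outcome {
    Success(String),
    RetryableError(String),
    FatalError(String),
}

/// `send` receives the IDs still pending and returns one outcome per ID,
/// in any order. The result follows the order of `request_order`.
pub fn retry_batch(
    request_order: &[u64],
    max_retries: u32,
    mut send: impl FnMut(&[u64]) -> Vec<(u64, Outcome)>,
) -> Vec<(u64, Outcome)> {
    let mut pending: Vec<u64> = request_order.to_vec();
    let mut responses: HashMap<u64, Outcome> = HashMap::new();
    let mut attempt = 0;

    while !pending.is_empty() {
        for (id, outcome) in send(pending.as_slice()) {
            // Successes and non-retryable errors leave the next retry batch.
            if !matches!(outcome, Outcome::RetryableError(_)) {
                pending.retain(|p| *p != id);
            }
            // A newer response replaces the older one for the same ID.
            responses.insert(id, outcome);
        }
        if attempt == max_retries {
            break; // retries exhausted: return whatever we have
        }
        attempt += 1;
    }

    // Rebuild the original request order from the map.
    request_order
        .iter()
        .filter_map(|id| responses.remove(id).map(|outcome| (*id, outcome)))
        .collect()
}
```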

mattsse

the batch retry logic is a bit complex and not easy to follow, imo this feature isn't super important. I'd rather make some progress on this pr and exclude this entirely, because this has been open for a while now.

it turns out batch requests don't necessarily have ordering guarantees:

https://docs.alchemy.com/reference/batch-requests#what-is-a-batch-request

so we can actually drastically simplify this by partitioning the responses into successes and errors and only retrying the errors.

but I'd like to do that separately and get this included without batch retry support
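The partition approach mentioned above could look roughly like this. `BatchItem`, `partition_batch`, and `ids_to_retry` are hypothetical names for illustration; a real implementation would also check whether each error is retryable before resending it.

```rust
/// Hypothetical batch response item.
#[derive(Debug, Clone, PartialEq)]
pub struct BatchItem {
    pub id: u64,
    pub is_error: bool,
}

/// Splits a batch response into (successes, errors). Since batch responses
/// carry no ordering guarantee, no extra order tracking is needed.
pub fn partition_batch(items: Vec<BatchItem>) -> (Vec<BatchItem>, Vec<BatchItem>) {
    items.into_iter().partition(|item| !item.is_error)
}

/// IDs of the requests that would be resent in the next attempt.
pub fn ids_to_retry(errors: &[BatchItem]) -> Vec<u64> {
    errors.iter().map(|item| item.id).collect()
}
```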

mattsse · 2024-07-02T11:34:55Z

crates/transport/src/error.rs

@@ -110,3 +111,65 @@ impl HttpError {
        false
    }
 }
+
+/// Extension trait to implement methods for [`RpcError<TransportErrorKind, E>`].
+pub trait RpcErrorExt {


but this is only implemented for RpcError?

could we at least make this pub(crate)?

mattsse · 2024-07-02T11:40:42Z

crates/transport/src/layers/retry.rs

+                            if r.is_error() {
+                                batch_responses.insert(r.id.clone(), r.clone());
+


this looks redundant?

mattsse

lgtm

feat(transport): port foundry retry layer

5a37e32

prestwich reviewed Jun 7, 2024

mattsse reviewed Jun 7, 2024

prestwich reviewed Jun 7, 2024

yash-atreya added 3 commits June 10, 2024 14:11

feat(transport): HTTPError struct in TransportErrorKind

44c6a78

nit

a9bde9f

nit: use std::thread::sleep

242e4fb

yash-atreya marked this pull request as ready for review June 10, 2024 09:08

yash-atreya requested review from DaniPopes, gakonst, onbjerg and Evalir as code owners June 10, 2024 09:08

onbjerg mentioned this pull request Jun 10, 2024

Document usage of tower middleware like retry #140

Closed

yash-atreya requested review from mattsse and prestwich June 10, 2024 10:01

prestwich reviewed Jun 10, 2024

prestwich requested changes Jun 10, 2024

mattsse linked an issue Jun 10, 2024 that may be closed by this pull request

[Feature] Add standalone HttpError response error type #288

Closed

mattsse requested changes Jun 10, 2024

yash-atreya added 3 commits June 10, 2024 23:37

use tokio::sleep

06d3e49

fix: HttpError and retry err parsing

3eeec66

Merge branch 'main' into yash/retry-layer

46b2b8c

prestwich requested changes Jun 10, 2024

mattsse requested changes Jun 10, 2024

yash-atreya added 4 commits June 11, 2024 12:40

impl RpcErrorExt

9a690f6

resolve conflicts

7a494b8

dedup

153bb35

clippy

26820a1

yash-atreya and others added 2 commits June 11, 2024 07:13

suggested nits

e13d393

Co-authored-by: Matthias Seitz <matthias.seitz@outlook.de>

nit

fa2f1f5

yash-atreya requested a review from mattsse June 11, 2024 11:15

gakonst approved these changes Jun 11, 2024

mattsse requested changes Jun 11, 2024

nits

a1c21c1

prestwich reviewed Jun 12, 2024

nits

3ebccf4

yash-atreya requested review from prestwich and mattsse June 12, 2024 05:30

yash-atreya mentioned this pull request Jun 12, 2024

feat(transport): HttpError #882

Merged

3 tasks

resolve conflicts

2d6f86b

mattsse requested changes Jun 21, 2024

mattsse requested changes Jul 2, 2024

yash-atreya force-pushed the yash/retry-layer branch from 0583277 to 2d6f86b Compare July 2, 2024 13:56

yash-atreya added 2 commits July 2, 2024 19:28

nits

cb2eed9

rm timeout retries

60a7f9f

yash-atreya mentioned this pull request Jul 5, 2024

refactor(common): use alloy retry layer foundry-rs/foundry#8368

Merged

mattsse approved these changes Jul 5, 2024

yash-atreya merged commit 96e3a84 into main Jul 5, 2024
22 checks passed

yash-atreya deleted the yash/retry-layer branch July 5, 2024 11:19

ben186 pushed a commit to ben186/alloy that referenced this pull request Jul 27, 2024

feat(transport): retry layer (alloy-rs#849)

c9dfc80
