Commit 776b25f
fix: Decrease RPC failover threshold & cooldown duration
Currently, there must be 28 consecutive failed attempts to hit an RPC
endpoint before it is perceived to be unavailable. At this point, if the
endpoint is configured with a failover, the failover will be activated
and all requests will be automatically diverted to it; otherwise
requests are paused for 30 minutes.
There are two problems we are seeing now:
- If Infura is degraded enough that failing over to QuickNode is
warranted, the failover may be activated too late.
- If the user is attempting to access a custom network and they are
experiencing issues (either due to a local connection issue or an
issue with the endpoint itself), they will be prevented from using
that network for 30 minutes. This is way too long.
To fix these problems, this commit:
- Lowers the "max consecutive failures" (the number of consecutive
failed attempts to reach an endpoint before requests are paused or the
failover is triggered) from 28 to 8.
- Lowers the "circuit break duration" (the period during which requests
to an unavailable endpoint will be paused) from 30 minutes to 30
*seconds*.
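
The two settings above can be pictured with a minimal sketch. This is
not the extension's actual implementation: `Policy`, `OLD_POLICY`,
`NEW_POLICY`, `EndpointCircuit`, `record`, and `route` are hypothetical
names, and only the numbers (28 and 8 failures, 30 minutes and 30
seconds) come from this commit message. The sketch also assumes that
the primary endpoint is tried again once the break duration elapses,
whether or not a failover is configured.

```typescript
// Hypothetical sketch; names are illustrative, values are from the commit message.
type Policy = {
  maxConsecutiveFailures: number;
  circuitBreakDurationMs: number;
};

const OLD_POLICY: Policy = {
  maxConsecutiveFailures: 28,
  circuitBreakDurationMs: 30 * 60 * 1000, // 30 minutes
};

const NEW_POLICY: Policy = {
  maxConsecutiveFailures: 8,
  circuitBreakDurationMs: 30 * 1000, // 30 seconds
};

type Route = 'primary' | 'failover' | 'paused';

class EndpointCircuit {
  private consecutiveFailures = 0;
  private brokenAt: number | null = null;
  private readonly policy: Policy;
  private readonly hasFailover: boolean;
  private readonly now: () => number;

  constructor(
    policy: Policy,
    hasFailover: boolean,
    now: () => number = () => Date.now(), // injectable clock, for testing
  ) {
    this.policy = policy;
    this.hasFailover = hasFailover;
    this.now = now;
  }

  // Record the outcome of one attempt against the primary endpoint.
  record(success: boolean): void {
    if (success) {
      this.consecutiveFailures = 0;
      this.brokenAt = null;
      return;
    }
    this.consecutiveFailures += 1;
    if (
      this.brokenAt === null &&
      this.consecutiveFailures >= this.policy.maxConsecutiveFailures
    ) {
      this.brokenAt = this.now(); // the circuit breaks here
    }
  }

  // Decide where the next request should go.
  route(): Route {
    if (this.brokenAt === null) {
      return 'primary';
    }
    if (this.now() - this.brokenAt >= this.policy.circuitBreakDurationMs) {
      // Assumption: after the cooldown the primary endpoint is tried again.
      this.brokenAt = null;
      this.consecutiveFailures = 0;
      return 'primary';
    }
    return this.hasFailover ? 'failover' : 'paused';
  }
}
```

With `NEW_POLICY` and no failover, the eighth consecutive failure pauses
requests, and they resume 30 seconds later instead of 30 minutes later.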
In summary, if a network starts to become degraded or the user is
experiencing connection issues, the network is more likely to be flagged
as unavailable, but if the situation improves the user may be able to
use it again sooner.
How quickly does the circuit break now? It depends on whether the user
is using Chrome or Firefox and whether the errors encountered are
retriable or non-retriable:
- Retriable errors (e.g. connection errors or 502/503/504 responses) will, as
the name implies, be automatically retried. If these errors are
continually produced, the circuit will break very quickly (if the
extension is restarted, then it will break immediately).
- Non-retriable errors (e.g. 4xx errors) do not get automatically
retried, so it takes longer for the circuit to break (if the extension
is restarted, on average it will take about 1 minute).
- Note that Chrome implements "anti-DDoS throttling logic" which means
that some non-retriable errors will turn into retriable errors. In
this situation the circuit breaks faster than it would on Firefox.

1 parent 9094185 commit 776b25f
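
To illustrate why retriable errors break the circuit sooner than
non-retriable ones, here is a rough sketch. It assumes that every
failed attempt, including each automatic retry, counts toward the
"max consecutive failures" threshold, and that a retriable failure is
retried up to `maxRetries` times; neither detail is stated in this
commit message. `isRetriable`, `requestsUntilBreak`, and the
`maxRetries` value of 3 used in the comments are hypothetical.

```typescript
// Hypothetical sketch; classification follows the examples in the commit message.
type Failure = { kind: 'connection' } | { kind: 'http'; status: number };

// Connection errors and 502/503/504 are treated as retriable; other
// HTTP failures (e.g. 4xx) are not.
function isRetriable(failure: Failure): boolean {
  if (failure.kind === 'connection') {
    return true;
  }
  return (
    failure.status === 502 || failure.status === 503 || failure.status === 504
  );
}

// Number of failing requests needed before the circuit breaks.
// Assumption: a retriable failure produces (1 + maxRetries) failed
// attempts per request, a non-retriable failure produces exactly one.
function requestsUntilBreak(
  maxConsecutiveFailures: number,
  retriable: boolean,
  maxRetries: number,
): number {
  const attemptsPerRequest = retriable ? 1 + maxRetries : 1;
  return Math.ceil(maxConsecutiveFailures / attemptsPerRequest);
}

// With the new threshold of 8 and a hypothetical maxRetries of 3:
const retriableRequests = requestsUntilBreak(8, true, 3); // 2 requests
const nonRetriableRequests = requestsUntilBreak(8, false, 3); // 8 requests
```

Under these assumptions, two failing requests are enough to break the
circuit when the errors are retriable, while eight are needed when they
are not.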
1 file changed, +25 -11 lines changed (original lines 157-189, new lines 157-203)