-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve back-off logic for retrying after load errors #3370
Comments
Without a justification that sounds like quite an arbitrary statement. Please explain why ;)... |
When there is a server problem that affects all clients at the same time then all client requests suddenly become synchronized as each client retries at fixed intervals from that point onwards. This can lead to a Thundering Herd Problem (similar: Cache Stampede). Exponential backoff mitigates the problem by cutting back the total request rate and accelerating the desynchronization on subsequent errors. Adding random jitter would spread out the load even further. See https://www.awsarchitectureblog.com/2015/03/backoff.html for more information on this. |
I think adding some jitter to ensure de-synchronization a fairly non-controversial suggestion. We should do that. I'm not sure how obvious it is that exponential back-off is better than what we're doing currently, however. Reducing the total request rate might better for server-side issues, but might be worse for other types of connectivity problems. For example if there's a local connectivity issue it's probably better for the client to retry quite aggressively, since this will allow for more rapid recovery and minimize the risk of user visible re-buffering. As an aside, we do eventually plan to allow injection of the retry policy, which will allow applications to change the back-off logic if they want non-standard behavior. |
That sounds fair. Being able to inject a custom retry policy would be great. 👍 Ideally, it would also be possible to differentiate between server errors and network errors and apply different policies. An intermediate solution with random jitter could maybe look similar to this: private long getRetryDelayMillis() {
Random random = new Random();
int randomJitter = random.nextInt(1000);
return Math.min((errorCount - 1) * 1000 + randomJitter, 5000);
} |
Issue:#3370 ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=201996109
Issue:#2844 Issue:#3370 Issue:#2981 ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=204149284
Issue:#2844 Issue:#3370 Issue:#2981 ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=206927295
Please try implementing your own LoadErrorHandlingPolicy (while adding your own back-off logic) and let us know if you run into any issues. |
The current loader implementation retries after an error with linear backoff:
ExoPlayer/library/core/src/main/java/com/google/android/exoplayer2/upstream/Loader.java
Line 408 in deb9b30
Instead it should use exponential backoff to reduce the potential API load in case of many client errors. That could look like this:
The text was updated successfully, but these errors were encountered: