Improve back-off logic for retrying after load errors #3370

joluet · 2017-10-18T13:53:29Z

The current loader implementation retries after an error with linear backoff:

ExoPlayer/library/core/src/main/java/com/google/android/exoplayer2/upstream/Loader.java

Line 408 in deb9b30

return Math.min((errorCount - 1) * 1000, 5000);

Instead it should use exponential backoff to reduce the potential API load in case of many client errors. That could look like this:

private long getRetryDelayMillis() {
      // Doubles the delay with each error (1000, 2000, 4000 ms), capped at 8000 ms.
      return (long) Math.min(Math.pow(2, errorCount - 1) * 1000, 8000);
}
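
To make the difference concrete, here is a small sketch that compares the delays produced by the current linear rule with a doubling rule. The class and method names are hypothetical, and the 8000 ms cap is just the value from the proposal above, not a tuned recommendation.

```java
/** Hypothetical helper comparing two retry schedules; not part of ExoPlayer. */
final class BackoffSchedules {

  private BackoffSchedules() {}

  /** Linear rule as quoted from Loader.java: 0, 1000, 2000, ... capped at 5000 ms. */
  static long linearDelayMs(int errorCount) {
    return Math.min((errorCount - 1) * 1000L, 5000L);
  }

  /** Doubling rule: 1000, 2000, 4000, ... capped at 8000 ms (assumed cap). */
  static long exponentialDelayMs(int errorCount) {
    // Clamp the shift amount so that large error counts cannot overflow the long.
    int exponent = Math.min(Math.max(errorCount - 1, 0), 30);
    return Math.min(1000L << exponent, 8000L);
  }

  public static void main(String[] args) {
    for (int errorCount = 1; errorCount <= 6; errorCount++) {
      System.out.printf(
          "errorCount=%d linear=%d ms exponential=%d ms%n",
          errorCount, linearDelayMs(errorCount), exponentialDelayMs(errorCount));
    }
  }
}
```

Note that in this sketch the doubling rule waits 1000 ms before the first retry, whereas the linear rule retries immediately after the first error.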

ojw28 · 2017-10-18T13:59:53Z

Without a justification that sounds like quite an arbitrary statement. Please explain why ;)...

joluet · 2017-10-18T14:37:08Z

When a server problem affects all clients at the same time, all client requests suddenly become synchronized, because each client retries at fixed intervals from that point onwards. This can lead to a thundering herd problem (similar to a cache stampede).

Exponential backoff mitigates the problem by cutting back the total request rate and accelerating the desynchronization on subsequent errors. Adding random jitter would spread out the load even further.

See https://www.awsarchitectureblog.com/2015/03/backoff.html for more information on this.
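
One way to combine both ideas is the "full jitter" variant described in the linked article, where each delay is drawn uniformly between zero and the capped exponential value. The following is only a minimal sketch: the class name, the 1000 ms base and the 8000 ms cap are assumptions chosen for illustration.

```java
import java.util.Random;

/** Hypothetical sketch of exponential backoff with full jitter; not ExoPlayer code. */
final class JitteredBackoff {
  private static final long BASE_DELAY_MS = 1000; // assumed base delay
  private static final long MAX_DELAY_MS = 8000; // assumed cap

  private final Random random;

  JitteredBackoff(Random random) {
    this.random = random;
  }

  /** Upper bound of the delay window for the given 1-based error count. */
  static long ceilingMs(int errorCount) {
    // Clamp the shift amount so that large error counts cannot overflow the long.
    int exponent = Math.min(Math.max(errorCount - 1, 0), 30);
    return Math.min(BASE_DELAY_MS << exponent, MAX_DELAY_MS);
  }

  /** Picks a delay uniformly from [0, ceilingMs(errorCount)]. */
  long nextDelayMs(int errorCount) {
    long ceiling = ceilingMs(errorCount);
    // The bound of nextInt is exclusive, so add 1 to include the ceiling itself.
    return random.nextInt((int) ceiling + 1);
  }
}
```

For example, `new JitteredBackoff(new Random()).nextDelayMs(3)` returns some value between 0 and 4000 ms, so clients that failed at the same instant are unlikely to retry at the same instant.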

ojw28 · 2017-10-18T15:00:30Z

I think adding some jitter to ensure de-synchronization is a fairly non-controversial suggestion. We should do that. I'm not sure how obvious it is that exponential back-off is better than what we're doing currently, however. Reducing the total request rate might be better for server-side issues, but might be worse for other types of connectivity problems. For example, if there's a local connectivity issue it's probably better for the client to retry quite aggressively, since this will allow for more rapid recovery and minimize the risk of user-visible re-buffering.

As an aside, we do eventually plan to allow injection of the retry policy, which will allow applications to change the back-off logic if they want non-standard behavior.

joluet · 2017-10-18T15:23:02Z

That sounds fair.

Being able to inject a custom retry policy would be great. 👍 Ideally, it would also be possible to differentiate between server errors and network errors and apply different policies.
This way exponential backoff could be used for server errors and rapid retry for network errors.
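
A rough sketch of what such a split could look like is below. Everything in it is hypothetical: `ServerErrorException` is a made-up stand-in for "the server answered with an error", and treating every other `IOException` as a network error is a simplifying assumption.

```java
import java.io.IOException;

/** Hypothetical sketch: pick a delay based on the kind of load error. Not ExoPlayer API. */
final class ErrorAwareBackoff {

  /** Made-up exception standing in for an HTTP 5xx response. */
  static final class ServerErrorException extends IOException {
    final int statusCode;

    ServerErrorException(int statusCode) {
      super("Server responded with " + statusCode);
      this.statusCode = statusCode;
    }
  }

  private ErrorAwareBackoff() {}

  static long retryDelayMs(IOException error, int errorCount) {
    int retryIndex = Math.max(errorCount - 1, 0);
    if (error instanceof ServerErrorException) {
      // Server-side failure: double the delay each time to shed load (assumed 8000 ms cap).
      return Math.min(1000L << Math.min(retryIndex, 30), 8000L);
    }
    // Anything else is assumed to be a local connectivity problem: keep the
    // existing linear rule, which retries immediately after the first error.
    return Math.min(retryIndex * 1000L, 5000L);
  }
}
```

With these assumptions, a third consecutive server error waits 4000 ms, while a third consecutive network error waits 2000 ms.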

An intermediate solution with random jitter could maybe look similar to this:

private long getRetryDelayMillis() {
      // Add up to one second of random jitter so that clients do not retry in lockstep.
      Random random = new Random();
      int randomJitter = random.nextInt(1000);
      return Math.min((errorCount - 1) * 1000 + randomJitter, 5000);
}

AquilesCanta · 2018-09-17T15:56:42Z

Please try implementing your own LoadErrorHandlingPolicy (while adding your own back-off logic) and let us know if you run into any issues.
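
For readers who want a starting point, the sketch below shows the general shape an injectable back-off policy might take. `RetryDelayPolicy` and the `DO_NOT_RETRY` sentinel are local stand-ins defined only so the example is self-contained. They are not the ExoPlayer interface, whose exact method signatures should be checked against the library's Javadoc.

```java
import java.io.IOException;

/** Hypothetical sketch of policy injection; none of these types are ExoPlayer classes. */
final class RetryPolicySketch {

  /** Stand-in for an injectable policy; the real interface's signatures may differ. */
  interface RetryDelayPolicy {
    /** Made-up sentinel meaning "stop retrying". */
    long DO_NOT_RETRY = -1;

    long retryDelayMs(IOException error, int errorCount);
  }

  /** Doubling back-off that gives up once errorCount exceeds maxErrors. */
  static final class ExponentialPolicy implements RetryDelayPolicy {
    private final int maxErrors;

    ExponentialPolicy(int maxErrors) {
      this.maxErrors = maxErrors;
    }

    @Override
    public long retryDelayMs(IOException error, int errorCount) {
      if (errorCount > maxErrors) {
        return DO_NOT_RETRY;
      }
      int exponent = Math.min(Math.max(errorCount - 1, 0), 30);
      return Math.min(1000L << exponent, 8000L); // assumed 8000 ms cap
    }
  }

  private RetryPolicySketch() {}

  /**
   * Simulates a loader whose every attempt fails and sums the delays the policy asks for.
   * Only terminates if the policy eventually returns DO_NOT_RETRY.
   */
  static long totalWaitMs(RetryDelayPolicy policy) {
    long totalMs = 0;
    for (int errorCount = 1; ; errorCount++) {
      long delayMs = policy.retryDelayMs(new IOException("simulated failure"), errorCount);
      if (delayMs == RetryDelayPolicy.DO_NOT_RETRY) {
        return totalMs;
      }
      totalMs += delayMs;
    }
  }
}
```

The point of the design is that the loader only asks the policy "how long should I wait?", so the back-off rule can be swapped without touching the loading code.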

ojw28 changed the title ~~Loader should retry with exponential backoff~~ Improve back-off logic for retrying after load errors Oct 18, 2017

ojw28 added the enhancement label Oct 18, 2017

AquilesCanta self-assigned this Nov 2, 2017

ojw28 pushed a commit that referenced this issue Jun 25, 2018

Allow configuration of the Loader retry delay

a1f89be

Issue:#3370 ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=201996109

ojw28 pushed a commit that referenced this issue Jul 12, 2018

Add LoadErrorHandlingPolicy to customize blacklisting and backoff logic

32a91b5

Issue:#2844 Issue:#3370 Issue:#2981 ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=204149284

ojw28 pushed a commit that referenced this issue Aug 6, 2018

Parameterize load error handling in ExtractorMediaSource

d458b90

Issue:#2844 Issue:#3370 Issue:#2981 ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=206927295

AquilesCanta closed this as completed Sep 17, 2018

google locked and limited conversation to collaborators Jan 31, 2019
