Add a Jitter strategy to the Retry policy #245

Closed
CESARDELATORRE opened this issue May 4, 2017 · 18 comments
@CESARDELATORRE

I think this is not available in Polly's retry policy.
Basically, a regular Retry policy can hurt your system under high concurrency and high contention.
It would be great to have this in Polly, and it should not be very complicated to add jitter to the retry algorithm/policy.
It'd improve the overall performance of the end-to-end system by adding randomness to the exponential backoff. It'd spread out the spikes when issues arise.

The problem is explained here:
https://brooker.co.za/blog/2015/03/21/backoff.html
https://www.awsarchitectureblog.com/2015/03/backoff.html

If this is already available in Polly, please, tell me how to implement its usage.
Thanks

@reisenberger
Member

reisenberger commented May 4, 2017

hey @CESARDELATORRE . Great idea! (seen this in a Java resilience library)

It could already be achieved with Polly, by using one of the .WaitAndRetry(...) configuration overloads which allow you to specify a Func<..., TimeSpan> for the amount of wait. (Similar overloads exist for async.)

Something like:

Random jitterer = new Random(); 
Policy
  .Handle<HttpResponseException>() // etc
  .WaitAndRetry(5,  
      retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt))  // exponential back-off
                    + TimeSpan.FromMilliseconds(jitterer.Next(0, 100)) // plus some jitter
  );

Does this cover it?

Such a great idea, I'll aim to add a wiki 'how to' page and/or blog it.

EDIT: Thx for the references! Reading the articles in detail, they explore a more sophisticated range of jitter algorithms than the small amount of jitter in my example above. However, the principle of how to do this with Polly is the same: using the Func<..., TimeSpan> overloads, you have complete control to adopt whatever randomness/jitter algorithm you like.
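As an illustration of plugging a different algorithm into those overloads, here is a minimal sketch of the 'Full Jitter' variant described in the referenced articles (sleep = random between 0 and min(cap, base * 2^attempt)). The class name and the baseDelay and cap parameters are hypothetical names chosen for this sketch, not part of Polly.

```csharp
using System;

public static class FullJitterSketch
{
    private static readonly Random Jitterer = new Random();
    private static readonly object Gate = new object();

    // 'Full Jitter': a random delay between 0 and min(cap, baseDelay * 2^retryAttempt).
    // baseDelay and cap are assumed parameter names for this sketch.
    public static TimeSpan FullJitter(int retryAttempt, TimeSpan baseDelay, TimeSpan cap)
    {
        double ceiling = Math.Min(
            cap.TotalMilliseconds,
            baseDelay.TotalMilliseconds * Math.Pow(2, retryAttempt));

        double sample;
        lock (Gate) // System.Random is not thread-safe, so guard the shared instance.
        {
            sample = Jitterer.NextDouble();
        }

        return TimeSpan.FromMilliseconds(sample * ceiling);
    }
}
```

It could then be handed to a policy in the same way as the example above, e.g. .WaitAndRetry(5, retryAttempt => FullJitterSketch.FullJitter(retryAttempt, TimeSpan.FromMilliseconds(100), TimeSpan.FromSeconds(10))).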

@reisenberger
Member

reisenberger commented May 5, 2017

Alternative: I note that the 'Decorrelated Jitter' suggested in this article appears to use an Accumulator approach (next value depends on the preceding one).

To implement this with Polly, another alternative could be a WaitAndRetry() overload taking an IEnumerable<TimeSpan> sleepDurations, using the standard LINQ Aggregate for the accumulator?

CORRECTION: It needs a yield return type approach (see below).

@KennethWKZ

@reisenberger I tried to use the decorrelated jitter code but got an error saying that TimeSpan cannot be converted to IEnumerable.

Is there any way to solve this? I'm not very familiar with this.

@reisenberger
Member

reisenberger commented Aug 4, 2017

@KennethWKZ I have revised the previous sketch to use a yield return type approach, per below. Please let us know if you need anything else on this.

public static IEnumerable<TimeSpan> DecorrelatedJitter(int maxRetries, TimeSpan seedDelay, TimeSpan maxDelay)
{
    Random jitterer = new Random();
    int attempt = 0;

    double seed = seedDelay.TotalMilliseconds;
    double max = maxDelay.TotalMilliseconds;
    double current = seed;

    while (attempt++ <= maxRetries) // EDIT: As pointed out in a later comment, this boundary check allows one more retry than prescribed.
    {
        current = Math.Min(max, Math.Max(seed, current * 3 * jitterer.NextDouble())); // adopting the 'Decorrelated Jitter' formula from https://www.awsarchitectureblog.com/2015/03/backoff.html.  Can be between seed and previous * 3.  Mustn't exceed max.
        yield return TimeSpan.FromMilliseconds(current);
    }
}

Used as (for example) :

    Policy retryWithDecorrelatedJitter = Policy
        .Handle<WhateverException>()
        .WaitAndRetry(DecorrelatedJitter(maxRetries, seedDelay, maxDelay));

@KennethWKZ

@reisenberger Thanks!!! Now I understand better how it works!

@KennethWKZ

@reisenberger

current = Math.Min(max, Math.Max(seed, current * 3 * jitterer.NextDouble())); // adopting the 'Decorrelated Jitter' formula from the quoted article. //Can be between seed and previous * 3. Mustn't exceed max.

May I ask about this part? Should I set the seed lower than max? For example: if max is 1000ms, would seed be 100ms?

@reisenberger
Member

@KennethWKZ Yes. As with a pure exponential-backoff strategy, the idea is that seed is a low-ish, starting value.

  • The algorithm will produce sleep values all of which fall between seed and max: seed should represent the minimum wait-before-retry that you want, max the max.
  • max >= 4 * seed works well (and bigger ratios are fine). If seed and max are any closer, values will tend to bunch at the min and max.

To see the kind of retry delays it generates, you can run up a small console app like this.
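A minimal sketch of such an experiment (not the linked app) might look like the following. It reuses the formula from this thread, written so that exactly maxRetries values are produced, and measures how many delays end up clamped to exactly seed or max for a few seed/max ratios. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class JitterExperiment
{
    // Decorrelated jitter formula from this thread; yields exactly maxRetries delays.
    public static IEnumerable<TimeSpan> DecorrelatedJitter(
        int maxRetries, TimeSpan seedDelay, TimeSpan maxDelay, Random jitterer)
    {
        double seed = seedDelay.TotalMilliseconds;
        double max = maxDelay.TotalMilliseconds;
        double current = seed;

        for (int attempt = 0; attempt < maxRetries; attempt++)
        {
            current = Math.Min(max, Math.Max(seed, current * 3 * jitterer.NextDouble()));
            yield return TimeSpan.FromMilliseconds(current);
        }
    }

    // Fraction of generated delays that were clamped to exactly seed or exactly max.
    // Assumes runs and retriesPerRun are both positive.
    public static double ClampedFraction(
        TimeSpan seedDelay, TimeSpan maxDelay, int runs, int retriesPerRun)
    {
        var jitterer = new Random();
        int clamped = 0, total = 0;

        for (int run = 0; run < runs; run++)
        {
            foreach (TimeSpan delay in DecorrelatedJitter(retriesPerRun, seedDelay, maxDelay, jitterer))
            {
                total++;
                if (delay == seedDelay || delay == maxDelay) clamped++;
            }
        }

        return (double)clamped / total;
    }

    // Prints the clamped fraction for max = 2x, 4x and 10x seed.
    public static void PrintComparison()
    {
        TimeSpan seed = TimeSpan.FromMilliseconds(100);
        foreach (int ratio in new[] { 2, 4, 10 })
        {
            TimeSpan max = TimeSpan.FromMilliseconds(100 * ratio);
            double fraction = ClampedFraction(seed, max, runs: 1000, retriesPerRun: 10);
            Console.WriteLine($"max = {ratio} x seed: {fraction:P1} of delays sit exactly on seed or max");
        }
    }
}
```

Calling JitterExperiment.PrintComparison() from a console app's Main should show the clamped share shrinking as the ratio between max and seed grows; the exact figures vary from run to run.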

@reisenberger
Member

Closing. Detailed wiki page now created describing how to use Polly with jitter.

@23W

23W commented Jan 2, 2018

Excuse me, why does the loop in DecorrelatedJitter use a "<=" comparison, while (attempt++ <= maxRetries)? I think it should be a strict "<", shouldn't it? So the correct code is:

public static IEnumerable<TimeSpan> DecorrelatedJitter(int maxRetries, TimeSpan seedDelay, TimeSpan maxDelay)
{
    Random jitterer = new Random();
    int attempt = 0;

    double seed = seedDelay.TotalMilliseconds;
    double max = maxDelay.TotalMilliseconds;
    double current = seed;

    while (attempt++ < maxRetries)
    {
        current = Math.Min(max, Math.Max(seed, current * 3 * jitterer.NextDouble()));
        yield return TimeSpan.FromMilliseconds(current);
    }
}

@reisenberger
Member

@23W I agree. I have annotated the above and corrected it in both the wiki page and github gist example. Thank you for catching this.
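For readers who want to see the off-by-one concretely, here is a small sketch that isolates the two loop conditions and counts how many values each yields. The class and method names are hypothetical; only the loop conditions come from the thread. Since the sequence passed to WaitAndRetry supplies one wait per retry, one extra element means one extra retry.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BoundaryCheckSketch
{
    // Original boundary check: yields maxRetries + 1 values.
    public static IEnumerable<int> WithLessOrEqual(int maxRetries)
    {
        int attempt = 0;
        while (attempt++ <= maxRetries)
        {
            yield return attempt;
        }
    }

    // Corrected boundary check: yields exactly maxRetries values.
    public static IEnumerable<int> WithStrictLess(int maxRetries)
    {
        int attempt = 0;
        while (attempt++ < maxRetries)
        {
            yield return attempt;
        }
    }
}
```

With maxRetries = 3, BoundaryCheckSketch.WithLessOrEqual(3).Count() is 4, while BoundaryCheckSketch.WithStrictLess(3).Count() is 3.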

@sahir

sahir commented Sep 25, 2019

@CESARDELATORRE is there any way to perform 25 retries over approximately 21 days using a jitter strategy on the retry policy?

@reisenberger
Member

Hi @sahir , thanks for the q.

Polly only runs in process, in memory, and has no persistent backing store for retry state. It is not designed for retry loops that might span several days. While it's theoretically possible, if the process running such a long retry loop with Polly were to crash, the retry state would be lost. The use cases Polly targets are instead short-lived, transient faults.

There are no plans to take Polly in the direction of having a backing store for long-lived retries, because there are already a number of solutions on the market for scheduled job engines with persistence. For example, in Azure: Azure timer-triggered Functions, Azure Durable Functions with delay, or something built around timer-triggered Azure Logic Apps. In a Windows or web app: Hangfire, Quartz.NET, or timer-driven invocations of a method in a background IHostedService in .NET Core. Some of these are only scheduled-job orchestrators with persistence; within that, you might have to write some code to check whether your task was done or needed further retrying, depending on your needs.

Hope that helps.

@sahir

sahir commented Sep 26, 2019

Hi @reisenberger, thanks for the answer.

I retry failures with an exponential backoff using the formula (retry_count ** 4) + 15 + (rand(30) * (retry_count + 1)) (i.e. 15, 16, 31, 96, 271, ... seconds + a random amount of time). I assume it will perform 25 retries over approximately 21 days. Can you please check the code?

The code is not unit tested yet.

public static class RetryWithExponentialBackoff
    {
        public static Policy GetRetryPolicyHandler()
        {
            Random random       = new Random();
            double maxDelay     = (Math.Pow(25,4) + 15 + (random.Next(30) * 25 + 1));
            TimeSpan seedDelay = TimeSpan.FromMilliseconds(100);
            Policy retryWithDecorrelatedJitter = Policy
                   .Handle<Exception>()
                   .WaitAndRetry(DecorrelatedJitter(25, seedDelay, maxDelay), onRetry: (e, t) => Console.WriteLine($"Retry delay: {t.TotalMilliseconds} ms."));

            return retryWithDecorrelatedJitter;
        }

        public static IEnumerable<TimeSpan> DecorrelatedJitter(int maxRetries, TimeSpan seedDelay, double maxDelay)
        {
            Random jitterer = new Random();
            int retries = 0;

            double seed = seedDelay.TotalMilliseconds;
            double max = maxDelay;
            double current = seed;

            while (++retries <= maxRetries)
            {
                current = Math.Min(max, Math.Max(seed, current * 3 * jitterer.NextDouble()));
                yield return TimeSpan.FromMilliseconds(current);
            }
        }
}
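A note on the arithmetic, as a sketch rather than a verdict: the code above evaluates the quoted formula once, to about 390,640, and passes that number to DecorrelatedJitter as a maximum in milliseconds. Each wait is therefore capped at roughly 6.5 minutes, so 25 retries would span under 3 hours rather than about 21 days. Assuming the intent is to apply the quoted formula per retry, in seconds, with a zero-based retry count, it could be expressed directly as a delay sequence. The class and method names below are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PolynomialBackoffSketch
{
    // Deterministic part of the quoted formula, in seconds: retryCount^4 + 15.
    private static long BaseSeconds(int retryCount)
    {
        long n = retryCount;
        return n * n * n * n + 15;
    }

    // Delay for each zero-based retryCount: retryCount^4 + 15 + rand(30) * (retryCount + 1) seconds.
    // random.Next(30) returns an integer from 0 to 29.
    public static IEnumerable<TimeSpan> Delays(int maxRetries, Random random)
    {
        for (int retryCount = 0; retryCount < maxRetries; retryCount++)
        {
            long seconds = BaseSeconds(retryCount) + (long)random.Next(30) * (retryCount + 1);
            yield return TimeSpan.FromSeconds(seconds);
        }
    }

    // Sum of the deterministic part only, i.e. the shortest possible total wait.
    public static TimeSpan MinimumTotal(int maxRetries)
    {
        long total = Enumerable.Range(0, maxRetries).Sum(n => BaseSeconds(n));
        return TimeSpan.FromSeconds(total);
    }
}
```

For 25 retries the deterministic part sums to 1,763,395 seconds, about 20.4 days, and the random term adds at most 29 * 325 = 9,425 seconds (about 2.6 hours), which is consistent with the "approximately 21 days" estimate. The sequence could be passed to the same WaitAndRetry overload used above.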

@reisenberger
Member

reisenberger commented Sep 26, 2019

@sahir . As long as you are happy that the retry sequence will be lost if the process terminates, a Polly wait-and-retry policy (including jitter variants) can in principle permit waiting before each retry for as long as TimeSpan can be configured for (more than enough ;~).

@sahir This issue #245 is old and the code in this thread is no longer recommended. Our up-to-date jitter documentation is here. Awesome community contributors have helped bring forward new and refined jitter algorithms as part of the Polly.Contrib.WaitAndRetry package.

Let me know if there is anything else we can help with on this (edit: or open an issue on the Polly.Contrib.WaitAndRetry repo if it relates to code there). (Once we are done here, I will probably lock this issue and post a note to indicate clearly that this issue doesn't reflect current recommended jitter practice.)

@reisenberger
Member

reisenberger commented Sep 26, 2019

@sahir I wanted to be sure to answer your question: "can you please check the code". Was the goal to assess the distribution of retry intervals the code would generate? (Edit: If not, please could you clarify?)

Most jitter formulae return an IEnumerable<TimeSpan> to use in the Polly policy. To explore typical retry delays generated (and experiment with the effect of changing parameter values), I would suggest starting by Console.WriteLine-ing from a small console app:

using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        IEnumerable<TimeSpan> timeSpans = ... // get the enumerable you are trialling
        foreach (TimeSpan timeSpan in timeSpans)
        {
            // Use a standard timespan format string (https://docs.microsoft.com/en-us/dotnet/standard/base-types/standard-timespan-format-strings)
            // or a custom format string (https://docs.microsoft.com/en-us/dotnet/standard/base-types/custom-timespan-format-strings) if you want.
            Console.WriteLine(timeSpan.ToString());
            // or use `Console.WriteLine(timeSpan.TotalSeconds)` or similar, to get pure numbers.
        }
    }
}

Of course, that only shows results one "run" at a time - you need to run it a few times to get a feel for the results.

If you need a more sophisticated approach, you can run a similar experiment a large number of times and aggregate the results. For example, for the new jitter algorithm in Polly.Contrib.WaitAndRetry, we aggregated data over 100000 runs and then graphed this data to assess the distribution.
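One possible shape for such an aggregation, as a rough sketch: run the generator many times and collect the minimum, mean and maximum delay at each retry position. This is not the code used for the Polly.Contrib.WaitAndRetry analysis; the class name, method name and sequenceFactory parameter are hypothetical, and the sketch assumes every run yields at least retries values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JitterAggregationSketch
{
    // Returns (min, mean, max) in milliseconds for each retry position across all runs.
    // sequenceFactory is an assumed hook: it should return a fresh delay sequence per call.
    public static (double Min, double Mean, double Max)[] Aggregate(
        Func<IEnumerable<TimeSpan>> sequenceFactory, int retries, int runs)
    {
        double[] min = Enumerable.Repeat(double.MaxValue, retries).ToArray();
        double[] max = new double[retries];
        double[] sum = new double[retries];

        for (int run = 0; run < runs; run++)
        {
            int i = 0;
            foreach (TimeSpan delay in sequenceFactory().Take(retries))
            {
                double ms = delay.TotalMilliseconds;
                min[i] = Math.Min(min[i], ms);
                max[i] = Math.Max(max[i], ms);
                sum[i] += ms;
                i++;
            }
        }

        return Enumerable.Range(0, retries)
            .Select(i => (min[i], sum[i] / runs, max[i]))
            .ToArray();
    }
}
```

The resulting array can be written out with Console.WriteLine or exported for graphing; passing a lambda that calls the jitter method under test as sequenceFactory keeps the aggregation independent of any particular formula.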

@sahir

sahir commented Sep 27, 2019

@reisenberger Thanks for the answer.

@reisenberger
Member

@sahir . If your goal is to generate a broadly exponential backoff (1, 2, 4, 8 seconds etc) with some additional randomness, try the new jitter algorithm in Polly.Contrib.WaitAndRetry; the new algorithm is much more strongly correlated to exponential backoff than the one originally in this thread.

Going to lock this thread now to make it clearer that the jitter strategies here are no longer recommended, but @sahir : if we can help further, please do open another issue on Polly, or on Polly.Contrib.WaitAndRetry if your question is more directly related to the formula there.

@reisenberger
Member

Notice: The jitter code in this thread is no longer the Polly recommendation for jitter. Please see our jitter documentation for latest information.

@App-vNext App-vNext locked and limited conversation to collaborators Sep 28, 2019