Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature request]: Resilience Event occured logs should be configurable as Warning unless it's the final attempt that failed #2162

Closed
sander1095 opened this issue Jun 24, 2024 · 11 comments
Assignees

Comments

@sander1095
Copy link

Is your feature request related to a specific problem? Or an existing feature?

Polly reports Resilience Event occurred .. logs as Error, even when it's the first attempt that failed and multiple attempts will still be made.

This makes the log show up as an error, possibly setting off alerts even though it isn't really an error until the retry mechanisms have truly failed.

Describe the solution you'd like

I would like to be able to configure failed retries to get logged as Warning. When the final attempt fails, that could then be logged as an error.

This means that I will only see true errors (that might need investigation) in my error screen.

Additional context

No response

@martincostello
Copy link
Member

Have you tried doing this with the changes from #2072 that are in Polly 8.4.0?

@sander1095
Copy link
Author

I seem to have overlooked that one. I just had a look, but it seems that you can only set the severity based on the (original) Severity and the EventName. I do not think that is enough to implement this feature request, as I would also need something like IsFinalAttempt or Attempt and MaxAttempts.

Also, I'm using Microsoft.Extensions.Http.Resilience instead of a direct dependency on Polly. I do not see a way to do access the TelemetryOptions in the same way when using that package. However, that's an issue related to that package and not Polly directly.

@martintmk
Copy link
Contributor

@martincostello

I am thinking that this might be even good default behavior for retry strategy? Individual attempts are warnings while the final one is error?

@martincostello
Copy link
Member

I guess that's a reasonable if we have access to the context to know if it's the last attempt.

Copy link
Contributor

This issue is stale because it has been open for 60 days with no activity. It will be automatically closed in 14 days if no further updates are made.

@github-actions github-actions bot added the stale Stale issues or pull requests label Aug 28, 2024
@sander1095
Copy link
Author

Any update on this? Would it be feasible and a useful addition?

/Cc @martincostello @martintmk

@martincostello
Copy link
Member

No update from us. Feel free to open a pull request to propose this change, then we can take a look.

@martincostello martincostello added help wanted A change up for grabs for contributions from the community and removed stale Stale issues or pull requests labels Aug 30, 2024
@peter-csala
Copy link
Contributor

I'll be off in the next two weeks, but I can start work on this request on the 16th of September week if that's not too late.

@peter-csala
Copy link
Contributor

As I can see no one has jump on this request in my absence so, @martincostello please assign this ticket to me.

@martincostello martincostello removed the help wanted A change up for grabs for contributions from the community label Sep 16, 2024
@peter-csala
Copy link
Contributor

The current state

OnRetry

The OnRetry telemetry events are always reported as Warning:

_telemetry.Report<OnRetryArguments<T>, T>(new(ResilienceEventSeverity.Warning, RetryConstants.OnRetryEvent), onRetryArgs);

  • These are reported after a failed and handled attempt but before the next attempt.
    • That means this event will not occur after the final attempt.
  • Also please bear in mind these events start with the Resilience Event occurred ... message

ExecutionAttempt

The ExecutionAttempt telemetry events are either reported as Warning or as Information whether the outcome is handled or not (respectively):

TelemetryUtil.ReportExecutionAttempt(_telemetry, context, outcome, attempt, executionTime, handle);

new(handled ? ResilienceEventSeverity.Warning : ResilienceEventSeverity.Information, ExecutionAttempt),

  • These are reported just before the next delay is calculated.
    • More precisely before the strategy decides whether it should have a new attempt or not.
    • So, this event does occur after the final attempt.
  • These events start with the Execution attempt. Source:... message

The desired state

So, what we could do is to change the ExecutionAttempt event's severity to Error if we run out of attempts. In other words, if the execution is cancelled or the final attempt's outcome is not handled then the severity will be either Warning or Information (respectively).

Basically changing this logic

TelemetryUtil.ReportExecutionAttempt(_telemetry, context, outcome, attempt, executionTime, handle);
if (context.CancellationToken.IsCancellationRequested || IsLastAttempt(attempt, out bool incrementAttempts) || !handle)
{
return outcome;
}

to something like this:

var isLastAttempt = IsLastAttempt(attempt, out bool incrementAttempts);
if (isLastAttempt && handle)
{
    TelemetryUtil.ReportFinalExecutionAttempt(_telemetry, context, outcome, attempt, executionTime, handle);
}
else
{
    TelemetryUtil.ReportExecutionAttempt(_telemetry, context, outcome, attempt, executionTime, handle);
}

if (context.CancellationToken.IsCancellationRequested || isLastAttempt || !handle)
{
    return outcome;
}

@sander1095 Is this what you are looking for?

@sander1095
Copy link
Author

Sounds good, @peter-csala . I am pretty sure I did see resilience logged as error in non-error cases, but that might be related to Microsoft.Extensions.Http.Resilience and not Polly directly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants