Include troubleshooting guide link in exception messages #31155

Merged 2 commits on Sep 15, 2022
44 changes: 2 additions & 42 deletions sdk/servicebus/Azure.Messaging.ServiceBus/README.md
@@ -12,7 +12,7 @@ Use the client library for Azure Service Bus to:

 - Implement complex workflows: message sessions support scenarios that require message ordering or message deferral.

-[Source code](https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/servicebus/Azure.Messaging.ServiceBus/src) | [Package (NuGet)](https://www.nuget.org/packages/Azure.Messaging.ServiceBus/) | [API reference documentation](https://docs.microsoft.com/dotnet/api/azure.messaging.servicebus) | [Product documentation](https://docs.microsoft.com/azure/service-bus/) | [Migration guide](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/servicebus/Azure.Messaging.ServiceBus/MigrationGuide.md)
+[Source code](https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/servicebus/Azure.Messaging.ServiceBus/src) | [Package (NuGet)](https://www.nuget.org/packages/Azure.Messaging.ServiceBus/) | [API reference documentation](https://docs.microsoft.com/dotnet/api/azure.messaging.servicebus) | [Product documentation](https://docs.microsoft.com/azure/service-bus/) | [Migration guide](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/servicebus/Azure.Messaging.ServiceBus/MigrationGuide.md) | [Troubleshooting guide](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/servicebus/Azure.Messaging.ServiceBus/TROUBLESHOOTING.md)

 ## Getting started

@@ -427,47 +427,7 @@ ServiceBusClient client = new ServiceBusClient(fullyQualifiedNamespace, new Defa

 ## Troubleshooting

-### Exception handling
-
-#### Service Bus Exception
-
-A `ServiceBusException` is triggered when an operation specific to Service Bus has encountered an issue, including both errors within the service and specific to the client. The exception includes some contextual information to assist in understanding the context of the error and its relative severity. These are:
-
-- `IsTransient` : This identifies whether or not the exception is considered recoverable. In the case where it was deemed transient, the appropriate retry policy has already been applied and retries were unsuccessful.
-
-- `Reason` : Provides a set of well-known reasons for the failure that help to categorize and clarify the root cause. These are intended to allow for applying exception filtering and other logic where inspecting the text of an exception message wouldn't be ideal. Some key failure reasons are:
-
-- **Service Timeout** : This indicates that the Service Bus service did not respond to an operation within the expected amount of time. This may have been caused by a transient network issue or service problem. The Service Bus service may or may not have successfully completed the request; the status is not known. It is recommended to attempt to verify the current state and retry if necessary.
-
-- **Message Lock Lost** : This can occur if the processing takes longer than the lock duration specified at the entity level for a message. If this error occurs consistently, it may be worth increasing the message lock duration. Otherwise, callers can renew the message lock while they are processing the message to ensure that this error doesn't occur.
-
-- **Messaging Entity Not Found**: A Service Bus resource, such as a queue, topic, or subscription could not be found by the Service Bus service. This may indicate that it has been deleted from the service or that there is an issue with the Service Bus service itself.
-
-Reacting to a specific failure reason for the `ServiceBusException` can be accomplished in several ways, such as by applying an exception filter clause as part of the `catch` block:
-
-```C# Snippet:ServiceBusExceptionFailureReasonUsage
-try
-{
-    // Receive messages using the receiver client
-}
-catch (ServiceBusException ex) when
-    (ex.Reason == ServiceBusFailureReason.ServiceTimeout)
-{
-    // Take action based on a service timeout
-}
-```
-
-#### Other exceptions
-
-For detailed information about the failures represented by the `ServiceBusException` and other exceptions that may occur, please refer to [Service Bus messaging exceptions](https://docs.microsoft.com/azure/service-bus-messaging/service-bus-messaging-exceptions).
-
-### Logging and diagnostics
-
-The Service Bus client library is fully instrumented for logging information at various levels of detail using the .NET `EventSource` to emit information. Logging is performed for each operation and follows the pattern of marking the starting point of the operation and either its completion or exceptions encountered. Additional information that may offer insight is also logged in the context of the associated operation.
-
-The Service Bus client logs are available to any `EventListener` by opting into the source named "Azure-Messaging-ServiceBus" or opting into all sources that have the trait "AzureEventSource". To make capturing logs from the Azure client libraries easier, the `Azure.Core` library used by Service Bus offers an `AzureEventSourceListener`. More information can be found in the [Azure.Core Diagnostics sample](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/core/Azure.Core/samples/Diagnostics.md#logging).
-
-The Service Bus client library is also instrumented for distributed tracing using Application Insights or OpenTelemetry. More information can be found in the [Azure.Core Diagnostics sample](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/core/Azure.Core/samples/Diagnostics.md#distributed-tracing).
+Please refer to the [Service Bus Troubleshooting Guide](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/servicebus/Azure.Messaging.ServiceBus/TROUBLESHOOTING.md).

 ## Next steps

14 changes: 9 additions & 5 deletions sdk/servicebus/Azure.Messaging.ServiceBus/TROUBLESHOOTING.md
@@ -60,20 +60,24 @@ The exception includes some contextual information to assist in understanding th

 - `Reason` : Provides a set of well-known reasons for the failure that help to categorize and clarify the root cause. These are intended to allow for applying exception filtering and other logic where inspecting the text of an exception message wouldn't be ideal. Some key failure reasons are:

-- **Service Timeout** : This indicates that the Service Bus service did not respond to an operation within the expected amount of time. This may have been caused by a transient network issue or service problem. The Service Bus service may or may not have successfully completed the request; the status is not known. In the case of accepting the next available session, this exception indicates that there were no unlocked sessions available in the entity. These are transient errors that will be automatically retried.
+- **ServiceTimeout** : This indicates that the Service Bus service did not respond to an operation within the expected amount of time. This may have been caused by a transient network issue or service problem. The Service Bus service may or may not have successfully completed the request; the status is not known. In the case of accepting the next available session, this exception indicates that there were no unlocked sessions available in the entity. These are transient errors that will be automatically retried.

-- **Quota Exceeded** : This typically indicates that there are too many active receive operations for a single entity. In order to avoid this error, reduce the number of potential concurrent receives. You can use batch receives to attempt to receive multiple messages per receive request. Please see [Service Bus quotas][ServiceBusQuotas] for more information.
+- **QuotaExceeded** : This typically indicates that there are too many active receive operations for a single entity. In order to avoid this error, reduce the number of potential concurrent receives. You can use batch receives to attempt to receive multiple messages per receive request. Please see [Service Bus quotas][ServiceBusQuotas] for more information.

-- **Message Size Exceeded** : This indicates that the max message size has been exceeded. The message size includes the body of the message, as well as any associated metadata and system overhead. The best approach for resolving this error is to reduce the number of messages being sent in a batch or the size of the body included in the message. Because size limits are subject to change, please refer to [Service Bus quotas][ServiceBusQuotas] for specifics.
+- **MessageSizeExceeded** : This indicates that the max message size has been exceeded. The message size includes the body of the message, as well as any associated metadata and system overhead. The best approach for resolving this error is to reduce the number of messages being sent in a batch or the size of the body included in the message. Because size limits are subject to change, please refer to [Service Bus quotas][ServiceBusQuotas] for specifics.

-- **MessageLockLost** : This indicates that the lock on the message is lost. Callers should attempt to receive and process the message again. This only applies to non-session entities. This error occurs if processing takes longer than the lock duration and the message lock is not renewed. Note that this error can also occur when the link is detached due to a transient network issue or when the link is idle for 10 minutes.
+- **MessageLockLost** : This indicates that the lock on the message is lost. Callers should attempt to receive and process the message again. This only applies to non-session entities. This error occurs if processing takes longer than the lock duration and the message lock is not renewed. Note that this error can also occur when the link is detached due to a transient network issue or when the link is idle for 10 minutes. See [Message or session lock is lost before lock expiration time](#message-or-session-lock-is-lost-before-lock-expiration-time) for more information.

-- **SessionLockLost**: This indicates that the lock on the session has expired. Callers should attempt to accept the session again. This only applies to session-enabled entities. This error occurs if processing takes longer than the lock duration and the session lock is not renewed. Note that this error can also occur when the link is detached due to a transient network issue or when the link is idle for 10 minutes.
+- **SessionLockLost**: This indicates that the lock on the session has expired. Callers should attempt to accept the session again. This only applies to session-enabled entities. This error occurs if processing takes longer than the lock duration and the session lock is not renewed. Note that this error can also occur when the link is detached due to a transient network issue or when the link is idle for 10 minutes. See [Message or session lock is lost before lock expiration time](#message-or-session-lock-is-lost-before-lock-expiration-time) for more information.

 - **MessageNotFound**: This occurs when attempting to receive a deferred message by sequence number for a message that either doesn't exist in the entity, or is currently locked.

 - **SessionCannotBeLocked**: This indicates that the requested session cannot be locked because the lock is already held elsewhere. Once the lock expires, the session can be accepted.

+- **GeneralError**: This indicates that the Service Bus service encountered an error while processing the request. This is often caused by service upgrades and restarts. These are transient errors that will be automatically retried.
+
+- **ServiceCommunicationProblem**: This indicates that there was an error communicating with the service. The issue may stem from a transient network problem, or a service problem. These are transient errors that will be automatically retried.
+
 ### Other common exceptions

 - **ArgumentException** : An exception deriving from `ArgumentException` is thrown by clients when a parameter provided when interacting with the client is invalid. Information about the specific parameter and the nature of the problem can be found in the `Message`.
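
As an illustration of how a caller might branch on the `Reason` values described in the TROUBLESHOOTING.md hunk above, here is a minimal sketch. `FailureReasonTriage` and `SuggestAction` are hypothetical names that are not part of the SDK or of this PR; the suggested actions only paraphrase the guide text, and the block assumes the `Azure.Messaging.ServiceBus` package is referenced.

```C#
using System;
using Azure.Messaging.ServiceBus;

// Hypothetical helper, not part of the SDK: maps a failure reason to the
// follow-up action that the troubleshooting guide describes for it.
public static class FailureReasonTriage
{
    public static string SuggestAction(ServiceBusException ex)
    {
        switch (ex.Reason)
        {
            case ServiceBusFailureReason.MessageLockLost:
                return "Receive and process the message again.";
            case ServiceBusFailureReason.SessionLockLost:
                return "Accept the session again.";
            case ServiceBusFailureReason.SessionCannotBeLocked:
                return "Wait for the existing session lock to expire, then accept the session.";
            case ServiceBusFailureReason.QuotaExceeded:
                return "Reduce concurrent receives or use batch receives.";
            case ServiceBusFailureReason.MessageSizeExceeded:
                return "Send fewer messages per batch or reduce the message body size.";
            default:
                // For transient failures the retry policy has already been applied
                // by the time the exception reaches the caller.
                return ex.IsTransient
                    ? "Transient failure: retries were already attempted without success."
                    : "Non-transient failure: inspect the exception message.";
        }
    }
}
```

A possible use is `FailureReasonTriage.SuggestAction(ex)` inside a `catch (ServiceBusException ex)` block, as an alternative to one exception filter per reason.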
@@ -109,18 +109,18 @@ public static Exception ToMessagingContractException(string condition, string me

             if (string.Equals(condition, AmqpErrorCode.NotImplemented.Value, StringComparison.InvariantCultureIgnoreCase))
             {
-                return new NotSupportedException(message);
+                return new NotSupportedException(EnrichMessage(message));
             }

             if (string.Equals(condition, AmqpErrorCode.NotAllowed.Value, StringComparison.InvariantCultureIgnoreCase))
             {
-                return new InvalidOperationException(message);
+                return new InvalidOperationException(EnrichMessage(message));
             }

             if (string.Equals(condition, AmqpErrorCode.UnauthorizedAccess.Value, StringComparison.InvariantCultureIgnoreCase) ||
                 string.Equals(condition, AmqpClientConstants.AuthorizationFailedError.Value, StringComparison.InvariantCultureIgnoreCase))
             {
-                return new UnauthorizedAccessException(message);
+                return new UnauthorizedAccessException(EnrichMessage(message));
             }

             if (string.Equals(condition, AmqpClientConstants.ServerBusyError.Value, StringComparison.InvariantCultureIgnoreCase))
@@ -130,12 +130,12 @@ public static Exception ToMessagingContractException(string condition, string me

             if (string.Equals(condition, AmqpClientConstants.ArgumentError.Value, StringComparison.InvariantCultureIgnoreCase))
             {
-                return new ArgumentException(message);
+                return new ArgumentException(EnrichMessage(message));
             }

             if (string.Equals(condition, AmqpClientConstants.ArgumentOutOfRangeError.Value, StringComparison.InvariantCultureIgnoreCase))
             {
-                return new ArgumentOutOfRangeException(message);
+                return new ArgumentOutOfRangeException(EnrichMessage(message));
             }

             if (string.Equals(condition, AmqpClientConstants.EntityDisabledError.Value, StringComparison.InvariantCultureIgnoreCase))
@@ -278,5 +278,7 @@ public static Exception GetInnerException(this AmqpObject amqpObject)

             return innerException == null ? null : TranslateException(innerException, null, null, connectionError);
         }
+
+        private static string EnrichMessage(string message) => $"{message}{Environment.NewLine}{Constants.TroubleshootingMessage}";
     }
 }
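
To make the effect of `EnrichMessage` concrete, the following standalone sketch reproduces it outside the SDK. `EnrichMessageSketch` is a hypothetical name, and the constant is a local copy of the `Constants.TroubleshootingMessage` value added in this PR, since the real constant is internal.

```C#
using System;

// Hypothetical standalone copy of the helper added in this PR.
public static class EnrichMessageSketch
{
    // Local copy of Constants.TroubleshootingMessage (internal in the SDK).
    private const string TroubleshootingMessage =
        "For troubleshooting information, see https://aka.ms/azsdk/net/servicebus/exceptions/troubleshoot.";

    // Appends the troubleshooting link on its own line after the service message.
    public static string EnrichMessage(string message) =>
        $"{message}{Environment.NewLine}{TroubleshootingMessage}";
}
```

For example, `new InvalidOperationException(EnrichMessageSketch.EnrichMessage("Operation not allowed."))` would carry the original text on the first line and the troubleshooting link on the second.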
7 changes: 7 additions & 0 deletions sdk/servicebus/Azure.Messaging.ServiceBus/src/Constants.cs
@@ -49,5 +49,12 @@ internal static class Constants
         public const int WellKnownPublicPortsLimit = 1023;

         public const string DefaultScope = "https://servicebus.azure.net/.default";
+
+        /// <summary>
+        /// The message appended to exceptions returned from the service that contains a link to the troubleshooting guide.
+        /// Usage errors with obvious causes do not contain this message.
+        /// </summary>
+        public const string TroubleshootingMessage =
+            "For troubleshooting information, see https://aka.ms/azsdk/net/servicebus/exceptions/troubleshoot.";
     }
 }
@@ -12,30 +12,26 @@ namespace Azure.Messaging.ServiceBus
     /// </summary>
     ///
     /// <seealso cref="System.Exception" />
-    ///
     public class ServiceBusException : Exception
     {
         /// <summary>
         /// Indicates whether an exception should be considered transient or final.
         /// </summary>
         ///
         /// <value><c>true</c> if the exception is likely transient; otherwise, <c>false</c>.</value>
-        ///
         public bool IsTransient { get; }

         /// <summary>
         /// The reason for the failure of a Service Bus operation that resulted
         /// in the exception.
         /// </summary>
-        ///
         public ServiceBusFailureReason Reason { get; }

         /// <summary>
         /// The name of the Service Bus entity to which the exception is associated.
         /// </summary>
         ///
         /// <value>The name of the Service Bus entity, if available; otherwise, <c>null</c>.</value>
-        ///
         public string EntityPath { get; }

         /// <summary>
@@ -46,7 +42,6 @@ public class ServiceBusException : Exception
         /// <summary>
         /// Gets a message that describes the current exception.
         /// </summary>
-        ///
         public override string Message
         {
             get
@@ -55,16 +50,18 @@ public override string Message
                 {
                     return string.Format(
                         CultureInfo.InvariantCulture,
-                        "{0} ({1})",
+                        "{0} ({1}). {2}",
                         base.Message,
-                        Reason);
+                        Reason,
+                        Constants.TroubleshootingMessage);
                 }
                 return string.Format(
                     CultureInfo.InvariantCulture,
-                    "{0} ({1} - {2})",
+                    "{0} ({1} - {2}). {3}",
                     base.Message,
                     EntityPath,
-                    Reason);
+                    Reason,
+                    Constants.TroubleshootingMessage);
             }
         }
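
The two format strings above can be exercised in isolation to see the resulting text. This is a sketch under stated assumptions: `MessageFormatSketch` is a hypothetical stand-in for the `Message` getter, the null-or-empty check on the entity path is assumed because that line falls outside the hunk, and the constant is a local copy of `Constants.TroubleshootingMessage`.

```C#
using System.Globalization;

// Hypothetical stand-in for ServiceBusException.Message after this change.
public static class MessageFormatSketch
{
    // Local copy of Constants.TroubleshootingMessage (internal in the SDK).
    private const string TroubleshootingMessage =
        "For troubleshooting information, see https://aka.ms/azsdk/net/servicebus/exceptions/troubleshoot.";

    public static string Format(string baseMessage, string reason, string entityPath = null)
    {
        // Assumption: the real getter chooses the branch with a similar check.
        if (string.IsNullOrEmpty(entityPath))
        {
            return string.Format(
                CultureInfo.InvariantCulture,
                "{0} ({1}). {2}",
                baseMessage,
                reason,
                TroubleshootingMessage);
        }

        return string.Format(
            CultureInfo.InvariantCulture,
            "{0} ({1} - {2}). {3}",
            baseMessage,
            entityPath,
            reason,
            TroubleshootingMessage);
    }
}
```

Under these assumptions, `MessageFormatSketch.Format("test", "ServiceTimeout")` yields `test (ServiceTimeout). For troubleshooting information, see https://aka.ms/azsdk/net/servicebus/exceptions/troubleshoot.`, and passing `"my-queue"` as the entity path yields `test (my-queue - ServiceTimeout). ` followed by the same link sentence.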

@@ -77,7 +74,6 @@ public override string Message
         /// <param name="reason">The reason for the failure that resulted in the exception.</param>
         /// <param name="entityPath">The name of the Service Bus entity to which the exception is associated.</param>
         /// <param name="innerException"></param>
-        ///
         public ServiceBusException(
             string message,
             ServiceBusFailureReason reason,
@@ -108,7 +104,6 @@ public ServiceBusException(
         /// <param name="entityName">The name of the Service Bus entity to which the exception is associated.</param>
         /// <param name="reason">The reason for the failure that resulted in the exception.</param>
         /// <param name="innerException">The exception that is the cause of the current exception, or a null reference if no inner exception is specified.</param>
-        ///
         public ServiceBusException(bool isTransient,
                                    string message,
                                    string entityName = default,
@@ -0,0 +1,33 @@
+// Copyright (c) Microsoft Corporation. All rights reserved.
+// Licensed under the MIT License.
+
+using System;
+using System.Collections.Generic;
+using NUnit.Framework;
+
+namespace Azure.Messaging.ServiceBus.Tests
+{
+    public class ServiceBusExceptionTests
+    {
+        public static IEnumerable<object> GetFailureReasons()
+        {
+            foreach (ServiceBusFailureReason reason in Enum.GetValues(typeof(ServiceBusFailureReason)))
+            {
+                yield return new object[] { reason, true };
+                yield return new object[] { reason, false };
+            }
+        }
+
+        [Test]
+        [TestCaseSource(nameof(GetFailureReasons))]
+        public void MessageIncludesTroubleshootingGuideLink(ServiceBusFailureReason reason, bool includeEntityPath)
+        {
+            var exception = new ServiceBusException("test", reason, entityPath: includeEntityPath ? "entityPath" : null);
+            StringAssert.Contains(Constants.TroubleshootingMessage, exception.Message);
+
+            // test the other constructor
+            exception = new ServiceBusException(true, "test", reason: reason);
+            StringAssert.Contains(Constants.TroubleshootingMessage, exception.Message);
+        }
+    }
+}