TaskCanceledException thrown from ReadItemStreamAsync #1589

Closed
paulhickman-a365 opened this issue Jun 3, 2020 · 6 comments
Labels
bug Something isn't working needs-investigation

Comments

@paulhickman-a365

Describe the bug

When invoking ReadItemStreamAsync without passing a cancellation token, it threw a TaskCanceledException. I assume a transient issue occurred to cause the error.

However, the bug I am reporting is that it threw the exception, rather than returning a response that we can examine for the cause of the error or invoke ThrowExceptionIfUnsuccessful on.

It may or may not have also bypassed the built-in retry logic for errors.

To Reproduce

This is intermittent and has only happened to us once in millions of executions of the same query pattern. The issue isn't that the call fails, but that it throws an unexpected exception type.

Expected behavior

When its internal tasks are cancelled, a failed call to ReadItemStreamAsync should return a response with error information that can be turned into a CosmosException by calling ThrowExceptionIfUnsuccessful().

Actual behavior

A call to ReadItemStreamAsync can sometimes throw a TaskCanceledException even though no cancellation token was passed in.

Environment summary

SDK Version: 3.8
OS Version: Azure Functions V3 host

We have updated our SDK to 3.9, but this is a rare issue and we haven't reproduced it there.

Additional context
Our application logging captured a partial stack trace. It strips lines that contain only async continuation information.

at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken)     
at System.Net.Http.HttpConnectionPool.SendWithNtConnectionAuthAsync(HttpConnection connection, HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)     
at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)    
at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)     
at System.Net.Http.DiagnosticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)     
at Microsoft.Azure.Cosmos.DocumentClient.HttpRequestMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)    
at System.Net.Http.HttpClient.FinishSendAsyncUnbuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)     
at Microsoft.Azure.Cosmos.ClientExtensions.GetAsync(HttpClient client, Uri uri, INameValueCollection additionalHeaders, CancellationToken cancellationToken)    
at Microsoft.Azure.Cosmos.Routing.GatewayAddressCache.GetServerAddressesViaGatewayAsync(DocumentServiceRequest request, String collectionRid, IEnumerable`1 partitionKeyRangeIds, Boolean forceRefresh)    
at Microsoft.Azure.Cosmos.Routing.GatewayAddressCache.GetAddressesForRangeIdAsync(DocumentServiceRequest request, String collectionRid, String partitionKeyRangeId, Boolean forceRefresh)     
at Microsoft.Azure.Cosmos.Common.AsyncCache`2.GetAsync(TKey key, TValue obsoleteValue, Func`1 singleValueInitFunc, CancellationToken cancellationToken, Boolean forceRefresh)    
at Microsoft.Azure.Cosmos.Routing.GatewayAddressCache.TryGetAddressesAsync(DocumentServiceRequest request, PartitionKeyRangeIdentity partitionKeyRangeIdentity, ServiceIdentity serviceIdentity, Boolean forceRefreshPartitionAddresses, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.AddressResolver.TryResolveServerPartitionAsync(DocumentServiceRequest request, ContainerProperties collection, CollectionRoutingMap routingMap, Boolean collectionCacheIsUptodate, Boolean collectionRoutingMapCacheIsUptodate, Boolean forceRefreshPartitionAddresses, CancellationToken cancellationToken)    
at Microsoft.Azure.Cosmos.AddressResolver.ResolveAddressesAndIdentityAsync(DocumentServiceRequest request, Boolean forceRefreshPartitionAddresses, CancellationToken cancellationToken)    
at Microsoft.Azure.Cosmos.AddressResolver.ResolveAsync(DocumentServiceRequest request, Boolean forceRefreshPartitionAddresses, CancellationToken cancellationToken)    
at Microsoft.Azure.Documents.AddressSelector.ResolveAddressesAsync(DocumentServiceRequest request, Boolean forceAddressRefresh)    
at Microsoft.Azure.Documents.AddressSelector.ResolveAllUriAsync(DocumentServiceRequest request, Boolean includePrimary, Boolean forceRefresh)   
at Microsoft.Azure.Documents.StoreReader.ReadMultipleReplicasInternalAsync(DocumentServiceRequest entity, Boolean includePrimary, Int32 replicaCountToRead, Boolean requiresValidLsn, Boolean useSessionToken, ReadMode readMode, Boolean checkMinLSN, Boolean forceReadAll)  
at Microsoft.Azure.Documents.StoreReader.ReadMultipleReplicaAsync(DocumentServiceRequest entity, Boolean includePrimary, Int32 replicaCountToRead, Boolean requiresValidLsn, Boolean useSessionToken, ReadMode readMode, Boolean checkMinLSN, Boolean forceReadAll)     
at Microsoft.Azure.Documents.QuorumReader.ReadQuorumAsync(DocumentServiceRequest entity, Int32 readQuorum, Boolean includePrimary, ReadMode readMode)    
at Microsoft.Azure.Documents.QuorumReader.ReadStrongAsync(DocumentServiceRequest entity, Int32 readQuorumValue, ReadMode readMode)  
at Microsoft.Azure.Documents.ReplicatedResourceClient.<>c__DisplayClass27_0.<<InvokeAsync>b__0>d.MoveNext()    
at Microsoft.Azure.Documents.RequestRetryUtility.ProcessRequestAsync[TRequest,IRetriableResponse](Func`1 executeAsync, Func`1 prepareRequest, IRequestRetryPolicy`2 policy, CancellationToken cancellationToken, Func`1 inBackoffAlternateCallbackMethod, Nullable`1 minBackoffForInBackoffCallback)   
at Microsoft.Azure.Documents.ShouldRetryResult.ThrowIfDoneTrying(ExceptionDispatchInfo capturedException)    
at Microsoft.Azure.Documents.RequestRetryUtility.ProcessRequestAsync[TRequest,IRetriableResponse](Func`1 executeAsync, Func`1 prepareRequest, IRequestRetryPolicy`2 policy, CancellationToken cancellationToken, Func`1 inBackoffAlternateCallbackMethod, Nullable`1 minBackoffForInBackoffCallback) 
at Microsoft.Azure.Documents.StoreClient.ProcessMessageAsync(DocumentServiceRequest request, CancellationToken cancellationToken, IRetryPolicy retryPolicy, Func`2 prepareRequestAsyncDelegate)    
at Microsoft.Azure.Documents.ServerStoreModel.ProcessMessageAsync(DocumentServiceRequest request, CancellationToken cancellationToken)    
at Microsoft.Azure.Cosmos.Handlers.TransportHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)   
at Microsoft.Azure.Cosmos.RequestHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)    
at Microsoft.Azure.Cosmos.Handlers.AbstractRetryHandler.ExecuteHttpRequestAsync(Func`1 callbackMethod, Func`3 callShouldRetry, Func`3 callShouldRetryException, CosmosDiagnosticsContext diagnosticsContext, CancellationToken cancellationToken)    
at Microsoft.Azure.Cosmos.Handlers.AbstractRetryHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)    
at Microsoft.Azure.Cosmos.RequestHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)    
at Microsoft.Azure.Cosmos.Handlers.RequestInvokerHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)    
at Microsoft.Azure.Cosmos.Handlers.RequestInvokerHandler.SendAsync(Uri resourceUri, ResourceType resourceType, OperationType operationType, RequestOptions requestOptions, ContainerCore cosmosContainerCore, Nullable`1 partitionKey, Stream streamPayload, Action`1 requestEnricher, CosmosDiagnosticsContext diagnosticsContext, CancellationToken cancellationToken)    
at Microsoft.Azure.Cosmos.ContainerCore.ProcessItemStreamAsync(Nullable`1 partitionKey, String itemId, Stream streamPayload, OperationType operationType, ItemRequestOptions requestOptions, CosmosDiagnosticsContext diagnosticsContext, CancellationToken cancellationToken)    
at Microsoft.Azure.Cosmos.ContainerCore.ReadItemStreamAsync(String id, PartitionKey partitionKey, ItemRequestOptions requestOptions, CancellationToken cancellationToken)    

@paulhickman-a365 changed the title from "HTTP Exception not wrapped in a CosmosException" to "TaskCanceledException thrown from ReadItemStreamAsync" on Jun 3, 2020
@j82w j82w added bug Something isn't working needs-investigation labels Jun 3, 2020
@paulhickman-a365
Author

I've now also seen an OperationCanceledException being thrown. In this case, our application didn't record the full stack trace, so I can't tell which method was being called, but it could only have been ReadItemStreamAsync or FeedIterator.ReadNextAsync on the results of a call to GetItemQueryStreamIterator.

 at System.Threading.CancellationToken.ThrowOperationCanceledException()     
 at System.Threading.CancellationToken.ThrowIfCancellationRequested()     
 at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)     
 at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.GetResult(Int16 token)     
 at System.Net.FixedSizeReader.ReadPacketAsync(Stream transport, AsyncProtocolRequest request)     
 at System.Net.Security.SslStream.ThrowIfExceptional()     
 at System.Net.Security.SslStream.InternalEndProcessAuthentication(LazyAsyncResult lazyResult)     
 at System.Net.Security.SslStream.EndProcessAuthentication(IAsyncResult result)     
 at System.Net.Security.SslStream.EndAuthenticateAsClient(IAsyncResult asyncResult)     
 at System.Net.Security.SslStream.<>c.<AuthenticateAsClientAsync>b__65_1(IAsyncResult iar)     
 at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)     
 at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)    
 at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean allowHttp2, CancellationToken cancellationToken)    
 at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)    
 at System.Net.Http.HttpConnectionPool.GetHttpConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)    
 at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)    
 at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)    
 at System.Net.Http.DiagnosticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)    
 at Microsoft.Azure.Cosmos.DocumentClient.HttpRequestMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)    
 at System.Net.Http.HttpClient.FinishSendAsyncUnbuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)    
 at Microsoft.Azure.Cosmos.ClientExtensions.GetAsync(HttpClient client, Uri uri, INameValueCollection additionalHeaders, CancellationToken cancellationToken)    
 at Microsoft.Azure.Cosmos.Routing.GatewayAddressCache.GetServerAddressesViaGatewayAsync(DocumentServiceRequest request, String collectionRid, IEnumerable`1 partitionKeyRangeIds, Boolean forceRefresh)    
 at Microsoft.Azure.Cosmos.Routing.GatewayAddressCache.GetAddressesForRangeIdAsync(DocumentServiceRequest request, String collectionRid, String partitionKeyRangeId, Boolean forceRefresh)    
 at Microsoft.Azure.Cosmos.Common.AsyncCache`2.GetAsync(TKey key, TValue obsoleteValue, Func`1 singleValueInitFunc, CancellationToken cancellationToken, Boolean forceRefresh)      

@ealsur
Member

ealsur commented Jun 8, 2020

All these exceptions point to connectivity issues; looking at the stacks, they all come from System.Net.Http.HttpClient. HttpClient throws a TaskCanceledException when its timeout elapses (which in this case seems to come from CosmosClientOptions.RequestTimeout).

The OperationCanceledException has the same source (a timeout).

Do you have a singleton CosmosClient in your application? Or multiple instances? Are you by any chance running into some of the scenarios described in: https://docs.microsoft.com/en-us/azure/cosmos-db/troubleshoot-dot-net-sdk#request-timeouts

@paulhickman-a365
Author

The timeouts were being caused by a hot partition.

However, the bug I was reporting wasn't that these errors occurred, but that when they did occur, they weren't trapped in the SDK code and transformed into a response with the error details on it.

@ealsur
Member

ealsur commented Jun 9, 2020

I think that for these types (OperationCanceledException and TaskCanceledException) we are not wrapping them in a CosmosException on purpose, right @j82w? There was even an ask in a previous issue about it.
HttpClient also uses these exceptions to indicate a timeout because it internally converts the timeout value into a CancellationToken, but that is an implementation detail of HttpClient. TaskCanceledException/OperationCanceledException are naturally thrown by the SDK if you pass in a CancellationToken and it gets cancelled.
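
To illustrate the distinction drawn above, here is a minimal sketch (not SDK code) of how a timeout that is internally turned into a CancellationToken can be told apart from a cancellation requested by the caller. The class name `CancellationSource`, the `Outcome` enum and `RunAsync` are hypothetical names introduced only for this example; the linked-token approach is assumed to be similar in spirit to what HttpClient does, not a copy of it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class CancellationSource
{
    public enum Outcome { Completed, CallerCanceled, TimedOut }

    // Runs the operation under an internal timeout that is linked to the
    // caller's token, then reports which side triggered the cancellation.
    public static async Task<Outcome> RunAsync(
        Func<CancellationToken, Task> operation,
        TimeSpan timeout,
        CancellationToken callerToken = default)
    {
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(callerToken);
        linked.CancelAfter(timeout);
        try
        {
            await operation(linked.Token);
            return Outcome.Completed;
        }
        // TaskCanceledException derives from OperationCanceledException,
        // so this filter covers both exception types seen in this issue.
        catch (OperationCanceledException) when (callerToken.IsCancellationRequested)
        {
            return Outcome.CallerCanceled;
        }
        catch (OperationCanceledException)
        {
            // The caller never cancelled, so the internal timeout fired.
            return Outcome.TimedOut;
        }
    }
}
```

With no caller token supplied, as in the original report, any cancellation observed this way can only have come from the internal timeout.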

@j82w
Contributor

j82w commented Jun 9, 2020

This seems like a bug and we should be converting this to a 408 request timeout. The gateway store already does this. I think it provides a more consistent and better user experience if it gets converted.
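
A rough sketch of the conversion suggested here, under stated assumptions: it uses the BCL `HttpResponseMessage` as a stand-in for the SDK's stream response type, and `TimeoutToResponse` / `SendOrTimeoutAsync` are hypothetical names. It is not the SDK's actual implementation; it only shows the shape of "timeout becomes a 408 response, caller cancellation still throws".

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper, for illustration only.
public static class TimeoutToResponse
{
    public static async Task<HttpResponseMessage> SendOrTimeoutAsync(
        Func<CancellationToken, Task<HttpResponseMessage>> send,
        CancellationToken callerToken = default)
    {
        try
        {
            return await send(callerToken);
        }
        // Only convert when the caller did not ask for cancellation;
        // a caller-requested cancellation keeps propagating as an exception.
        catch (OperationCanceledException ex) when (!callerToken.IsCancellationRequested)
        {
            return new HttpResponseMessage(HttpStatusCode.RequestTimeout)
            {
                ReasonPhrase = "Request timed out",
                // Keep the original message so the cause can be inspected.
                Content = new StringContent(ex.Message)
            };
        }
    }
}
```

A caller can then inspect `StatusCode` (408) on the returned response instead of catching a TaskCanceledException it did not expect.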

@ealsur
Member

ealsur commented Nov 11, 2020

Closing, as the SDK now wraps TaskCanceledExceptions in a request timeout:

catch (OperationCanceledException ex)

@ealsur ealsur closed this as completed Nov 11, 2020
@ealsur ealsur reopened this May 1, 2023
@ealsur ealsur closed this as completed May 1, 2023