Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Queries with sync-over-async hang #1043

Closed
lukasz-pyrzyk opened this issue Nov 22, 2019 · 16 comments · Fixed by #1078
Closed

Queries with sync-over-async hang #1043

lukasz-pyrzyk opened this issue Nov 22, 2019 · 16 comments · Fixed by #1078
Assignees
Labels
bug Something isn't working needs-investigation

Comments

@lukasz-pyrzyk
Copy link

lukasz-pyrzyk commented Nov 22, 2019

Describe the bug
I work on the migration from SDK v2 to SDK v3 in the ASP.NET web project which runs on the .NET Framework 4.6.1. Because of the contract of the 3rd party library, I'm forced to do sync-over-async calls.

After migrating to SDK v3, I notice that my app hangs on communication to the database. First query to the database completes, but any other hangs.

I was thinking that some `ConfigureAwait(false) is missing any my thread cannot continue the asynchronous operation, but it's added to all async calls.

Tested code on gateway and direct mode. Screenshots and call stack from direct mode below. Is it normal that is uses GatewayStoreClient under the hood? The name suggest that Direct mode uses gateway client.

To Reproduce
I cannot give you repro right now, I will try to do it this week.
Parts of the code:

class CosmosDbStatelessProvider
{
        public override bool AuthorizeServerRequest(Guid clientId, Guid domainKey, out Guid? sessionId, out Bindings? bindings, out Subscriptions? subscriptions)
        {
            var client = _repository.GetClient(clientId, domainKey).ConfigureAwait(false).GetAwaiter().GetResult();
            if (client is null)
            {
                [...]
                return false;
            }

            [..]
            return true;
        }
}

class CosmosDbRepository
{
        private readonly AsyncLazy<Container> _containerFactory;
        private readonly CosmosDBSettings _settings;
        private readonly StoredProcedures _sp;
        private readonly StoredProcedureExecutor _spExecutor;
        private readonly CosmosClient _databaseClient;

            _settings = settings;
            _sp = new StoredProcedures();
            var options = new CosmosClientOptions
            {
                ConnectionMode = _settings.UseDirectConnectionMode ? ConnectionMode.Direct : ConnectionMode.Gateway,
                ApplicationRegion = _settings.Region
            };

            _databaseClient = new CosmosClient(_settings.Endpoint, _settings.Key, options);
            _containerFactory = new AsyncLazy<Container>(async () =>
            {
                Logger.Debug("Preparing database, container and stored procedures");
                await CheckIfDatabaseExists(_databaseClient).ConfigureAwait(false);
                var container = await CheckIfContainerExists(_databaseClient).ConfigureAwait(false);
                if (_settings.PrepareStoredProcedures)
                {
                    await PrepareStoredProcedures(container).ConfigureAwait(false);
                }

                Logger.Debug("Database is ready");
                return container;
            });

            _spExecutor = new StoredProcedureExecutor(_containerFactory);

        internal async Task<ClientEntity?> GetClient(Guid clientId, Guid domainKey)
        {
            try
            {
                var dbClient = await _containerFactory;
                var partitionKey = new PartitionKey(clientId.ToString());
                var query = dbClient.GetItemLinqQueryable<ClientEntity>(requestOptions: new QueryRequestOptions { PartitionKey = partitionKey })
                        .Where(x => x.EntityType == EntityType.Client && x.ClientId == clientId && x.DomainKey == domainKey)
                        .ToFeedIterator();

                var client = await query.ExecuteFirstOrDefault(nameof(GetClient), Logger).ConfigureAwait(false);
                return client;
            }
            catch (CosmosException ex)
            {
                return null;
            }
        }
}

static class DocumentQueryExtensions 
{
        internal static async Task<TQueryItem> ExecuteFirstOrDefault<TQueryItem>(this FeedIterator<TQueryItem> iterator, string operationName, ILogger logger)
        {
            logger.Verbose("Executing {OperationName}", operationName);
            var feedResponse = await iterator.ReadNextAsync().ConfigureAwait(false);
            logger.Debug("{OperationName} finished with status code {StatusCode}. Request charge is {RequestCharge}.", operationName, feedResponse.StatusCode, feedResponse.RequestCharge);

            return feedResponse.FirstOrDefault();
        }

}

Expected behavior
All DB call, even with sync-over-async should be succeeded. SDK v2 worked correctly.

Actual behavior
Only the first query to the database completes, any other is hanging for infinite amount of time.

Environment summary
SDK Version: 3.4.1
OS Version: Windows 10

Additional context
parallel stacks

List of all tasks:

Not Flagged		803	Awaiting	Microsoft.Azure.Documents.Rntbd.Connection.ReadResponseMetadataAsync(args)	Microsoft.Azure.Documents.Rntbd.Connection.ReadResponseMetadataAsync(args)
Not Flagged		800	Awaiting	Microsoft.Azure.Documents.Rntbd.Connection.ReadPayloadAsync(length, type, args)	Microsoft.Azure.Documents.Rntbd.Connection.ReadPayloadAsync(length, type, args)
Not Flagged		797	Awaiting	Microsoft.Azure.Documents.Rntbd.Dispatcher.OpenAsync.AnonymousMethod__23_0()	Microsoft.Azure.Documents.Rntbd.Dispatcher.OpenAsync.AnonymousMethod__23_0()
Not Flagged		796	Awaiting	Microsoft.Azure.Documents.Rntbd.Dispatcher.ReceiveLoopAsync()	Microsoft.Azure.Documents.Rntbd.Dispatcher.ReceiveLoopAsync()
Not Flagged		788	Awaiting	System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()	System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()
Not Flagged		787	Awaiting	System.Threading.Tasks.UnwrapPromise<TResult>.Invoke(completingTask)	System.Threading.Tasks.UnwrapPromise<TResult>.Invoke(completingTask)
Not Flagged		774	Awaiting	Microsoft.Azure.Documents.Rntbd.Connection.ReadResponseMetadataAsync(args)	Microsoft.Azure.Documents.Rntbd.Connection.ReadResponseMetadataAsync(args)
Not Flagged		772	Awaiting	Microsoft.Azure.Documents.Rntbd.Connection.ReadPayloadAsync(length, type, args)	Microsoft.Azure.Documents.Rntbd.Connection.ReadPayloadAsync(length, type, args)
Not Flagged		711	Awaiting	Microsoft.Azure.Cosmos.ComparableTaskScheduler.ScheduleAsync()	Microsoft.Azure.Cosmos.ComparableTaskScheduler.ScheduleAsync()
Not Flagged		710	Awaiting	Microsoft.Azure.Cosmos.Collections.Generic.AsyncCollection<T>.TakeAsync(token)	Microsoft.Azure.Cosmos.Collections.Generic.AsyncCollection<T>.TakeAsync(token)
Not Flagged		709	Awaiting	System.Threading.SemaphoreSlim.WaitUntilCountOrTimeoutAsync(asyncWaiter, millisecondsTimeout, cancellationToken)	System.Threading.SemaphoreSlim.WaitUntilCountOrTimeoutAsync(asyncWaiter, millisecondsTimeout, cancellationToken)
Not Flagged		708	Awaiting	System.Threading.Tasks.TaskFactory.CompleteOnInvokePromise.Invoke(completingTask)	System.Threading.Tasks.TaskFactory.CompleteOnInvokePromise.Invoke(completingTask)
Not Flagged		648	Awaiting	Microsoft.Azure.Documents.Rntbd.Connection.ReadResponseMetadataAsync(args)	Microsoft.Azure.Documents.Rntbd.Connection.ReadResponseMetadataAsync(args)
Not Flagged		646	Awaiting	Microsoft.Azure.Documents.Rntbd.Connection.ReadPayloadAsync(length, type, args)	Microsoft.Azure.Documents.Rntbd.Connection.ReadPayloadAsync(length, type, args)
Not Flagged		636	Awaiting	Microsoft.Azure.Documents.Rntbd.Dispatcher.OpenAsync.AnonymousMethod__23_0()	Microsoft.Azure.Documents.Rntbd.Dispatcher.OpenAsync.AnonymousMethod__23_0()
Not Flagged		632	Awaiting	Microsoft.Azure.Documents.Rntbd.Dispatcher.ReceiveLoopAsync()	Microsoft.Azure.Documents.Rntbd.Dispatcher.ReceiveLoopAsync()
Not Flagged		627	Awaiting	System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()	System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()
Not Flagged		626	Awaiting	System.Threading.Tasks.UnwrapPromise<TResult>.Invoke(completingTask)	System.Threading.Tasks.UnwrapPromise<TResult>.Invoke(completingTask)
Not Flagged		624	Awaiting	Microsoft.Azure.Documents.Rntbd.Dispatcher.OpenAsync.AnonymousMethod__23_0()	Microsoft.Azure.Documents.Rntbd.Dispatcher.OpenAsync.AnonymousMethod__23_0()
Not Flagged		623	Awaiting	Microsoft.Azure.Documents.Rntbd.Dispatcher.ReceiveLoopAsync()	Microsoft.Azure.Documents.Rntbd.Dispatcher.ReceiveLoopAsync()
Not Flagged		611	Awaiting	System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()	System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()
Not Flagged		610	Awaiting	System.Threading.Tasks.UnwrapPromise<TResult>.Invoke(completingTask)	System.Threading.Tasks.UnwrapPromise<TResult>.Invoke(completingTask)
Not Flagged		527	Awaiting	Microsoft.Azure.Cosmos.ComparableTaskScheduler.ScheduleAsync()	Microsoft.Azure.Cosmos.ComparableTaskScheduler.ScheduleAsync()
Not Flagged		526	Awaiting	Microsoft.Azure.Cosmos.Collections.Generic.AsyncCollection<T>.TakeAsync(token)	Microsoft.Azure.Cosmos.Collections.Generic.AsyncCollection<T>.TakeAsync(token)
Not Flagged		525	Awaiting	System.Threading.SemaphoreSlim.WaitUntilCountOrTimeoutAsync(asyncWaiter, millisecondsTimeout, cancellationToken)	System.Threading.SemaphoreSlim.WaitUntilCountOrTimeoutAsync(asyncWaiter, millisecondsTimeout, cancellationToken)
Not Flagged		524	Awaiting	System.Threading.Tasks.TaskFactory.CompleteOnInvokePromise.Invoke(completingTask)	System.Threading.Tasks.TaskFactory.CompleteOnInvokePromise.Invoke(completingTask)
Not Flagged		469	Awaiting	Contoso.CacheProvider.Repository.CosmosDbRepository.GetClient(clientId, domainKey)	Contoso.CacheProvider.Repository.CosmosDbRepository.GetClient(clientId, domainKey)
Not Flagged		468	Awaiting	Contoso.CacheProvider.Repository.DocumentQueryExtensions.ExecuteFirstOrDefault<TQueryItem>(iterator, operationName, logger)	Contoso.CacheProvider.Repository.DocumentQueryExtensions.ExecuteFirstOrDefault<TQueryItem>(iterator, operationName, logger)
Not Flagged		467	Awaiting	Microsoft.Azure.Cosmos.FeedIteratorCore<T>.ReadNextAsync(cancellationToken)	Microsoft.Azure.Cosmos.FeedIteratorCore<T>.ReadNextAsync(cancellationToken)
Not Flagged		466	Awaiting	Microsoft.Azure.Cosmos.Query.QueryIterator.ReadNextAsync(cancellationToken)	Microsoft.Azure.Cosmos.Query.QueryIterator.ReadNextAsync(cancellationToken)
Not Flagged		465	Awaiting	Microsoft.Azure.Cosmos.Query.CosmosQueryExecutionContextFactory.ExecuteNextAsync(cancellationToken)	Microsoft.Azure.Cosmos.Query.CosmosQueryExecutionContextFactory.ExecuteNextAsync(cancellationToken)
Not Flagged		464	Awaiting	Microsoft.Azure.Cosmos.Query.CosmosQueryExecutionContextFactory.CreateItemQueryExecutionContextAsync(cancellationToken)	Microsoft.Azure.Cosmos.Query.CosmosQueryExecutionContextFactory.CreateItemQueryExecutionContextAsync(cancellationToken)
Not Flagged		463	Awaiting	Microsoft.Azure.Cosmos.CosmosQueryClientCore.ExecuteQueryPlanRequestAsync(resourceUri, resourceType, operationType, sqlQuerySpec, supportedQueryFeatures, cancellationToken)	Microsoft.Azure.Cosmos.CosmosQueryClientCore.ExecuteQueryPlanRequestAsync(resourceUri, resourceType, operationType, sqlQuerySpec, supportedQueryFeatures, cancellationToken)
Not Flagged		462	Awaiting	Microsoft.Azure.Cosmos.Handlers.RequestInvokerHandler.SendAsync(resourceUri, resourceType, operationType, requestOptions, cosmosContainerCore, partitionKey, streamPayload, requestEnricher, cancellationToken)	Microsoft.Azure.Cosmos.Handlers.RequestInvokerHandler.SendAsync(resourceUri, resourceType, operationType, requestOptions, cosmosContainerCore, partitionKey, streamPayload, requestEnricher, cancellationToken)
Not Flagged		461	Awaiting	Microsoft.Azure.Cosmos.Handlers.RequestInvokerHandler.SendAsync(request, cancellationToken)	Microsoft.Azure.Cosmos.Handlers.RequestInvokerHandler.SendAsync(request, cancellationToken)
Not Flagged		460	Awaiting	Microsoft.Azure.Cosmos.Handlers.AbstractRetryHandler.SendAsync(request, cancellationToken)	Microsoft.Azure.Cosmos.Handlers.AbstractRetryHandler.SendAsync(request, cancellationToken)
Not Flagged		459	Awaiting	Microsoft.Azure.Cosmos.Handlers.AbstractRetryHandler.ExecuteHttpRequestAsync(callbackMethod, callShouldRetry, callShouldRetryException, cancellationToken)	Microsoft.Azure.Cosmos.Handlers.AbstractRetryHandler.ExecuteHttpRequestAsync(callbackMethod, callShouldRetry, callShouldRetryException, cancellationToken)
Not Flagged		458	Awaiting	Microsoft.Azure.Cosmos.Handlers.TransportHandler.SendAsync(request, cancellationToken)	Microsoft.Azure.Cosmos.Handlers.TransportHandler.SendAsync(request, cancellationToken)
Not Flagged		457	Awaiting	Microsoft.Azure.Cosmos.GatewayStoreModel.ProcessMessageAsync(request, cancellationToken)	Microsoft.Azure.Cosmos.GatewayStoreModel.ProcessMessageAsync(request, cancellationToken)
Not Flagged		456	Awaiting	Microsoft.Azure.Cosmos.GatewayStoreClient.InvokeAsync(request, resourceType, physicalAddress, cancellationToken)	Microsoft.Azure.Cosmos.GatewayStoreClient.InvokeAsync(request, resourceType, physicalAddress, cancellationToken)
Not Flagged		455	Awaiting	Microsoft.Azure.Cosmos.GatewayStoreClient.InvokeClientAsync(request, resourceType, physicalAddress, cancellationToken)	Microsoft.Azure.Cosmos.GatewayStoreClient.InvokeClientAsync(request, resourceType, physicalAddress, cancellationToken)
Not Flagged		454	Awaiting	Microsoft.Azure.Documents.BackoffRetryUtility<T>.ExecuteRetryAsync(callbackMethod, callShouldRetry, inBackoffAlternateCallbackMethod, minBackoffForInBackoffCallback, cancellationToken, preRetryCallback)	Microsoft.Azure.Documents.BackoffRetryUtility<T>.ExecuteRetryAsync(callbackMethod, callShouldRetry, inBackoffAlternateCallbackMethod, minBackoffForInBackoffCallback, cancellationToken, preRetryCallback)
Not Flagged		452	Awaiting	Microsoft.Azure.Cosmos.GatewayStoreClient.InvokeClientAsync.AnonymousMethod__0()	Microsoft.Azure.Cosmos.GatewayStoreClient.InvokeClientAsync.AnonymousMethod__0()
Not Flagged		451	Awaiting	System.Threading.Tasks.ContinuationTaskFromResultTask<TAntecedentResult>.InnerInvoke()	System.Threading.Tasks.ContinuationTaskFromResultTask<TAntecedentResult>.InnerInvoke()
Not Flagged		435	Awaiting	Microsoft.Azure.Documents.Rntbd.Connection.ReadResponseMetadataAsync(args)	Microsoft.Azure.Documents.Rntbd.Connection.ReadResponseMetadataAsync(args)
Not Flagged		433	Awaiting	Microsoft.Azure.Documents.Rntbd.Connection.ReadPayloadAsync(length, type, args)	Microsoft.Azure.Documents.Rntbd.Connection.ReadPayloadAsync(length, type, args)
Not Flagged		430	Awaiting	Microsoft.Azure.Documents.Rntbd.Dispatcher.OpenAsync.AnonymousMethod__23_0()	Microsoft.Azure.Documents.Rntbd.Dispatcher.OpenAsync.AnonymousMethod__23_0()
Not Flagged		429	Awaiting	Microsoft.Azure.Documents.Rntbd.Dispatcher.ReceiveLoopAsync()	Microsoft.Azure.Documents.Rntbd.Dispatcher.ReceiveLoopAsync()
Not Flagged		421	Awaiting	System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()	System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()
Not Flagged		420	Awaiting	System.Threading.Tasks.UnwrapPromise<TResult>.Invoke(completingTask)	System.Threading.Tasks.UnwrapPromise<TResult>.Invoke(completingTask)
Not Flagged		198	Awaiting	Microsoft.Azure.Documents.Rntbd.Connection.ReadResponseMetadataAsync(args)	Microsoft.Azure.Documents.Rntbd.Connection.ReadResponseMetadataAsync(args)
Not Flagged		196	Awaiting	Microsoft.Azure.Documents.Rntbd.Connection.ReadPayloadAsync(length, type, args)	Microsoft.Azure.Documents.Rntbd.Connection.ReadPayloadAsync(length, type, args)
Not Flagged		193	Awaiting	Microsoft.Azure.Documents.Rntbd.Connection.ReadResponseMetadataAsync(args)	Microsoft.Azure.Documents.Rntbd.Connection.ReadResponseMetadataAsync(args)
Not Flagged		191	Awaiting	Microsoft.Azure.Documents.Rntbd.Connection.ReadPayloadAsync(length, type, args)	Microsoft.Azure.Documents.Rntbd.Connection.ReadPayloadAsync(length, type, args)
Not Flagged		177	Awaiting	Microsoft.Azure.Documents.Rntbd.Dispatcher.OpenAsync.AnonymousMethod__23_0()	Microsoft.Azure.Documents.Rntbd.Dispatcher.OpenAsync.AnonymousMethod__23_0()
Not Flagged		175	Awaiting	Microsoft.Azure.Documents.Rntbd.Dispatcher.ReceiveLoopAsync()	Microsoft.Azure.Documents.Rntbd.Dispatcher.ReceiveLoopAsync()
Not Flagged		166	Awaiting	System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()	System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()
Not Flagged		165	Awaiting	System.Threading.Tasks.UnwrapPromise<TResult>.Invoke(completingTask)	System.Threading.Tasks.UnwrapPromise<TResult>.Invoke(completingTask)
Not Flagged		164	Awaiting	System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()	System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()
Not Flagged		163	Awaiting	System.Threading.Tasks.UnwrapPromise<TResult>.Invoke(completingTask)	System.Threading.Tasks.UnwrapPromise<TResult>.Invoke(completingTask)
Not Flagged		802	Scheduled	[Scheduled and waiting to run]	
Not Flagged		773	Scheduled	[Scheduled and waiting to run]	
Not Flagged		707	Scheduled	[Scheduled and waiting to run]	Task.Delay
Not Flagged		647	Scheduled	[Scheduled and waiting to run]	
Not Flagged		523	Scheduled	[Scheduled and waiting to run]	Task.Delay
Not Flagged		480	Scheduled	[Scheduled and waiting to run]	Task.ContinueWith: <QueueAsynchronous>b__0
Not Flagged		453	Scheduled	[Scheduled and waiting to run]	
Not Flagged		449	Scheduled	[Scheduled and waiting to run]	Async: <SendAsync>d__3
Not Flagged		434	Scheduled	[Scheduled and waiting to run]	
Not Flagged		197	Scheduled	[Scheduled and waiting to run]	
Not Flagged		192	Scheduled	[Scheduled and waiting to run]	
Not Flagged		178	Scheduled	[Scheduled and waiting to run]	Async: <<OpenAsync>b__23_0>d
Not Flagged		43	Scheduled	[Scheduled and waiting to run]	Task.ContinueWith: <GetNormalApplicationInstance>b__34_0

threads

Stack trace of the hanging thread:

Contoso.CacheProvider.dll!Contoso.CacheProvider.CosmosDbStatelessProvider.AuthorizeServerRequest
mscorlib.dll!System.Threading.Monitor.Wait(object obj, int millisecondsTimeout, bool exitContext)
mscorlib.dll!System.Threading.Monitor.Wait(object obj, int millisecondsTimeout)
mscorlib.dll!System.Threading.ManualResetEventSlim.Wait(int millisecondsTimeout, System.Threading.CancellationToken cancellationToken)
mscorlib.dll!System.Threading.Tasks.Task.SpinThenBlockingWait(int millisecondsTimeout, System.Threading.CancellationToken cancellationToken)
mscorlib.dll!System.Threading.Tasks.Task.InternalWait(int millisecondsTimeout, System.Threading.CancellationToken cancellationToken)
mscorlib.dll!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task task)
mscorlib.dll!System.Runtime.CompilerServices.ConfiguredTaskAwaitable<System.__Canon>.ConfiguredTaskAwaiter.GetResult()
Contoso.CacheProvider.dll!Contoso.CacheProvider.CosmosDbStatelessProvider.AuthorizeServerRequest(System.Guid clientId, System.Guid domainKey, out System.Guid? sessionId, out FM.WebSync.Server.Bindings bindings, out FM.WebSync.Server.Subscriptions subscriptions) Line 285
FM.WebSync.Server.dll!FM.WebSync.Server.StatelessProvider._AuthorizeServerRequest(System.Guid clientId, System.Guid domainKey, out System.Guid? sessionId, out FM.WebSync.Server.Bindings bindings, out FM.WebSync.Server.Subscriptions subscriptions)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestManager.Authorize(FM.WebSync.Message request, System.Guid domainKey, FM.WebSync.Server.EventSource eventSource, out FM.WebSync.Server.ClientReference clientReference)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestManager.Authorize(FM.WebSync.Message[] requests, System.Guid domainKey, FM.WebSync.Server.EventSource eventSource, out FM.WebSync.Server.ClientReference[] clientReferences)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestManager.HandleMultipleRequests(FM.WebSync.Server.EventSource eventSource, FM.WebSync.Server.EventType eventTypeBefore, FM.WebSync.Server.EventType eventTypeAfter, FM.WebSync.Server.State state, bool verifyExists, FM.WebSync.Server.Bindings bindings, ref FM.WebSync.Message[] requests, ref System.Guid domainKey, System.Func<FM.WebSync.Server.WebSyncEventArgs, FM.WebSync.Message> process, out FM.WebSync.Server.WebSyncEventArgs args)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestManager.Bind(FM.WebSync.Server.EventSource eventSource, FM.WebSync.Server.State state, ref FM.WebSync.Message[] requests, ref System.Guid domainKey)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestManager.ServerBind(System.Guid clientId, FM.WebSync.Record[] records, out FM.WebSync.Record[] boundRecords, System.Guid domainKey)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestHandler.BindMany(System.Guid clientId, FM.WebSync.Record[] records, out FM.WebSync.Record[] boundRecords, bool notifyClient, System.Guid? domainKey)
FM.WebSync.Server.dll!FM.WebSync.Server.WebSyncClientReference.Bind(string key, string valueJson, out FM.WebSync.Record[] boundRecords)
FM.WebSync.Server.dll!FM.WebSync.Server.WebSyncClientReference.Bind(string key, string valueJson)
Contoso.Service.dll!Contoso.Service.Handlers.CommonEventHandler.AfterConnect(object sender, FM.WebSync.Server.WebSyncEventArgs e) Line 20
[Lightweight Function]
FM.WebSync.Server.dll!FM.WebSync.Server.RequestManager.Fire(FM.WebSync.Server.WebSyncEventMethodInfo info, FM.WebSync.Message[] messages, FM.WebSync.Server.WebSyncEventArgs args, bool isRequest, bool fireAndForget)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestManager.FireEvent(FM.WebSync.Server.EventType eventType, FM.WebSync.Server.WebSyncEventArgs args, FM.WebSync.Message[] messages)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestManager.RunAfterEvent(FM.WebSync.Server.EventType eventType, FM.WebSync.Server.WebSyncEventArgs args, FM.WebSync.Message[] responses)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestManager.HandleSingleRequest(FM.WebSync.Server.EventSource eventSource, FM.WebSync.Server.EventType eventTypeBefore, FM.WebSync.Server.EventType eventTypeAfter, FM.WebSync.Server.State state, bool verifyExists, FM.WebSync.Server.Bindings bindings, ref FM.WebSync.Message request, ref System.Guid domainKey, System.Func<FM.WebSync.Server.WebSyncEventArgs, FM.WebSync.Message> process, out FM.WebSync.Server.WebSyncEventArgs args)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestManager.Handshake(FM.WebSync.Server.EventSource eventSource, FM.WebSync.Server.State state, ref FM.WebSync.Message request, ref System.Guid domainKey)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestManager.ClientHandshake(FM.WebSync.Message request, FM.WebSync.Server.State state)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestManager.ProcessState(FM.WebSync.Server.State state, System.Action<FM.WebSync.Message[], FM.WebSync.Message[]> callback)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestHandler.ProcessState(FM.WebSync.Server.State state, System.Action<FM.WebSync.Message[], FM.WebSync.Message[]> callback)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestHandler.BeginProcessHttpRequest(System.Web.HttpContextBase context, System.AsyncCallback cb, object extraData, FM.Server.SessionAccessLevel sessionAccessLevel)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestHandler.BeginProcessRequestBase(System.Web.HttpContextBase context, System.AsyncCallback cb, object extraData, FM.Server.SessionAccessLevel sessionAccessLevel)
FM.WebSync.Server.dll!FM.WebSync.Server.RequestHandler.BeginProcessRequest(System.Web.HttpContext context, System.AsyncCallback cb, object extraData)
System.Web.dll!System.Web.HttpApplication.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
System.Web.dll!System.Web.HttpApplication.ExecuteStepImpl(System.Web.HttpApplication.IExecutionStep step)
System.Web.dll!System.Web.HttpApplication.ExecuteStep(System.Web.HttpApplication.IExecutionStep step, ref bool completedSynchronously)
System.Web.dll!System.Web.HttpApplication.PipelineStepManager.ResumeSteps(System.Exception error)
System.Web.dll!System.Web.HttpApplication.BeginProcessRequestNotification(System.Web.HttpContext context, System.AsyncCallback cb)
System.Web.dll!System.Web.HttpRuntime.ProcessRequestNotificationPrivate(System.Web.Hosting.IIS7WorkerRequest wr, System.Web.HttpContext context)
System.Web.dll!System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(System.IntPtr rootedObjectsPointer, System.IntPtr nativeRequestContext, System.IntPtr moduleData, int flags)
System.Web.dll!System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(System.IntPtr rootedObjectsPointer, System.IntPtr nativeRequestContext, System.IntPtr moduleData, int flags)
[Native to Managed Transition]
[Managed to Native Transition]
System.Web.dll!System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(System.IntPtr rootedObjectsPointer, System.IntPtr nativeRequestContext, System.IntPtr moduleData, int flags)
System.Web.dll!System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(System.IntPtr rootedObjectsPointer, System.IntPtr nativeRequestContext, System.IntPtr moduleData, int flags)
[AppDomain Transition]

@j82w j82w added bug Something isn't working needs-investigation labels Nov 22, 2019
@j82w
Copy link
Contributor

j82w commented Nov 22, 2019

Is it just query or do point operations also get stuck?

@j82w j82w added the QUERY label Nov 22, 2019
@lukasz-pyrzyk
Copy link
Author

Yes. It gets stuck on point operation with ReadItemAsync and Stored Procedure execution.

Both cases got stuck on the LoadBalancingPartition.RequestAsync. Parallel stacks below

ReadItemAsync:
parallelStack

Stored Procedure:
parallelStackSp

I think the issue might be somehow related to the thread context.

My app has background thread which scans the database for idler users. It uses the same base provider, repository etc. Queries done by this background thread complete successfully, even when other web threads are hang.

When thread from System.Web touches DB for second time, it gots stuck.

@j82w
Copy link
Contributor

j82w commented Nov 22, 2019

Are you using the latest 3.4.1 .NET SDK? There was a bug in version before 3.2.0 that was fixed in PR #761 . Troubleshooting this will be a lot easier once you provide the repo.

@j82w j82w removed the QUERY label Nov 22, 2019
@lukasz-pyrzyk
Copy link
Author

Yes, I use the latest SDK. @j82w, the smallest repro is in https://github.com/lukasz-pyrzyk/CosmosDbSyncOverAsync . You need to setup environment variables to connect to the database, I've added notes to the readme.

@ealsur
Copy link
Member

ealsur commented Nov 25, 2019

Are you using IHttpHandler in the real solution or was that only for the repro?

@lukasz-pyrzyk
Copy link
Author

It's taken from the real app

@ealsur
Copy link
Member

ealsur commented Nov 25, 2019

Have you consider IHttpAsyncHandler? Since this involves an async operation?

You mention that you were using V2 SDK, could you share how were the queries being executed?

For example, What was the V2 version of :

var dbClient =  _containerFactory;
var partitionKey = new PartitionKey(clientId.ToString());
var query = dbClient
    .GetItemLinqQueryable<ClientEntity>(requestOptions: new QueryRequestOptions { PartitionKey = partitionKey })
    .Where(x => x.ClientId == clientId)
    .ToFeedIterator();

var client = await query.ExecuteFirstOrDefault(nameof(GetClient)).ConfigureAwait(false);
return client;

@ealsur
Copy link
Member

ealsur commented Nov 25, 2019

I also managed to get the same deadlock using the V2 SDK client on the Repro project, calling DocumentClient.ReadDatabaseAsync from within the HeartbeatRouteHandler locks in the same way.

@lukasz-pyrzyk
Copy link
Author

Have you consider IHttpAsyncHandler? Since this involves an async operation?

I did, unfortunately I cannot move. I'm reusing 3rd party handlers and injecting my CosmosDBRepository to them with a strictly coupled synchronous interface. That's why I did ugly sync-over-async.

You mention that you were using V2 SDK, could you share how were the queries being executed?

Sure. I've added new branch called v2 to the repository, https://github.com/lukasz-pyrzyk/CosmosDbSyncOverAsync/tree/v2

I also managed to get the same deadlock using the V2 SDK client on the Repro project, calling DocumentClient.ReadDatabaseAsync from within the HeartbeatRouteHandler locks in the same way.

Can you show me the code? In the branch, I've committed a code which works correctly for multiple requests, at least on my machine :)

@ealsur
Copy link
Member

ealsur commented Nov 25, 2019

To rule out anything about Queries, I added a method on the Repository:

public async Task<bool> CheckIfDatabaseExistsV2()
{
    try
    {
        await _oldClient.ReadDatabaseAsync(UriFactory.CreateDatabaseUri("Test"));
        return true;
    }
    catch (DocumentClientException ex) when (ex.StatusCode == HttpStatusCode.NotFound)
    {
        return false;
    }
}

The _oldClient is created on the constructor:

_oldClient = new DocumentClient(new Uri(_settings.Endpoint), _settings.Key, new ConnectionPolicy()
{
    ConnectionMode = _settings.UseDirectConnectionMode ? Microsoft.Azure.Documents.Client.ConnectionMode.Direct : Microsoft.Azure.Documents.Client.ConnectionMode.Gateway
});

Then from inside the RouteHandler, I call:

repository.CheckIfDatabaseExistsV2().ConfigureAwait(false).GetAwaiter().GetResult();

And it locks.

I switched to Gateway mode to rule out any Direct mode issues.

@ealsur
Copy link
Member

ealsur commented Nov 25, 2019

It looks like the issue is, when running async over sync, the Synchronization Context.

If I put the V2 ReadDatabaseAsync call inside the Repository class and call it with GetAwaiter from the Handler, it locks (the Repository class has an await).
But, if I call the V2 ReadDatabaseAsync call from the Handler, with GetAwaiter, it works. (leaving the Repository class out of the picture).

One easy workaround for this is to wrap any call you make around Task.Run, for example:

ClientEntity clientEntity;
Task.Run(() =>
    {
        clientEntity = repository.GetClient(clientId).ConfigureAwait(false).GetAwaiter().GetResult();
    }
).Wait();

@lukasz-pyrzyk
Copy link
Author

lukasz-pyrzyk commented Nov 25, 2019

Thanks @ealsur for digging in! The issue is definitely related to the Synchronization Context. I tried to reproduce the issue with xunit tests and it doesn't occur, only with ASP.NET which has a synchronization context.

According to your sample, I would like to point out one thing. You have a hang on V2 SDK because you don't have ConfigureAwait(false) on the async call inside repository. When you add it, the hang does not occur.
 
I've updated my samples to don't have any query-related stuff.

SDK V3, hangs:

        repository.ReadDatabase().ConfigureAwait(false).GetAwaiter().GetResult();
        [...]

        internal async Task ReadDatabase()
        {
            try
            {
                var client = await clientFactory;
                await client.ReadAsync().ConfigureAwait(false);
            }
            catch (Exception e)
            {
                Console.WriteLine(e);
                throw;
            }
        }

SDK V2, doesn't hang

        repository.ReadDatabase().ConfigureAwait(false).GetAwaiter().GetResult();
        [..]

        internal async Task ReadDatabase()
        {
            try
            {
                var client = await _clientFactory;
                await client.ReadDatabaseAsync(UriFactory.CreateDatabaseUri(_settings.DatabaseId)).ConfigureAwait(false);
            }
            catch (Exception e)
            {
                Trace.WriteLine(e);
                throw;
            }
        }```

@ealsur
Copy link
Member

ealsur commented Nov 25, 2019

Thanks for the catch!
If you wrap your call in the RouteHandler in Task.Run, does it work correctly in your real project?

@lukasz-pyrzyk
Copy link
Author

Well, in real app I have some kind of dynamic injecting and abstraction, but I quickly checked - With Task.Run it doesn't hang

@ealsur
Copy link
Member

ealsur commented Dec 5, 2019

The linked PR was tested on the repro Github and it fixes the reported issue.

@lukasz-pyrzyk
Copy link
Author

@ealsur thank you very much! If you have some kind of night builds, I will be happy to test it in the real app.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs-investigation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants