-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GraphQL] A concurrent update was performed on this collection and corrupted its state. #15731
Comments
Following. |
Probably not the same fix but same root cause. If there is a query that returns multiple objects (in our case content items), or on object has multiple children to load (content items again) then if GraphQL.NET uses async operations that are not awaited, or if it uses threads (Parallel.ForEach for instance) at that point it's possible that two concurrent accesses to the same session (the one from the scoped DI) will conflict. The solution is to find from the query how GraphQL.NET is handling these. With the previous version we had changed the default behavior which was to do thinks asynchronously to synchronously, I even filed a bug in their repos to tell them it's shouldn't be a default, too risky. Now it could just be our code that is not awaiting a task (like the fix you linked). I will try to follow the breadcrumbs from the trace you shared: |
@lampersky Is your query using content items containing or contained in a |
@sebastienros yes |
I wasn't able to repro on sqlite. But will try tomorrow. The reported problem is on SQL Server. Query looks sth like this, it is not the original one.
|
@sebastienros this is the smallest possible query, all unnecessary things were removed (ContentPickers, BagParts etc.), so we only have nested list parts
|
I've simplified resolve part to the minimum, to exclude loader problems etc, and I'm still facing the issue. It looks like ResolveLockedAsync is not enough to lock querying YesSql. when I changed to ResolveAsync and added semaphore, I can't repro the issue anymore. Other than that, using loader for list part is questionable and could be improved. It looks like all items are fetched and skip/first is applied on materialized collection. Anyway, it would be good to know your thoughts @sebastienros. |
I understand how it's fixing all the issues, but that's also wrong, because it's a static semaphore, so this is actually blocking every single call, even from different db sessions. The lock should be per session, or request. Looking at the current ResolveLockedAsync it seems better, as it's on the context, and the code is identical I believe. Maybe the graphQlContext is actually cloned, that could be checked with some logging. |
@lampersky ,I seem to be running into this problem, but I can't be sure because when I run into concurrency issues, I do this by adding MultipleActiveResultSets=True; To my connection string, and it was fixed I was surprised to see your description, I was beginning to think it was just on Sqlite |
I noticed that the ResolveLockedAsync method is not effective here. If you set a breakpoint in the LoadPublishedContentItemsForListAsync method, you can see that they are executed in parallel. I solved this problem by setting a synchronization lock in this method, but I am not sure if this is the best solution. YesSQL (5.0.0) has added an EnableThreadSafetyCheck setting. After setting it to true, it is clear that the LoadPublishedContentitemsForListAsync method is not thread safe. |
@hyzx86 sorry for late reply, I've tried to reproduce this issue on sqlite using a smaller sample of data, and that is why I wrote that I'm not able to reproduce on sqlite. I thought I will be able prepare some smaller recipe, but unfortunately it is not as easy, my recipe file has almost 20MB. So, it looks like the problem can be only observed when you have lots of items and you are querying all of them. |
I probably understand your situation, since your paging size is 300+, it's possible that the next query was requested while the current request was still pending, hence the concurrency issue. |
I think it's finally time to introduce SerialDocumentExecuter. As was discussed few years ago, new versions of graphql introduced serial execution for dataloader meaning we can discard LockedAsyncFieldResolver and solve all problems by executing in serial, as it was in beginning |
@tropcicstefan right, there is this: https://github.com/graphql-dotnet/graphql-dotnet/blob/master/src/GraphQL/Execution/DefaultExecutionStrategySelector.cs#L43 That the DocumentExecuter uses, and the default is Parallel for queries. So there must be a way top configure it to use Serial for queries, or create a custom ExecutionStrategySelector ans use it. |
(and there is always scoped scope as an option in resolver) |
I have the feeling that I already did this some years ago, changing the default execution policy to use serial for queries... So maybe this is not the problem, the Data loader may have a different one. |
maybe won't solve the issue but code would be cleaner, without locked customizations |
When using GraphiQL editor, sometimes I'm getting this exception, when querying content items with nested lists.
It's not consistently repeatable for me (approx once every five queries).
Stacktrace:
@sebastienros I've just found that you fixed similar issue 3 years ago.
The text was updated successfully, but these errors were encountered: