-
Notifications
You must be signed in to change notification settings - Fork 134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Batch updating resources with a resource template may throw a UniqueConstraintViolationException #1690
Comments
Well I'm somewhat at a loss for this one... It obviously seems to be related to the "detach" system we have in place for batch edits to keep them from going wild in terms of memory usage and poor speed, but I don't quite get how it's happening here specifically... In particular: removing a call that detaches "new" entities after the flush that throws this error eliminates the error. But I can't come up with a mechanism for how that later detach is affecting the earlier flush. |
OK so I think I have a better handle on why the error happens here, because the batch edit actually performs several "batch edit" API calls in a row, so it's just that our detach on the first one is affecting the flush of the next one. That makes sense. The problem here seems to be: we're detaching everything "new," but the templates themselves are already loaded before we start the batch edit. When you access the template properties from within the batch edit, they get stored in the template object. When we go to detach the "new" stuff, that means we detach the template properties (which were not previously loaded) but not the templates. The result is that the templates are still in the Doctrine system, and are holding references to all their properties, which have been detached, so Doctrine sees them as "new" properties and tries to insert them, giving us this error. The problem at least makes sense but I don't know that I have a solid answer on how to fix it. The detaching is necessary to have reasonable performance and memory usage for large updates (otherwise every item, value, etc. that is touched during an update accumulates in memory). I don't think we really have a way to "reach into" every template and reset its list of properties, at least not in any reasonable way. Doctrine is moving to deprecate its "detach" method and this kind of mess is essentially the reason why. One option is: just skip detaching resource template properties specifically, to avoid this specific bug only. That will probably work, actually, but it's a little "dirtier" of a fix than I'd really prefer. I may end up having to go with this one anyway, though: it's almost certainly the simplest solution by far. We could also just skip the detaching process altogether in "synchronous" contexts like this (a non-background batch edit): it's really there for long-running background processes. I believe the templates are loaded here due to something else using/loading them earlier during the request, a complication that doesn't occur for a background job. |
Thank you for looking at it. Apparently it is possible to cascade a merge operation (https://www.doctrine-project.org/projects/doctrine-orm/en/2.6/reference/working-with-associations.html#transitive-persistence-cascade-operations). So instead of detaching entities we could clear the entity manager after the flush, then reattach "old" entities using cascading merge, so that all associated entities are re-attached as well. I never used that though, so I may be completely wrong. |
Clear-then-merge probably won't help here: the issue is having the Template in the set of "old" things and not its properties. Regardless of the method used, the same kind of problem is going to occur on subsequent flushes. Cascade merge could account for some of that, but it's likely to bring in too much and blunt the point of the clear/detach in the first place. Maybe it could be done in a targeted enough way that it would work, though. Another issue, I believe I'm remembering correctly that Doctrine is moving away from merge as well. |
The detach process used in batch edit/create is pretty brittle, especially in "synchronous" or in-process contexts where many more entities are loaded before the batch edit begins, so the results are less predictable. Typical "bad" results are Doctrine errors about duplicates or encountering "new" un-persisted entities through a relationship. The detach also serves very little purpose in this context, as it's only there to improve performance _between_ multiple batches, and the in-process edits only do a single batch. This commit adds a new request option for batch edits and creates, "detachEntity", defaulted to true, and has the controllers for the in-process batch updates set it to false. (#1690)
The strategy I'm going to go with for now is just eliminating the detach process from these kinds of batch edits, which should resolve this specific issue. |
The detach process used in batch edit/create is pretty brittle, especially in "synchronous" or in-process contexts where many more entities are loaded before the batch edit begins, so the results are less predictable. Typical "bad" results are Doctrine errors about duplicates or encountering "new" un-persisted entities through a relationship. The detach also serves very little purpose in this context, as it's only there to improve performance _between_ multiple batches, and the in-process edits only do a single batch. This commit adds a new request option for batch edits and creates, "detachEntity", defaulted to true, and has the controllers for the in-process batch updates set it to false. (omeka#1690) (cherry picked from commit 578dd20)
Thanks. It works perfectly, and it applies nicely to 2.1, which is even more perfect :) |
I think we'll call this one fixed, then. |
The detach process used in batch edit/create is pretty brittle, especially in "synchronous" or in-process contexts where many more entities are loaded before the batch edit begins, so the results are less predictable. Typical "bad" results are Doctrine errors about duplicates or encountering "new" un-persisted entities through a relationship. The detach also serves very little purpose in this context, as it's only there to improve performance _between_ multiple batches, and the in-process edits only do a single batch. This commit adds a new request option for batch edits and creates, "detachEntity", defaulted to true, and has the controllers for the in-process batch updates set it to false. (#1690) (cherry picked from commit 578dd20)
Hi,
I found a weird bug in the batch update process. Here are the steps to reproduce:
/admin/item
, select the new item and start the batch update process using the "edit selected" optionYou should see the following error message:
The code:
It seems that using the representation causes problems, while using the entity directly does not (which is the weird part of the bug :))
The bug exists at least on 2.1.2, 3.0.1, and
develop
branchThe text was updated successfully, but these errors were encountered: