-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Doctrine NOT throwing an exception when attempting to re-persist a detached entity; instead creating duplicate record. #8007
Comments
Unlikely adapting ORM to docs here: this behaviour has been expected/stable for a long time, and detached entities are not recognizable anyway. Docs will need adjustments. |
@Ocramius So if you're saying that this behavior (simply creating a duplicate record, despite the existing one already having an ID assigned to it, with no exception or notice or anything else) is expected... ...is there any way to not make it the expected behavior? Honestly, I would greatly prefer the documented behavior (an exception being thrown). If you have a situation where a routine task is re-persisting detached entities, you can quickly go from 2 to 4 to 8 to hundreds of those duplicated entities, resulting in a great deal of damage to have to clean up. It should not be this easy to write something that's as straightforward code-wise as my example loop, and yet have it be subject to an error condition that could cause such huge (and hard to figure out) problems. As I mentioned, because clearing the EM for specific entities is being deprecated in 3.0 in favor of clearing the entire memory only increases the likelihood that this kind of error will be run into. In fact, it's exactly because of this deprecation that I switched to the batch-utils function in the first place, which also resulted in me switching from a much more selective EM clear to clearing all EM-managed records. The combination of the two (forcing the EM |
Nope. Even if there was, docs follow behaviour, not the way around: changing observed behaviour would be a major BC break |
@Ocramius Okay...how about in later versions? i.e. 2.8.x or 3.0.x? I appreciate brevity in many cases, but I feel like "Nope" doesn't do much to address the notion that what I'm describing is an extremely easy trap for a developer (even one who's used Doctrine for years, as I have) to fall into, that it is reasonable to hope that an ORM would at least warn about this (if not stopping it altogether), and that the likelihood of this is increased the more users are encouraged to clear the entire EntityManager rather than specific entity types, as will soon be the case. |
It is not possible to track detached entities: they are detached, unknown to the ORM. In
orm/lib/Doctrine/ORM/UnitOfWork.php Lines 1632 to 1635 in e770913
|
Re-opening this: it is indeed requiring documentation adjustments |
@Ocramius The message there says the DBAL should be throwing an exception instead...so why isn't it in my example? The record it's trying to persist already has an assigned ID so presumably it's trying to execute an insert with that ID already explicitly set, which should be failing somewhere, even if not in the ORM. |
Not sure tbh - docs may as well be unclear there. I suggest digging in the test suite, to see if this behavior has been already checked there. We can probably drop that |
@Ocramius Just ran it again while having it echo out the SQL queries it's actually executing...for the INSERT query it's stripping out the ID entirely as a set value (it's basically just executing a standard In other words, nothing in this stack could've protected me from this situation today. With all due respect, I'm not sure how this is acceptable. |
It may indeed not be OK from your point of view, but this behavior is currently being relied upon: $thing = new Thing();
$em->persist($thing);
$em->flush();
$em->clear();
$em->persist($thing);
$em->flush();
$em->clear();
self::assertCount($em->getRepository(Thing::class)->findAll()); Changing that is indeed a BC break. |
@Ocramius I understand that it is considered a BC break, but clearly many other changes, including changes that relate to this, are intended for the EntityManager in 2.8.x and 3.0. Semver suggests that these releases (especially 3.x) can include BC breaks. Considering that if the EntityManager itself doesn't detect this condition, its removal of auto-assigned IDs will ensure that neither the DBAL nor DB itself can alert for it...does that not suggest that the ORM is the only place where a change could be made at this point to help protect against this happening where it's really not expected? Again, even an option to warn on this condition instead of just letting it happen silently would be greatly preferred. |
As mentioned above:
An un-managed entity that is persisted is assumed to be in |
@Ocramius How is it not possible to detect that the primary key identifier for the entity is already set and has a non-null value on the object being persisted? It appears this was discussed something like 9 years ago, and this exact same suggestion came up and was shot down for reasons that are unclear. Yes, it would require reflection, but the entire persistence mechanism already reflects on the other properties of the object. In fact, to know to strip that ID from the query it already has to know what those identifiers are to exclude them from the properties being set, so what is the difficulty in checking that value by reflection to see if it's set? |
Hello ! Would it be possible to generate a warning if persisting entity, which id should be automatically generated but is yet assigned ? It would resolve the situation and being a warning it's not a BC break. |
An entity with an assigned PK is the default case for all entities I work with (since I also don't use generated identifiers) 🤷♀️ @DGCarramona a warning is a BC break, as discussed in doctrine/coding-standard#150 |
@Ocramius Okay, but let's say that you do use generated identifiers. This is specified in the metadata for that primary key, that it has a GeneratedValue strategy. If you're persisting an entity that has a GeneratedValue identifier, and that property already has a value, then you can safely assume you're playing with a detached entity that previously had that ID generated by the entity manager. At least, safely enough to warn on that exact condition. What's more, it's specifically because these entities are using GeneratedValue identifiers that their ID was stripped out before the DBAL/DB layer (as the ORM is assuming it's going to get a value back for last inserted ID, not send one) that this is an otherwise unprotected situation. |
That's an interesting use-case, indeed I hacked together a test that verifies that generated identifiers are ignored during persistence: <?php
declare(strict_types=1);
namespace Doctrine\Tests\ORM\Functional\Ticket;
use Doctrine\ORM\Annotation as ORM;
use Doctrine\Tests\OrmFunctionalTestCase;
use function random_int;
/**
* @group GH-8007
*/
class GH8007Test extends OrmFunctionalTestCase
{
protected function setUp() : void
{
parent::setUp();
$this->schemaTool->createSchema([$this->em->getClassMetadata(GH8007EntityWithAutoIncrementId::class)]);
}
/**
* Verifies that joined subclasses can contain non-ORM properties.
*/
public function testPersistingAnEntityWithGeneratedIdentifierThatIsGivenAnIdentifierLeadsToAPersistedRecord()
{
$id = random_int(1000, 100000);
$entity = new GH8007EntityWithAutoIncrementId($id);
$this->em->persist($entity);
$this->em->flush();
$this->em->clear();
self::assertNotNull($this->em->find(GH8007EntityWithAutoIncrementId::class, 1));
self::assertNull($this->em->find(GH8007EntityWithAutoIncrementId::class, $id));
}
}
/** @ORM\Entity */
class GH8007EntityWithAutoIncrementId
{
/**
* @ORM\Id()
* @ORM\GeneratedValue(strategy="AUTO")
* @ORM\Column(type="integer")
*/
public $id;
public function __construct(int $id)
{
$this->id = $id;
}
} Guess not using them (and will likely continue avoiding them :P ) has shadowed this obscure behavior |
@SlvrEagle23 what database and Id generator are you using? For MySQL with auto increment I believe we get an exception here from flush as documented. You must be using Postgresql with a Sequence/Serial? |
@beberlei MariaDB 10.2 with Auto-increment, but as mentioned in the above descriptions, the fact that the entity identifier is set as an auto-generated identifier seems to be making the ORM strip it from the subsequent |
@SlvrEagle23 you mention a discussion 9 years ago, can you link it or did i just not see it here? |
@beberlei @Ocramius Hello all, following up on this issue since I notice it's still open and doesn't appear to have any resolution to date. I'm preparing my applications to be able to migrate more easily to Doctrine 3.0, and part of that is removing As a result of these changes, there are now far more instances in my code where one function clears the global EntityManager and other functionality isn't aware of this happening, and doesn't know that it needs to "re-fetch" every entity that it's currently playing with. Because there is no warning or other notice when attempting to persist an entity that already has a generated ID, as in the examples above, this means that any time I understand the reasoning behind removing entity-specific flushes and clears, but it appears that there was no attention paid to the impact this would have on applications when clearing the entire EntityManager becomes the only way of achieving that particular optimization, and there is no documentation anywhere that suggests a way to make this less painful or more intuitive for developers. |
Basically, the tl;dr here is that this comment in this line of code appears to be incorrect in the case of AUTO GeneratedValues: orm/lib/Doctrine/ORM/UnitOfWork.php Lines 1630 to 1634 in 2a98a98
The line where it assumes the state is As far as I can gather as an outside developer, though, the Basically, I could fix this on my end with a single change in a single line of code (specified above) and yet as a consequence of the library's design, I can't, so really I have no option but to continue to follow up here and ask for help. |
@SlvrEagle23 as workaround EntityManager can be decorated with this class: <?php
namespace App\Orm;
use Doctrine\ORM\ORMInvalidArgumentException;
class EntityManagerDecorator extends \Doctrine\ORM\Decorator\EntityManagerDecorator
{
/**
* preventing double rows, the id might change if detached object is persisted
* @see https://github.com/doctrine/orm/issues/8007
*/
public function persist($object) {
if( is_callable([$object,'getId'])){
$oldId = $object->getId();
$this->wrapped->persist($object);
if( !is_null($oldId) && $oldId !== $object->getId()){
throw ORMInvalidArgumentException::detachedEntityCannot($object, 'persisted - id changed by Doctrine');
}
} else {
// not an id based entity
$this->wrapped->persist($object);
}
}
} Adding to symfony configuration: services:
App\Orm\EntityManagerDecorator:
decorates: Doctrine\ORM\EntityManagerInterface |
Bug Report
Summary
After implementing the
doctrine-batch-utils
iterator to process records in one section of my application, I'm discovering that theem->clear()
called during that process detaches entities across the system. If those entities are then re-persisted, they are treated as new entities and given new IDs in the database. There is no exception thrown indicating that these are detached entities, or indicating that this problem might happen. The end result is the unintentional cloning of one record x number of times, where x is the number of times this portion of code has run.The current documentation explicitly states that if you are attempting to persist an entity that is detached, that an exception will be thrown preventing this operation.
My experiences have shown that this exception is not thrown, and that instead a new record is created. The disagreement between the actual behavior and the documentation has caused me an immense deal of unnecessary pain today. The decision on Doctrine's part to deprecate entity-specific
clear()
calls in favor of a system-wide clear only makes this situation that much more likely to happen in the future.Current behavior
When attempting to persist an entity that was made detached by an earlier call to
em->clear()
, rather than throwing an exception upon the nextflush()
call, it allows it and treats it as if it's a new record being added to the database, giving it a new auto-assigned ID but otherwise being a complete duplicate.How to reproduce
Here's an example snippet of a very easy way that I have elicited this problem in my application. It's this exact kind of logic loop that led to the problem in question:
The issue is completely resolved by reloading the
$station
variable at the start of eachforeach
(i.e.$station = $em->find(Station::class, $station->getId())
), but the application currently offers no warning whatsoever that, because the loop variable has become detached, it's about to be recreated as a new entity instead.Expected behavior
As suggested in the documentation, this should instead trigger an exception which can easily be handled by the developer.
The text was updated successfully, but these errors were encountered: