Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fill in missing subclasses when loading ClassMetadata #10411

Merged
merged 3 commits into from
Jan 17, 2023

Conversation

mpdude
Copy link
Contributor

@mpdude mpdude commented Jan 16, 2023

This PR suggests that for inheritance hierarchies, abstract @Entity classes shall be identified automatically, so that they do not have to be declared by users in their @DiscriminatorMap configuration.

Background

For entities and mapped superclasses that are part of an STI or JTI inheritance hierarchy, ClassMetadata::$subClasses contains a list of all entities that are a subclass of the current class. This field is of crucial importance since it is used to enumerate all entity classes and thus tables that need to be dealt with when loading or persisting data, creating the schema etc.

When this list is incomplete, possibly subtle or weird bugs like #9145 or #8127 may occur.

It is not possible in PHP to obtain a list of all existing subclasses for a given class. Also, our mapping drivers cannot provide that list, at least not in a general way and at acceptable cost. For that reason, the discriminator map (DM) at the root entity is used to "discover" all the entity (sub)classes that may need to be persisted in an inheritance tree.

Situation before/without this PR

Runtime validation of the discriminator map allows abstract @Entity classes to be omitted from it. Also, the error message given when a class missing from the DM is detected refers to abstract as an alternative:

public static function mappedClassNotPartOfDiscriminatorMap($className, $rootClassName)
{
return new self(
"Entity '" . $className . "' has to be part of the discriminator map of '" . $rootClassName . "' " .
"to be properly mapped in the inheritance hierarchy. Alternatively you can make '" . $className . "' an abstract class " .
'to avoid this exception from occurring.'
);
}

From the users point of view, there is no point in listing abstract entity classes and assigning a discriminator value for them, when no such entities can be created and persisted in practice. When defining the mapping, the primary concern is to specify discriminator values for the entity classes being used, not to declare a full class hierarchy.

I suspect there may be other bugs lurking in the context of the feature from #1257, where an event listener was added to re-map interfaces to classes in the DM at runtime. When the entity class provided at runtime inherits from another abstract entity, this feature falls short as this extra class cannot be provided.

Suggested change

After ClassMetadata has been loaded for the root entity class in a STI/JTI hierarchy, fill in missing subClasses automatically.

This does not amend, change or autogenerate the DM in any way. It only takes the entity classes given in the DM as starting pointers, works upwards through the class hierarchy from there on and identifies and adds abstract @Entity classes to the subClasses field.

Implementation remarks

There's a challenge since we need to identify all classes that are abstract and @Entity (especially not @MappedSuperclass). That would be easy if we had all the class metadata already loaded.

However, metadata loading happens through doctrine/persistence and goes from root classes towards child classes. So, we will always need to process "higher up" classes first and do not have child class mapping information available at that time.

It also seems unsafe to update parent class metadata at a later time when child class information has been loaded, given that the ClassMetadata is distributed with the loadClassMetadata event and is put into the metadata cache.

So, my choice was to put the initialization at the end of doLoadMetadata: That way, we can run it at the end of loading metadata for the root entity.

  • No need to update/"fixup" the root entity metadata later on
  • Child classes can inherit the subClass field data, keeping only the classes relevant for them (→ we need just one "discovery" run for the root class)

The initialization happens after the loadClassMetadata event (at the risk of having a possibly incomplete subclass list at the time it is run), to allow the mechanism from #1257 to kick in and provide classes before we do the resolution.

Since we need to exclude mapped superclasses from the subClass list, I saw no other option than to make the driver do an extra load request. This, however, only has to happen for classes that we know are abstract (remember, non-abstract ones must be specified in the DM), and the resulting information will be cached with the root entity's metadata.

The driver thing maybe can be improved if drivers get a "shortcut" method to identify mapped superclasses, without doing all the other extra loading. Might be possible to add this through an add-on interface implemented by drivers.

Tests

For a start, this PR includes test cases that demonstrate #6558, #9145 and #8127 would be fixed without having to provide complete (including abstract entities) DMs.

Ideas for other reasonable test cases are welcome!

Related issues

Closes #10389, closes #10387, closes #10388, fixes #8127, includes #9145, includes #6578, fixes #6558.

A follow-up fix for an edge case is in #10557.

A warning is given in this comment that there may still be issues when event listeners are used to provide or change mapping information, which is currently not supported in the new peekIfIsMappedSuperclass() method.

@mpdude mpdude changed the title discover missing subclasses Fill in missing subclasses when loading ClassMetadata Jan 16, 2023
@mpdude mpdude force-pushed the discover-missing-subclasses branch 2 times, most recently from e3ea237 to 4d280b0 Compare January 16, 2023 17:54
@mpdude mpdude force-pushed the discover-missing-subclasses branch from 4d280b0 to 4e8e3ef Compare January 16, 2023 20:33
greg0ire
greg0ire previously approved these changes Jan 16, 2023
Copy link
Member

@greg0ire greg0ire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good, my only qualm is about the remaining TODO, hopefully somebody has a solution for this.

Copy link
Member

@beberlei beberlei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very clever.

Just to clarify, the peekIfIsMappedSuperclass method is only necessary if the discriminator map is "incomplete"?

Should we also add a validation of discriminator map completeness to the schema validator?

@mpdude
Copy link
Contributor Author

mpdude commented Jan 16, 2023

@beberlei

Just to clarify, the peekIfIsMappedSuperclass method is only necessary if the discriminator map is "incomplete"?

Yes.

After a root entity class metadata has been loaded, take all the entity classes declared in the DM as starting points. Work upwards from there on, using the driver's isTransient method to find only non-transient classes. For those, use reflection to check whether the class is abstract. Only for such candidates, have an extra driver request to figure out if it's a MappedSuperclass or not.

Looking at the AnnotationDriver, isTransient has to inspect annotations already, so a dedicated method might bring the final (conclusive) result at the same cost; that's an optimization I'd like to keep for a follow-up PR.

Should we also add a validation of discriminator map completeness to the schema validator?

It's there already, and it is stricter (abstract @Entity must be declared) than the runtime validation. #9096 and #9142 have the discussion.

IMHO, if we get this PR here merged, we should relax the validator by re-applying #9096.

/** @param class-string $className */
private function peekIfIsMappedSuperclass(string $className): bool
{
// TODO can we shortcut this for classes that have already been loaded? can that be the case at all?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this a bit more... Class metadata is loaded from root classes downwards, and so we find root entities/DMs before the other classes have their metadata loaded.

So, I'd say there is no better/cheaper place where we might look for the information at that time...?

Comment on lines +390 to +394
$reflService = $this->getReflectionService();
$class = $this->newClassMetadataInstance($className);
$this->initializeReflection($class, $reflService);

$this->driver->loadMetadataForClass($className, $class);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if all that is really necessary for what we're trying to achieve here... But it's probably a safe default to start with, and may go away with a bit more sophisticated driver support.

@mpdude
Copy link
Contributor Author

mpdude commented Jan 16, 2023

@greg0ire Thought about the TODO a little, maybe there's nothing that needs to be done. See comment above.

@greg0ire greg0ire added this to the 2.14.2 milestone Jan 17, 2023
@greg0ire greg0ire merged commit 1090dbe into doctrine:2.14.x Jan 17, 2023
@greg0ire
Copy link
Member

Thanks @mpdude !

@mpdude mpdude deleted the discover-missing-subclasses branch January 17, 2023 18:33
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Jan 17, 2023
Since doctrine#10411 has been merged, no more need to specify all intermediate abstract entity classes.

Thus, we can relax the schema validator check as requested in doctrine#9095. The reasons given in doctrine#9142 no longer apply.
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Jan 18, 2023
This removes the unnecessary "middle2" class and puts back in the effect of the data provider.

Both changes were accidentally committed when I was working on doctrine#10411 and just meant as experiments during debugging.
@mpdude mpdude mentioned this pull request Jan 18, 2023
derrabus pushed a commit that referenced this pull request Jan 18, 2023
* Fixup GH8127 test case

This removes the unnecessary "middle2" class and puts back in the effect of the data provider.

Both changes were accidentally committed when I was working on #10411 and just meant as experiments during debugging.

* Fix CS
greg0ire pushed a commit to mpdude/doctrine2 that referenced this pull request Feb 5, 2023
Since doctrine#10411 has been merged, no more need to specify all intermediate
abstract entity classes.

Thus, we can relax the schema validator check as requested in doctrine#9095. The
reasons given in doctrine#9142 no longer apply.

Fixes doctrine#9095
Closes doctrine#9096
greg0ire pushed a commit to mpdude/doctrine2 that referenced this pull request Feb 5, 2023
Since doctrine#10411 has been merged, no more need to specify all intermediate
abstract entity classes.

Thus, we can relax the schema validator check as requested in doctrine#9095. The
reasons given in doctrine#9142 no longer apply.

Fixes doctrine#9095
Closes doctrine#9096
greg0ire pushed a commit to mpdude/doctrine2 that referenced this pull request Feb 5, 2023
Since doctrine#10411 has been merged, no more need to specify all intermediate
abstract entity classes.

Thus, we can relax the schema validator check as requested in doctrine#9095. The
reasons given in doctrine#9142 no longer apply.

Fixes doctrine#9095
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Mar 2, 2023
…ss()` method

[This comment](doctrine#10473 (comment)) hints to a case where the `ClassMetadataFactory::peekIfIsMappedSuperclass()` method introduced in doctrine#10411 causes a failure.

`CMF::peekIfIsMappedSuperclass()` has to perform improvised metadata loading in a situation where the CMF is currently loading a class. So, we cannot use the full/real `ClassMetadataFactory` mechanisms, since it would require a re-entry for a subclass of the current class, causing an infinite loop (loads parent classes first, and that's what we're currently doing).

The problem is that the improvised call to `$driver->loadMetadataForClass()` cannot provide a pre-filled `ClassMetadata` instance populated with all parent class fields and associations. But, when attribute or association overrides are used, a check is made to see if the overridden field/association actually exists, and this information is missing in that situation.

This PR suggests to override the methods to get around this. In fact, we do not care about all these details, we only want to ask the driver if the class is a mapped superclass or not.

A much better fix would be to have a dedicated method on the driver to ask it just that particular question (also better performance-wise). But I do not see how we could get that done in a BC way – ideas? 💡

A few things that need to come together to make the bug surface:

* Load an entity declaring an inheritance tree
* There must be a mapped superclass in the inheritance tree to provide the field that shall be overriden
* An entity class must inherit from the mapped superclass and override the field
* That entity class must be an abstract, intermediate class not be declared in the discriminator map so we can "discover" it
* The overriden property must be private so the mapping drivers (using reflection) do not see it when looking at the overriding entity class.
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Mar 8, 2023
…ss()` method

[This comment](doctrine#10473 (comment)) hints to a case where the `ClassMetadataFactory::peekIfIsMappedSuperclass()` method introduced in doctrine#10411 causes a failure.

`CMF::peekIfIsMappedSuperclass()` has to perform improvised metadata loading in a situation where the CMF is currently loading a class. So, we cannot use the full/real `ClassMetadataFactory` mechanisms, since it would require a re-entry for a subclass of the current class, causing an infinite loop (loads parent classes first, and that's what we're currently doing).

The problem is that the improvised call to `$driver->loadMetadataForClass()` cannot provide a pre-filled `ClassMetadata` instance populated with all parent class fields and associations. But, when attribute or association overrides are used, a check is made to see if the overridden field/association actually exists, and this information is missing in that situation.

This PR suggests to override the methods to get around this. In fact, we do not care about all these details, we only want to ask the driver if the class is a mapped superclass or not.

A much better fix would be to have a dedicated method on the driver to ask it just that particular question (also better performance-wise). But I do not see how we could get that done in a BC way – ideas? 💡

A few things that need to come together to make the bug surface:

* Load an entity declaring an inheritance tree
* There must be a mapped superclass in the inheritance tree to provide the field that shall be overriden
* An entity class must inherit from the mapped superclass and override the field
* That entity class must be an abstract, intermediate class not be declared in the discriminator map so we can "discover" it
* The overriden property must be private so the mapping drivers (using reflection) do not see it when looking at the overriding entity class.
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Mar 8, 2023
…ss()` method

[This comment](doctrine#10473 (comment)) hints to a case where the `ClassMetadataFactory::peekIfIsMappedSuperclass()` method introduced in doctrine#10411 causes a failure.

`CMF::peekIfIsMappedSuperclass()` has to perform improvised metadata loading in a situation where the CMF is currently loading a class. So, we cannot use the full/real `ClassMetadataFactory` mechanisms, since it would require a re-entry for a subclass of the current class, causing an infinite loop (loads parent classes first, and that's what we're currently doing).

The problem is that the improvised call to `$driver->loadMetadataForClass()` cannot provide a pre-filled `ClassMetadata` instance populated with all parent class fields and associations. But, when attribute or association overrides are used, a check is made to see if the overridden field/association actually exists, and this information is missing in that situation.

This PR suggests to override the methods to get around this. In fact, we do not care about all these details, we only want to ask the driver if the class is a mapped superclass or not.

A much better fix would be to have a dedicated method on the driver to ask it just that particular question (also better performance-wise). But I do not see how we could get that done in a BC way – ideas? 💡

A few things that need to come together to make the bug surface:

* Load an entity declaring an inheritance tree
* There must be a mapped superclass in the inheritance tree to provide the field that shall be overriden
* An entity class must inherit from the mapped superclass and override the field
* That entity class must be an abstract, intermediate class not be declared in the discriminator map so we can "discover" it
* The overriden property must be private so the mapping drivers (using reflection) do not see it when looking at the overriding entity class.
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Mar 8, 2023
…lassMetadataFactory::peekIfIsMappedSuperclass()` method introduced in doctrine#10411 causes a failure.

`CMF::peekIfIsMappedSuperclass()` has to perform improvised metadata loading in a situation where the CMF is currently loading a class. So, we cannot use the full/real `ClassMetadataFactory` mechanisms, since it would require a re-entry for a subclass of the current class, causing an infinite loop (loads parent classes first, and that's what we're currently doing).

The problem is that the improvised call to `$driver->loadMetadataForClass()` cannot provide a pre-filled `ClassMetadata` instance populated with all parent class fields and associations. But, when attribute or association overrides are used, a check is made to see if the overridden field/association actually exists, and this information is missing in that situation.

This PR suggests to override the methods to get around this. In fact, we do not care about all these details, we only want to ask the driver if the class is a mapped superclass or not.

A much better fix would be to have a dedicated method on the driver to ask it just that particular question (also better performance-wise). But I do not see how we could get that done in a BC way – ideas? 💡

A few things that need to come together to make the bug surface:

* Load an entity declaring an inheritance tree
* There must be a mapped superclass in the inheritance tree to provide the field that shall be overriden
* An entity class must inherit from the mapped superclass and override the field
* That entity class must be an abstract, intermediate class not be declared in the discriminator map so we can "discover" it
* The overriden property must be private so the mapping drivers (using reflection) do not see it when looking at the overriding entity class.
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Mar 8, 2023
…lassMetadataFactory::peekIfIsMappedSuperclass()` method introduced in doctrine#10411 causes a failure.

`CMF::peekIfIsMappedSuperclass()` has to perform improvised metadata loading in a situation where the CMF is currently loading a class. So, we cannot use the full/real `ClassMetadataFactory` mechanisms, since it would require a re-entry for a subclass of the current class, causing an infinite loop (loads parent classes first, and that's what we're currently doing).

The problem is that the improvised call to `$driver->loadMetadataForClass()` cannot provide a pre-filled `ClassMetadata` instance populated with all parent class fields and associations. But, when attribute or association overrides are used, a check is made to see if the overridden field/association actually exists, and this information is missing in that situation.

This PR suggests to override the methods to get around this. In fact, we do not care about all these details, we only want to ask the driver if the class is a mapped superclass or not.

A much better fix would be to have a dedicated method on the driver to ask it just that particular question (also better performance-wise). But I do not see how we could get that done in a BC way – ideas? 💡

A few things that need to come together to make the bug surface:

* Load an entity declaring an inheritance tree
* There must be a mapped superclass in the inheritance tree to provide the field that shall be overriden
* An entity class must inherit from the mapped superclass and override the field
* That entity class must be an abstract, intermediate class not be declared in the discriminator map so we can "discover" it
* The overriden property must be private so the mapping drivers (using reflection) do not see it when looking at the overriding entity class.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants