-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IBX-704: Added aggregation support to NodeFactory #1792
IBX-704: Added aggregation support to NodeFactory #1792
Conversation
@webhdx would be great for reviewing this next week. |
ping @ezsystems/engineering-team |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not too experienced with SOLR/Elastic usage so it needs attention from Adam or Andrew. Anyway I don't quite like the general idea here. NodeFactory becomes more and more complicated/bloated with all performance improvements. Would be nice to provide alternative NodeFactory which could utilize different handling when ES/SOLR is being used. But I understand this would require more refactoring and providing some extension points. I put some remarks in the comments, please consider them.
@@ -33,6 +34,7 @@ final class NodeFactory | |||
'DatePublished' => SortClause\DatePublished::class, | |||
'ContentName' => SortClause\ContentName::class, | |||
]; | |||
private const MAX_AGGREGATED_LOCATION_IDS = 100; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be configurable if there is possibility to fine tune performance on different setups.
* | ||
* @throws \eZ\Publish\API\Repository\Exceptions\InvalidArgumentException | ||
*/ | ||
private function supplyChildrenCount(Node $node, array $aggregationResult = null): void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be cleaner if you used empty array instead of null
value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing null
here indicates that Aggregation results are not available (for example: using a legacy search engine) and the "legacy" way of counting children should be used inside the supplyChildrenCount
method. An empty array is a totally valid result of aggregation. I can change it to pass to an empty array, but we will have to introduce something like $isAggregationSupported
and pass it alongside to supplyChildrenCount
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be cleaner if you used empty array instead of
null
value.
Passing
null
here indicates that Aggregation results are not available (for example: using a legacy search engine) and the "legacy" way of counting children should be used inside thesupplyChildrenCount
method. An empty array is a totally valid result of aggregation. I can change it to pass to an empty array, but we will have to introduce something like$isAggregationSupported
and pass it alongside tosupplyChildrenCount
@mateuszbieniek I don't see how distinguishing between []
and null
would change the behavior of this method in its current state. Could you show a snippet with []
instead of null
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two possibilities:
-
Aggregations are not supported (
$aggregationResult
isnull
). Every single container node will have$this->countSubitems($node->locationId);
called -
Aggregations are supported and
$aggregationResult
is an array.
$totalCount = isset($aggregationResult[$node->locationId]) ?
$aggregationResult[$node->locationId] :
0;
If $aggregationResult
has key $node->locationId
then node has children, if a key does not exist, this means that there children count is 0
;
So, if I pass it as empty array, when aggregations are not supported current state of method will result on 0 children count for all nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can change it to an empty array but will have to add a check for aggregation support inside supplyChildrenCount
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed the check inside from supplyChildrenCount
:
if ($aggregationResult) {
to
if ($aggregationResult !== null) {
This should reduce the confusion.
* | ||
* @throws \eZ\Publish\API\Repository\Exceptions\InvalidArgumentException | ||
*/ | ||
private function supplyChildrenCount(Node $node, array $aggregationResult = null): void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be cleaner if you used empty array instead of
null
value.
Passing
null
here indicates that Aggregation results are not available (for example: using a legacy search engine) and the "legacy" way of counting children should be used inside thesupplyChildrenCount
method. An empty array is a totally valid result of aggregation. I can change it to pass to an empty array, but we will have to introduce something like$isAggregationSupported
and pass it alongside tosupplyChildrenCount
@mateuszbieniek I don't see how distinguishing between []
and null
would change the behavior of this method in its current state. Could you show a snippet with []
instead of null
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please share some basic numbers in PR description as this is an optimisation patch?
Those are false positives - SonarCloud messes something up... |
Co-authored-by: Adam Wójs <adam@wojs.pl>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the change to the method behavior it's clear why you need it there. With test coverage we could avoid this discussion, because you would spot a mistake immediately.
One final request related to forward compatibility of rebranding:
…y.max_location_ids_in_single_aggregation
Kudos, SonarCloud Quality Gate passed! |
You can merge it up now. |
* Added aggregation support to NodeFactory * Changed to use LocationChildrenTermAggregation * LocationChildrenTermAggregation moved to different namespace in kernel * Changes after CR * CS Fix * Changes after CR#3 * CS fix * Fixups after CR * Update src/lib/UI/Module/ContentTree/NodeFactory.php Co-authored-by: Adam Wójs <adam@wojs.pl> * Changed the parameter name to ibexa.admin_ui.content_tree.node_factory.max_location_ids_in_single_aggregation Co-authored-by: Adam Wójs <adam@wojs.pl> (cherry picked from commit 64b5a09)
Jira's issue: https://issues.ibexa.co/browse/IBX-704
requires:
ezsystems/ezplatform-kernel#215
ezsystems/ezplatform-solr-search-engine#215
Children counts call to Solr can create an overhead for
load-subtree
calls.This PR adds Aggregation support for
NodeFactory
to speed it up when supported.