EZP-30296: Large API subtree delete leads to serious Solr index inconsistencies #150
Conversation
The change to deleteAllItemsWithoutAdditionalLocation() seems good. The one on re-indexing the subtree seems like it needs a bit more work; maybe split it into a separate PR/issue, depending on the approach you choose, so we can move fast on the delete issue.
Side note: this should ideally be applied to earlier branches as well, even 1.0 if it applies (at least for the delete change), but at least to 1.5.
@@ -15,7 +15,7 @@ parameters:
     ezpublish.search.solr.slot.move_user_group.class: eZ\Publish\Core\Search\Common\Slot\MoveUserGroup
     ezpublish.search.solr.slot.delete_user_group.class: eZ\Publish\Core\Search\Common\Slot\DeleteUserGroup
     ezpublish.search.solr.slot.copy_subtree.class: eZ\Publish\Core\Search\Common\Slot\CopySubtree
-    ezpublish.search.solr.slot.move_subtree.class: eZ\Publish\Core\Search\Common\Slot\MoveSubtree
+    ezpublish.search.solr.slot.move_subtree.class: EzSystems\EzPlatformSolrSearchEngine\Slot\MoveSubtree
So I'm a bit torn here
- bulkIndexContent should be added to an SPI interface, and the handler here (and others that can) should implement it. Then we can update the kernel's AbstractSubtree::indexSubtree() to take advantage.
If we don't do that, we'll need to overload all subtree slots here (at least hide and unhide are missing, from what I can see now in the 6.13 branch).
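The SPI addition suggested above could look roughly like the sketch below. This is an assumption, not the actual ezpublish-kernel SPI: the interface name BulkIndexingCapableHandler, the method signature, and the in-memory implementation are all hypothetical, only illustrating the contract of indexing several items in one backend round trip.

```php
<?php
// Hypothetical sketch only: interface and class names are assumptions,
// not the actual ezpublish-kernel SPI.

interface BulkIndexingCapableHandler
{
    /**
     * Index several content items in one backend round trip,
     * instead of issuing one indexing request per document.
     *
     * @param object[] $contentItems
     */
    public function bulkIndexContent(array $contentItems);
}

// Trivial in-memory implementation, just to illustrate the contract.
class InMemorySearchHandler implements BulkIndexingCapableHandler
{
    /** @var object[] */
    public $indexed = [];

    public function bulkIndexContent(array $contentItems)
    {
        // A real Solr handler would map all items to documents here
        // and send a single update request to the backend.
        $this->indexed = array_merge($this->indexed, $contentItems);
    }
}

$handler = new InMemorySearchHandler();
$handler->bulkIndexContent([(object) ['id' => 1], (object) ['id' => 2]]);
echo count($handler->indexed), "\n"; // prints 2
```

With such an interface in place, AbstractSubtree::indexSubtree() could batch documents instead of indexing one by one, which is the point of the suggestion above.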
As discussed on Slack, I'll do it in a separate PR(s).
It seems that my PR's tests are failing in the same places as the one done by @mateuszbieniek (#149). I'll try to fix these tests, as I would love to have Travis green before merge.
yes please 😁
+1 (maybe backports from 1.6/1.7 are missing, which makes the tests fail on kernel 7.5? If so, maybe the solution is to stop supporting 7.5.x on the 1.5 branch 🤷♂ and make it 1.x only.)
Regarding the PHP 5.6 failure with Composer running out of memory: I think QA may have done quite a bit of work on solving this in other places, cc @mnocon
I can't help you with the failing tests, but I have a solution for the memory issues 😉 See #151
I'm almost 100% sure that the failing tests are related to a missing commit, the one from this PR: #133, to be more precise. I've temporarily pushed it here to check whether it indeed solves the problem. Unfortunately, it can't stay like this, since
@andrerom last build (https://travis-ci.org/ezsystems/ezplatform-solr-search-engine/jobs/589992661) confirms my suspicion. How would you like to proceed? Personally, I would drop the possibility to install
Edit: Ok, it's green now! @andrerom / @alongosz / @mnocon / @adamwojs / @lserwatka, are you ok with the change I made in 18c9c5e? It will be applied to
@@ -12,7 +12,7 @@
     ],
     "require": {
         "php": "~5.6|~7.0",
-        "ezsystems/ezpublish-kernel": "~6.7.10@dev|^6.13.6@dev|~7.3.5@dev|^7.4.3@dev",
+        "ezsystems/ezpublish-kernel": "~6.7.10@dev|^6.13.6@dev|~7.3.5@dev|~7.4.3@dev",
I'm ok with this, so +1 once PR is green ;)
TODO: Drop this line on merge up.
@@ -328,7 +334,7 @@ public function commit($flush = false)
      */
     protected function deleteAllItemsWithoutAdditionalLocation($locationId)
     {
-        $query = $this->prepareQuery();
+        $query = $this->prepareQuery(self::SOLR_MAX_QUERY_LIMIT);
Ok, I get why you wanted a really high limit now.
I think, unfortunately, you'll have to change the logic to iterate on this if you get back this many hits; or you could get the total count, which will tell you (roughly*) how many iterations you need to make, right?
* In the freak case of parallel async processes adding locations to this tree at the same time for some reason.
I think unfortunately you'll have to change logic to iterate on this if you get back this amount of hits, or you might get total count which will tell you how many iterations you need to make right?
It was my first idea, but @adamwojs was really against this approach, since it might be expensive, performance-wise, to run the same query twice: first to count, and then to load the locations. WDYT @andrerom?
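The count-then-iterate approach discussed above can be sketched as follows. This is a minimal illustration, not the extension's actual code: the batchCount() helper is hypothetical, and it only shows the ceil division that derives the number of delete passes from a total count and a per-query limit.

```php
<?php
// Hypothetical sketch: derive the number of fixed-size delete passes
// from a total hit count, instead of relying on one very large limit.

function batchCount($totalHits, $limit)
{
    // Ceil division: how many query iterations cover all hits.
    return (int) ceil($totalHits / $limit);
}

// E.g. 2500 hits with a per-query limit of 1000 need 3 passes.
echo batchCount(2500, 1000), "\n"; // prints 3
echo batchCount(1000, 1000), "\n"; // prints 1
```

As the comment above notes, the count is only roughly right if parallel processes add locations to the subtree concurrently, so a real implementation would still want a final fetch-until-empty pass.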
But that said, that can be a follow-up as well. You definitely improved the logic here, as the previous limit was 1000...
So ignore this for now, unless you feel up for it in this PR.
As said above, this is already way better than what is there today. So I'm fine with this as is.
@micszo, since it's green now, could you please test it again?
QA Approved on eZ Platform EE v2.5.5 with diff (w/o 18c9c5e).
@kmadejski you can merge it up, but please remember to drop the line from composer.json
@kmadejski make sure you merge to 1.6, 1.7, 2.0 and master (3.0)
@lserwatka done in e66f9e8 / 205a256 / f95f0f3 / 2533029
JIRA ticket: https://jira.ez.no/browse/EZP-30296
Bulk deleting using deleteByQuery has been introduced. I couldn't find specific information about Solr's bulk operation size limit, at least not in its documentation, but I think it's good to have some limit, to prevent exceeding the GET length limit, for instance.
After a talk with @andrerom and @alongosz I decided to move further work on the slots to a separate PR. Therefore, this one covers only bulk deleting.
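For reference, a Solr delete-by-query can be expressed as a JSON payload POSTed to the core's update endpoint, which is what sidesteps the GET length limit mentioned above. The sketch below only builds that payload; the field name location_id is an assumption for illustration, not necessarily the extension's actual schema field.

```php
<?php
// Hypothetical sketch: build a Solr delete-by-query JSON payload.
// The field name "location_id" is an assumption, not necessarily the
// extension's schema; a real request would POST this body to
// /solr/<core>/update with Content-Type: application/json.

function buildDeleteByQueryPayload(array $locationIds)
{
    // Cast each ID to int to keep the query free of injected syntax.
    $ids = implode(' OR ', array_map('intval', $locationIds));

    return json_encode(['delete' => ['query' => "location_id:({$ids})"]]);
}

echo buildDeleteByQueryPayload([1, 2, 3]), "\n";
// prints {"delete":{"query":"location_id:(1 OR 2 OR 3)"}}
```

Because the query travels in the POST body rather than the URL, the remaining practical bound is the number of IDs you are willing to pack into one boolean query, which is why keeping some batch limit still makes sense.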