Fix ElasticSearch error with filter deletion #584
When a filter deletion involves a lot of equipments to delete, ES returns an "All shards have failed" error.
That's because we ask Elasticsearch to return all the equipment infos matching the ids of the deleted identifiables. If a lot of equipments are deleted, the following list:

```java
deletedEquipments.stream().map(EquipmentInfosToDelete::id).toList(),
```

can contain more than 10000 elements.
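For context, the failing lookup looks roughly like the following reconstruction (`equipmentInfosRepository.findAllById` is an assumption standing in for the actual pre-fetch, and `EquipmentInfos` is the indexed document type):

```java
// Reconstruction of the problematic pattern: all deleted ids are sent to
// Elasticsearch in a single request just to learn which documents exist.
List<String> deletedIds = deletedEquipments.stream()
        .map(EquipmentInfosToDelete::id)
        .toList();
// With more than 10000 ids, this lookup is what triggers the
// "All shards have failed" error.
Iterable<EquipmentInfos> existingInfos = equipmentInfosRepository.findAllById(deletedIds);
```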
This equipment infos list is then used to avoid sending delete requests to ES for equipments that don't exist in the index. This is not necessary, as Elasticsearch can handle missing documents directly. From my tests, it is also more efficient to send delete requests that include non-existent documents than to fetch the documents first and delete only the existing ones, as sketched below.
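A minimal sketch of the simplified flow, assuming a Spring Data Elasticsearch repository whose `deleteAllById` issues the bulk delete (the repository name is carried over from the reconstruction above):

```java
// Delete by id directly: Elasticsearch reports missing ids as "not_found" in
// the bulk response instead of failing the request, so the existence check
// can be dropped entirely.
equipmentInfosRepository.deleteAllById(
        deletedEquipments.stream().map(EquipmentInfosToDelete::id).toList());
```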
I added another constant for partitioning deletions, as the existing one is too big for delete requests: if a request contains too many ids, the resulting query has more clauses than Elasticsearch can handle, and it fails with the maxClauseCount limit error on my local deployment. A sketch of the partitioned deletion follows.
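In this sketch, the constant name is hypothetical, the value 2048 is the default described below, and the slicing uses plain `List.subList` rather than whatever helper the code actually uses:

```java
static final int PARTITION_SIZE_FOR_DELETION = 2048;

void deleteEquipmentInfos(List<String> ids) {
    // Send one delete request per chunk so that no single request produces
    // more query clauses than Elasticsearch's maxClauseCount allows.
    for (int from = 0; from < ids.size(); from += PARTITION_SIZE_FOR_DELETION) {
        int to = Math.min(from + PARTITION_SIZE_FOR_DELETION, ids.size());
        equipmentInfosRepository.deleteAllById(ids.subList(from, to));
    }
}
```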
According to the documentation, the version we use determines maxClauseCount heuristically from the available memory/heap (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-settings.html).
I set the new constant to 2048 by default to stay below the limit on our local deployment, but this value could be overridden in other environments to improve the performance of deletions involving a large number of ids.
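If another environment needs a different partition size, one option (a sketch; the property name is hypothetical) is to expose the constant as a Spring configuration property instead:

```java
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

@Component
class EquipmentInfosDeletionConfig {
    // Hypothetical property name; defaults to 2048 as described above and can
    // be overridden per environment (e.g. in application.yaml).
    @Value("${equipment-infos.delete-partition-size:2048}")
    int deletePartitionSize;
}
```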