Rare Terms Aggregation Performance Optimization #13122

sandeshkr419 · 2024-04-08T17:49:25Z

I'm not sure how the Rare Terms Aggregation currently performs, but from a high-level look at the code, it appears that this aggregation also iterates through each document.

The idea is to use the term frequencies from Lucene, similar to #11643, and avoid iterating through individual documents.

Next Steps:

Measure/gather the existing performance of the rare terms aggregation
Improve the implementation if the idea above turns out to be applicable
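
To make the idea concrete, here is a rough sketch in Python rather than the actual aggregator code. It models each segment's term dictionary as a plain dict of term to document frequency and sums those counts instead of visiting documents. The function name `rare_terms_from_doc_freq`, the `segments` structure, and the `max_doc_count` default are hypothetical names chosen for illustration. The sketch also assumes the stored frequencies are only trustworthy when the query matches every document and the segments contain no deleted documents; that condition is an assumption here, not something confirmed in this thread.

```python
from collections import Counter


def rare_terms_from_doc_freq(segments, max_doc_count=1):
    """Return terms whose summed document frequency is <= max_doc_count.

    `segments` is a hypothetical stand-in for per-segment term dictionaries:
    an iterable of dicts mapping term -> document frequency in that segment.
    Assumption: the query matches all documents and there are no deletions,
    so the stored frequencies equal the real per-term document counts.
    """
    totals = Counter()
    for term_doc_freq in segments:
        # One addition per (segment, term) pair, not one step per document.
        totals.update(term_doc_freq)
    return {term: count for term, count in totals.items() if count <= max_doc_count}


if __name__ == "__main__":
    segments = [{"a": 3, "b": 1}, {"b": 1, "c": 1}]
    print(rare_terms_from_doc_freq(segments, max_doc_count=1))
```

The work in this model scales with the number of distinct terms per segment rather than with the number of documents, which is where any gain would come from. Whether that holds for the real aggregation still needs the measurements listed above.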

peternied · 2024-04-10T15:22:19Z

[Triage - attendees 1 2 3 4 5 6]

This looks like a duplicate of [RFC] "Significant Terms" Aggregation Performance Ideas #13124

@sandeshkr419 Let's make these issues distinct if they need to be tracked separately, but overall, capturing ideas around aggregation performance seems like a single topic.

sandeshkr419 · 2024-04-11T18:23:35Z

Hi @peternied - keeping these issues separate, since the underlying search operations, their code flows, and the ideas to optimize them will be different. They do both fall under the aggregation category, and there is a probability that they may share some optimization ideas, but for now let's track each of them separately without one being influenced by the other.

sandeshkr419 added Search:Aggregations Search:Performance labels Apr 8, 2024

github-actions bot added the untriaged label Apr 8, 2024

peternied closed this as completed Apr 10, 2024

sandeshkr419 reopened this Apr 11, 2024

sandeshkr419 removed the untriaged label Apr 17, 2024
