Leverage the Lucene's Matches API in a new highlighter type #34015

Closed
jimczi opened this issue Sep 24, 2018 · 9 comments
Labels
>feature · :Search Relevance/Highlighting (How a query matched a document) · Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch)

Comments

@jimczi
Contributor

jimczi commented Sep 24, 2018

We should leverage the Matches API capabilities to build a new Highlighter type. This new highlighter would:

  • Highlight the text part that matches any boolean query accurately.
  • Not split phrase queries into individual terms.
  • Accept matched_fields natively.
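For context, here is a minimal sketch of the kind of request this proposal is about: a boolean query built from phrase clauses, with highlighting that combines matches from several sub-fields through Elasticsearch's `matched_fields` option. The field names (`comment`, `comment.plain`), the phrase, and the helper name `build_highlight_request` are assumptions made for illustration. The `fvh` type is used only because that is the highlighter for which `matched_fields` was documented when this issue was opened; it does not represent the proposed new highlighter.

```python
import json


def build_highlight_request(field, phrase, matched_fields):
    """Build a search body: one phrase clause per matched field,
    with highlights from all of them merged onto `field`."""
    return {
        "query": {
            "bool": {
                # One phrase clause per field we want matches from.
                "should": [
                    {"match_phrase": {name: phrase}} for name in matched_fields
                ]
            }
        },
        "highlight": {
            "fields": {
                field: {
                    # Assumption: fvh, since it supported matched_fields in 2018.
                    "type": "fvh",
                    "matched_fields": list(matched_fields),
                }
            }
        },
    }


# Hypothetical field names and phrase, for illustration only.
body = build_highlight_request(
    "comment", "quick brown fox", ["comment", "comment.plain"]
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request against an index whose mapping defines both fields.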
@jimczi jimczi added >feature :Search Relevance/Highlighting How a query matched a document labels Sep 24, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-search-aggs

@mgrishaber-merlin

The current highlighters are far too slow and tend to bloat the index.
I have a request for a new highlighter that returns a simple word list as part of the JSON, rather than returning marked-up HTML text for each hit. So if I get 1000 hits for a query, I would expect to get back a single word list that contains all of the words that matched, rather than 1000 HTML snippets.
Something like the following:
"hits": {
  "total": 1000,
  "max_score": null,
  "hitlist": [
    "ski",
    "skiier",
    "skiing",
    "skiied",
    "skiis"
  ],
  "hits": [
    {
      "_index": "myindex",
      "_type": "mytype",
      "_id": "1",
      "_score": 2.7186205,
      "sort": [
        1438971741000,
        2.7186205
      ]
    },
    ...
    {
      "_index": "myindex",
      "_type": "mytype",
      "_id": "1000",
      "_score": 2.7186205,
      "sort": [
        1438971741000,
        2.7186205
      ]
    }
  ]
}
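Until something like the `hitlist` above exists, a similar word list can be approximated on the client side. The sketch below assumes a search response whose hits carry a `highlight` section using the default `<em>` and `</em>` tags. The sample response, the `body` field name, and the helper `matched_words` are made up for illustration. This is a workaround only: it still pays the cost of highlighting every hit, so it does not address the speed concern.

```python
import re

# Default Elasticsearch highlight tags; adjust if pre_tags/post_tags are customised.
EM_TAG = re.compile(r"<em>(.*?)</em>")


def matched_words(response):
    """Collect the distinct highlighted words across all hits, lower-cased and sorted."""
    words = set()
    for hit in response["hits"]["hits"]:
        # "highlight" maps each field name to a list of fragments.
        for fragments in hit.get("highlight", {}).values():
            for fragment in fragments:
                words.update(word.lower() for word in EM_TAG.findall(fragment))
    return sorted(words)


# Hypothetical response, trimmed to the parts the helper reads.
sample = {
    "hits": {
        "hits": [
            {"_id": "1", "highlight": {"body": ["We went <em>skiing</em> on new <em>skis</em>"]}},
            {"_id": "2", "highlight": {"body": ["A <em>ski</em> trip, <em>Skiing</em> all day"]}},
            {"_id": "3"},  # a hit without highlights is skipped
        ]
    }
}

print(matched_words(sample))  # ['ski', 'skiing', 'skis']
```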

Thanks
Mike

@jimczi
Contributor Author

jimczi commented Oct 19, 2018

That's a different thing, @mgrishaber-merlin, and probably more related to #34214, which is a proposal to extend named queries to return the terms that match.

@oersted

oersted commented Dec 6, 2018

How is progress on this? More accurate highlights and not splitting phrase matches are important properties for our use case.

@StijnArnauts

Has there been any progress on this?

@rjernst rjernst added the Team:Search Meta label for search team label May 4, 2020
@Daantie

Daantie commented Jul 14, 2020

I see a new label was added on May 5th. Can you give us an update on the progress of this?

@stefanobranco

Any progress? This is a pretty big concern for us, so if there's any way for us to support this feel free to let us know.

Also, is it correct that this would also allow intervals queries to be highlighted correctly?

@OFeshchenko

Is there any progress on this?

@mayya-sharipova
Contributor

mayya-sharipova commented Mar 14, 2024

@OFeshchenko and others: reporting on progress. Although we did not implement a new highlighter, since Elasticsearch 8.10 the unified highlighter uses the Lucene Matches API by default, which allows queries to be highlighted accurately.
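A rough way to check this behaviour on your own data is sketched below. It builds a phrase query with the `unified` highlighter requested explicitly, then tests whether a returned fragment marks the phrase as a single unit rather than word by word. The field name, the phrase, both sample fragments, and the helper names are assumptions for illustration and are not output from a real cluster. Whether a given phrase comes back as one unit may also depend on the analyzer and version, so treat this as a sanity check, not a specification.

```python
def unified_phrase_request(field, phrase):
    """Search body for a phrase query highlighted with the unified highlighter."""
    return {
        "query": {"match_phrase": {field: phrase}},
        "highlight": {"fields": {field: {"type": "unified"}}},
    }


def phrase_marked_whole(fragment, phrase, pre="<em>", post="</em>"):
    """True if the fragment wraps the entire phrase in one tag pair (case-insensitive)."""
    return f"{pre}{phrase}{post}".lower() in fragment.lower()


# Hypothetical fragments showing the two shapes a highlighter could return.
whole = "The <em>quick brown fox</em> jumps"
split = "The <em>quick</em> <em>brown</em> <em>fox</em> jumps"

request_body = unified_phrase_request("comment", "quick brown fox")
print(phrase_marked_whole(whole, "quick brown fox"))
print(phrase_marked_whole(split, "quick brown fox"))
```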

Closing this, as we have decided not to implement a new highlighter, but rather focus on enhancing unified highlighter.

@javanna javanna added Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch and removed Team:Search Meta label for search team labels Jul 12, 2024