Simplify how source is passed to fetch subphases. #65292
@@ -546,3 +546,46 @@
- match: {hits.hits.0.highlight.company.0: "<em>ABC</em> company"}
- match: {hits.hits.1.highlight.company.0: "<em>ABC</em> <em>ABCD</em> company"}
- match: {hits.hits.2.highlight.company.0: "<em>ABCD</em> company"}

---
# This test guards against a subtle edge case in the fetch phase related to source-loading and
This protects against reintroducing the bug in #31000. I should have added this test when fixing the bug originally, but better late than never!
A question that might come up: why not go further and provide source access in only one place? Maybe we could remove
My mental model is that
Pinging @elastic/es-search (Team:Search)
This clashes slightly with #65219 but I think the two can be reconciled. I particularly like the simplification to
My apologies, I had meant to submit this comment earlier but it didn't go through! Would you be okay reviewing this one first, as I think it will clarify the requirements (and it's actively causing difficulties with other PRs, as in #63572)?
I think it needs to be done as part of this change. Before, the fetch phase used its own
LGTM, thanks @jtibshirani
This PR simplifies how the document source is passed to each fetch subphase. A summary of the strategy:

* For each document, we try to eagerly load the source and store it on `HitContext`. Most subphases that access source, like source filtering and highlighting, use `HitContext`. For nested hits, we filter the parent source and also store this source on `HitContext`.
* Only for non-nested documents, we also store the loaded source on `QueryShardContext#lookup`. This allows subphases that access source through `SearchLookup` to use the pre-loaded source when possible. This is now a common occurrence, since runtime fields are supported in the 'fields' option and may soon be supported in highlighting.

There is no longer a special `SearchLookup` just for the fetch phase. This was not necessary and was mostly caused by a misunderstanding of how `QueryShardContext` should be used.

Addresses elastic#62511.
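For readers who want a concrete picture of the strategy summarized above, here is a minimal, self-contained sketch. It is not the Elasticsearch code: `FetchSourceSketch`, its nested `SourceLookup` and `HitContext` classes, `prepareHit`, and the `IntFunction` loader are all hypothetical stand-ins that only borrow names from the PR. The sketch also assumes that a nested hit's filtered source is simply the sub-object stored under the nested field, and that the shared lookup is cleared for nested hits so it falls back to lazy loading.

```java
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical, simplified stand-ins; these are NOT the real Elasticsearch classes.
final class FetchSourceSketch {

    /** Stand-in for the source lookup that subphases reach through the shard context. */
    static final class SourceLookup {
        private Map<String, Object> source; // null means "not loaded yet"

        void setSource(Map<String, Object> preloaded) {
            this.source = preloaded;
        }

        void clear() {
            this.source = null;
        }

        /** Returns the pre-loaded source if present, otherwise loads it lazily. */
        Map<String, Object> source(int docId, IntFunction<Map<String, Object>> loader) {
            if (source == null) {
                source = loader.apply(docId);
            }
            return source;
        }
    }

    /** Stand-in for the per-hit context handed to each fetch subphase. */
    static final class HitContext {
        final int docId;
        final Map<String, Object> source;

        HitContext(int docId, Map<String, Object> source) {
            this.docId = docId;
            this.source = source;
        }
    }

    /**
     * Eagerly loads the source for one hit and stores it on the hit context.
     * Pass nestedField == null for a top-level (non-nested) document.
     */
    @SuppressWarnings("unchecked")
    static HitContext prepareHit(int docId,
                                 String nestedField,
                                 IntFunction<Map<String, Object>> loader,
                                 SourceLookup shardLookup) {
        Map<String, Object> root = loader.apply(docId); // eager load, once per hit
        if (nestedField == null) {
            // Non-nested: share the loaded source with lookup-based subphases.
            shardLookup.setSource(root);
            return new HitContext(docId, root);
        }
        // Nested: do not pre-populate the shared lookup (assumption of this sketch).
        shardLookup.clear();
        Object nested = root.get(nestedField);
        Map<String, Object> filtered = nested instanceof Map
            ? (Map<String, Object>) nested
            : Map.<String, Object>of();
        return new HitContext(docId, filtered);
    }
}
```

In this sketch, a subphase that reads `HitContext.source` never triggers a load, and a subphase that goes through `SourceLookup.source(...)` only triggers one when the hit is nested.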