Retry point in time on other copy when possible #66713
Pinging @elastic/es-search (Team:Search)
// TODO: Remove this constructor
public ShardSearchContextId(String sessionId, long id) {
I will remove this constructor in a follow-up.
I like the approach.
I left one comment regarding the deletion of the PIT.
We should look up all active replicas when removing the contexts.
I am also concerned that we always keep only the latest failure for each shard.
Now that we add replicas to the context, we should make sure we keep the original exception when we retry on a replica. Otherwise, any exception could be hidden by the search-context-missing exception that the other replicas would send. Does that make sense?
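To illustrate the concern about failures being overwritten, here is a minimal sketch of one way to keep the original exception when a retry on another copy fails only because that copy has no search context. `ShardFailureTracker` and `ContextMissingException` are hypothetical names invented for this example. They are not the classes changed in this PR, and the sketch does not claim to match how the PR handles it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the failure a shard copy reports when it does not
// hold the requested search context.
class ContextMissingException extends RuntimeException {
    ContextMissingException(String message) {
        super(message);
    }
}

// Hypothetical per-shard failure tracker (an assumption for illustration only).
class ShardFailureTracker {
    private final Map<Integer, Exception> failures = new HashMap<>();

    // Record a failure for a shard. A context-missing failure never replaces a
    // more informative failure already recorded for that shard; in every other
    // case the latest failure wins.
    void onFailure(int shardId, Exception e) {
        failures.merge(shardId, e, (previous, latest) ->
            latest instanceof ContextMissingException && !(previous instanceof ContextMissingException)
                ? previous
                : latest);
    }

    // Returns the failure currently kept for the shard, or null if none.
    Exception failureFor(int shardId) {
        return failures.get(shardId);
    }
}
```

With this rule, a failure from the copy that originally held the context survives later context-missing responses from the other copies, while a real failure reported later still replaces an earlier context-missing one.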
server/src/main/java/org/elasticsearch/action/search/TransportSearchAction.java
@jimczi Thanks for reviewing.
I think we handle this in https://github.com/elastic/elasticsearch/pull/66713/files#diff-5198131d51dc75f4b8b82d7cc9a3960b0955bbd50c7b6573bdc5a227f02d12cfR213 and https://github.com/elastic/elasticsearch/pull/66713/files#diff-c547f7891956be1109085c486b76f18044af4b96551b46ed3aba5cf82c044158R103
I think it's implemented here:
Yep, sorry, I should have been more precise. If we delete the extra context after each search, we lose the protection that we set when the context is opened on the shard. Maybe that's not important for searchable snapshots, so the workaround that you implemented is OK, but I am not sure :(
Nice, sorry I missed it.
I've been thinking more about the options we have to take replicas into account, and I think your solution is simpler and less intrusive. We can revisit this in the future, but it sounds like enough for now, especially knowing that it's a solution for frozen indices and searchable snapshots only. So +1 to merge this solution, and sorry for the back and forth.
I did explore two options before choosing the current approach. I found the other option more compelling than the current one, as it truly replaces unavailable contexts with new ones. However, it's not easy to clean up those new contexts properly. I will merge this PR as is, and we can revisit this as you said. Thank you for the feedback.
Relates #61062