Make searchable snapshots cache persistent #66275

Merged (2 commits) on Dec 14, 2020

@tlrx tlrx commented Dec 14, 2020

The searchable snapshots cache implemented in 7.10 is
not persisted across node restarts, so a data node must
download files from the snapshot repository again after
it is restarted.

This commit introduces a new Lucene index that stores
information about cache files. This information is
periodically updated and committed to the index as part of
the cache synchronization task added in #64696. When the
data node starts, the Lucene index is used to load the cache
file information into memory; this information is then used
to repopulate the searchable snapshots cache with the cache
files that exist on disk.
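
To illustrate the commit-then-reload cycle described above, here is a minimal sketch. It is not the code from this PR: it swaps the Lucene index for a plain tab-separated file, and the class name `CacheIndexSketch`, the file name `cache-index.tsv`, and the name-to-length metadata are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the persistent cache index.
// The actual change uses a Lucene index; a flat file is used here to keep the sketch stdlib-only.
final class CacheIndexSketch {
    // Assumed file name, not taken from the PR.
    private static final String INDEX_FILE = "cache-index.tsv";

    // What a periodic synchronization task might do: write every known
    // cache file entry to a temp file, then replace the previous index with it.
    static void commit(Path dataPath, Map<String, Long> cacheFiles) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> entry : cacheFiles.entrySet()) {
            // Assumes file names contain no tab characters.
            lines.add(entry.getKey() + "\t" + entry.getValue());
        }
        Path tmp = dataPath.resolve(INDEX_FILE + ".tmp");
        Files.write(tmp, lines, StandardCharsets.UTF_8);
        Files.move(tmp, dataPath.resolve(INDEX_FILE), StandardCopyOption.REPLACE_EXISTING);
    }

    // What node startup might do: load the persisted entries into memory and
    // keep only those whose cache file still exists on disk.
    static Map<String, Long> repopulate(Path dataPath) throws IOException {
        Map<String, Long> cache = new TreeMap<>();
        Path index = dataPath.resolve(INDEX_FILE);
        if (!Files.exists(index)) {
            return cache; // nothing persisted yet, start with an empty cache
        }
        for (String line : Files.readAllLines(index, StandardCharsets.UTF_8)) {
            String[] parts = line.split("\t", 2);
            if (parts.length == 2 && Files.exists(dataPath.resolve(parts[0]))) {
                cache.put(parts[0], Long.parseLong(parts[1]));
            }
        }
        return cache;
    }
}
```

In this sketch an entry whose file has disappeared from disk is simply skipped on reload, so the in-memory cache only ever refers to files that are actually present.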

Since data nodes can have one or more data paths, this
change introduces a Lucene index per data path. Information
about cache files is updated in the Lucene index located
on the same data path as the cache files.
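
As a rough illustration of the per-data-path rule, the sketch below picks the data path that contains a given cache file, which is where that file's index entry would be written. The class and method names are hypothetical, and the sketch assumes data paths are not nested inside one another.

```java
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of the PR.
final class DataPathResolverSketch {
    // Returns the data path that contains the given cache file, if any.
    // Path.startsWith compares whole path components, so /data/ab is not
    // treated as being under /data/a.
    static Optional<Path> owningDataPath(List<Path> dataPaths, Path cacheFile) {
        Path file = cacheFile.toAbsolutePath().normalize();
        return dataPaths.stream()
            .map(p -> p.toAbsolutePath().normalize())
            .filter(p -> file.startsWith(p))
            .findFirst();
    }
}
```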

Backport of #65725 for 7.11

@tlrx added the >enhancement, :Distributed/Snapshot/Restore, backport, and v7.11.0 labels on Dec 14, 2020
@elasticmachine added the Team:Distributed label on Dec 14, 2020
@elasticmachine

Pinging @elastic/es-distributed (Team:Distributed)

@tlrx tlrx merged commit d2bd9db into elastic:7.x Dec 14, 2020
@tlrx tlrx deleted the add-persistent-cache-7.x branch December 14, 2020 18:12