
Add partial searchable snapshot support for a frozen tier #68509

Merged: ywelsch merged 34 commits into master from frozen-proto on Feb 5, 2021
Conversation

@ywelsch (Contributor) commented on Feb 4, 2021

A frozen tier is backed by an external object store (like S3) and caches only a small portion of the data on local disks. This allows users to substantially reduce hardware costs for infrequently accessed data. For the frozen tier we only pull in the parts of the files that are actually needed to run a given search, and we don't require the node to have enough disk space to host all the files. We therefore have a cache that manages which file parts are available locally and which are not. This node-level shared cache is bounded in size (typically sized in relation to the disk size) and evicts items based on an LFU policy, as we expect some parts of the Lucene files to be used more frequently than others. Evictions happen at the granularity of regions of a file, so a full file never needs to be evicted at once. The on-disk representation chosen for the cold tier is not a good fit here, as it does not allow evicting parts of a file. Instead, we use fixed-size pre-allocated files and implement our own memory-management logic to map regions of the shard's original Lucene files onto regions of these node-level shared files, which represent the on-disk cache.
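
To make the mechanism above concrete, here is a minimal sketch in Java of a bounded region cache with LFU eviction. This is an illustration only, not the implementation from this PR; the names `SharedRegionCache`, `RegionKey`, and `acquireSlot` are invented, and a real implementation would also download the region's bytes from the object store into the slot before publishing it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: maps fixed-size regions of Lucene files onto slots of a
// pre-allocated shared cache file, evicting the least-frequently-used region
// (never a whole file) when no free slot remains.
public class SharedRegionCache {
    // Identifies one fixed-size region of one original Lucene file.
    record RegionKey(String fileName, long regionIndex) {}

    private static final class Entry {
        final int slot;     // index of the slot in the shared on-disk file
        int frequency = 1;  // LFU counter, bumped on every access
        Entry(int slot) { this.slot = slot; }
    }

    private final Map<RegionKey, Entry> entries = new HashMap<>();
    private final Deque<Integer> freeSlots = new ArrayDeque<>();

    public SharedRegionCache(int slotCount) {
        for (int i = 0; i < slotCount; i++) {
            freeSlots.push(i); // all slots of the pre-allocated file start out free
        }
    }

    // Returns the shared-file slot holding the region, evicting the
    // least-frequently-used region on a miss if no slot is free.
    public synchronized int acquireSlot(RegionKey key) {
        Entry entry = entries.get(key);
        if (entry != null) {
            entry.frequency++; // cache hit
            return entry.slot;
        }
        Integer slot = freeSlots.poll();
        if (slot == null) {
            slot = evictLeastFrequentlyUsed();
        }
        // A real implementation would fetch the region from the object store
        // and write it into this slot before publishing the entry.
        entries.put(key, new Entry(slot));
        return slot;
    }

    private int evictLeastFrequentlyUsed() {
        RegionKey victim = null;
        int minFrequency = Integer.MAX_VALUE;
        for (Map.Entry<RegionKey, Entry> e : entries.entrySet()) {
            if (e.getValue().frequency < minFrequency) {
                minFrequency = e.getValue().frequency;
                victim = e.getKey();
            }
        }
        if (victim == null) {
            throw new IllegalStateException("cache configured with zero slots");
        }
        return entries.remove(victim).slot; // frees exactly one file region
    }
}
```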

This PR adds the core functionality to searchable snapshots to power such a frozen tier:

  • It adds the node-level shared cache that evicts file regions based on an LFU policy.
  • It adds the machinery to dynamically download file regions into this cache and serve their contents when searches execute.
  • It extends the mount API with a new parameter, storage, which selects the kind of local storage used to accelerate searches of the mounted index (see the example request after this list). If set to full_copy (the default, used for the cold tier), each node holding a shard of the searchable snapshot index makes a full copy of the shard to its local storage. If set to shared_cache, the shard uses the newly introduced shared cache, holding only a partial copy of the index on disk (used for the frozen tier).
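
As an illustration, mounting an index with the new parameter might look like the request below (the repository, snapshot, and index names are placeholders):

```json
POST /_snapshot/my_repository/my_snapshot/_mount?storage=shared_cache
{
  "index": "my_index"
}
```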

Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>
Co-authored-by: Armin Braun <me@obrown.io>
Co-authored-by: David Turner <david.turner@elastic.co>

ywelsch and others added 29 commits January 25, 2021 11:08
This adds all the necessary infrastructure to use the reusable, single-file cache in practice:

* Create cache file in a data directory instead of a temp directory
* Fully pre-allocate it (the existing solution would, at least on Linux, still perform a sparse allocation)
* Manage the file channel resource by ref counting (see the sketch after this list)
* Add a minimal abstraction in place of exposing `FileChannel`, to allow adjusting the concrete paging approach under the hood in a follow-up
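
As a rough sketch of the ref-counting idea above, using only standard JDK APIs (the class name `RefCountedChannel` is invented here, not the abstraction added in this commit):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: keeps the shared cache file's channel open for as long
// as at least one user holds a reference, closing it exactly once at the end.
public final class RefCountedChannel {
    private final FileChannel channel;
    private final AtomicInteger refCount = new AtomicInteger(1); // creator holds one reference

    public RefCountedChannel(Path cacheFile) throws IOException {
        this.channel = FileChannel.open(cacheFile,
            StandardOpenOption.READ, StandardOpenOption.WRITE, StandardOpenOption.CREATE);
    }

    // Take a reference before using the channel; fails if it was already closed.
    public FileChannel acquire() {
        int current;
        do {
            current = refCount.get();
            if (current == 0) {
                throw new IllegalStateException("channel already closed");
            }
        } while (refCount.compareAndSet(current, current + 1) == false);
        return channel;
    }

    // Release a reference; the channel closes when the last reference is released.
    public void release() {
        if (refCount.decrementAndGet() == 0) {
            try {
                channel.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```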
This commit introduces a new flag, `?partial_local_copy`, indicating
that the local copy of a searchable snapshot should be partial rather
than complete, enabling the frozen tier functionality (see the example
request below).
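
Based on this description, a mount request using the prototype flag might have looked like the following (a hypothetical example; the flag was superseded by the `storage` parameter later in this PR):

```json
POST /_snapshot/my_repository/my_snapshot/_mount?partial_local_copy=true
{
  "index": "my_index"
}
```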
@DaveCTurner (Contributor) left a comment:
LGTM

My overnight test run encountered no failures at all.

@original-brownbear (Member) left a comment:
LGTM

@tlrx (Member) left a comment:
LGTM. I only managed to complete ~100 runs, all successful so far. I wonder if we can enable partial caching in some more tests; I'll look into it, but this is not a reason to block this PR.

@ywelsch merged commit 50f4a0b into master on Feb 5, 2021
@ywelsch deleted the frozen-proto branch on February 5, 2021 at 08:15
ywelsch added a commit that referenced this pull request Feb 5, 2021
ywelsch added a commit that referenced this pull request Feb 5, 2021
dakrone added a commit to dakrone/elasticsearch that referenced this pull request Feb 8, 2021
This commit adds support for the recently introduced partial searchable snapshot (elastic#68509) to ILM.

Searchable snapshot ILM actions may now be specified with a `storage` option, set to either
`full_copy` or `shared_cache` (mirroring the mount API), to mount either a full or partial
searchable snapshot:

```json
PUT _ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "cold": {
        "actions": {
          "searchable_snapshot" : {
            "snapshot_repository" : "backing_repo",
            "storage": "shared_cache"
          }
        }
      }
    }
  }
}
```

Internally, if more than one searchable snapshot action is specified (for example, a full
searchable snapshot in the "cold" phase and a partial searchable snapshot in the "frozen" phase),
ILM will reuse the existing snapshot when doing the second mount, since a second snapshot is not
required.

Currently this is only allowed for actions that use the same repository; multiple
`searchable_snapshot` actions for the same index that use different repositories are not allowed
(the ERROR state is entered). We plan to allow this in subsequent work.

If the `storage` option is not specified in the `searchable_snapshot` action, the mount type
defaults to "shared_cache" in the frozen phase and "full_copy" in all other phases.
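
For example, a policy combining both storage types, where the frozen phase reuses the snapshot taken in the cold phase because both actions use the same repository, might look like this (the repository name and ages are placeholders, and the explicit `storage` values match the per-phase defaults):

```json
PUT _ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "cold": {
        "min_age": "30d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "backing_repo",
            "storage": "full_copy"
          }
        }
      },
      "frozen": {
        "min_age": "90d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "backing_repo",
            "storage": "shared_cache"
          }
        }
      }
    }
  }
}
```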

Relates to elastic#68605
elasticmachine pushed a commit to costin/elasticsearch that referenced this pull request Feb 9, 2021
dakrone added a commit to dakrone/elasticsearch that referenced this pull request Feb 9, 2021
dakrone added a commit that referenced this pull request Feb 9, 2021
…68762)
ywelsch added a commit that referenced this pull request Mar 16, 2021
Reenables BWC for the searchable snapshot usage stats test.

Relates #68509
ywelsch added a commit to ywelsch/elasticsearch that referenced this pull request Mar 16, 2021
ywelsch added a commit that referenced this pull request Mar 16, 2021
ywelsch added a commit that referenced this pull request Mar 16, 2021
Labels
das awesome, :Distributed Coordination/Snapshot/Restore, >feature, release highlight, Team:Distributed (Obsolete), v7.12.0, v8.0.0-alpha1