Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions docs/reference/ilm/apis/slm-api.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ well as a separate API for immediately invoking a snapshot based on a policy.
Since SLM falls under the same category as ILM, it is stopped and started by
using the <<start-stop-ilm,start and stop>> ILM APIs.

[[slm-api-put]]
=== Put Snapshot Lifecycle Policy API

Creates or updates a snapshot policy. If the policy already exists, the version
Expand Down Expand Up @@ -95,6 +96,7 @@ The top-level keys that the policy supports are described below:
To update an existing policy, simply use the put snapshot lifecycle policy API
with the same policy id as an existing policy.

[[slm-api-get]]
=== Get Snapshot Lifecycle Policy API

Once a policy is in place, you can retrieve one or more of the policies using
Expand Down Expand Up @@ -155,6 +157,7 @@ GET /_slm/policy
// CONSOLE
// TEST[continued]

[[slm-api-execute]]
=== Execute Snapshot Lifecycle Policy API

Sometimes it can be useful to immediately execute a snapshot based on policy,
Expand Down Expand Up @@ -325,6 +328,7 @@ Which now includes the successful snapshot information:

It is a good idea to test policies using the execute API to ensure they work.

[[slm-api-delete]]
=== Delete Snapshot Lifecycle Policy API

A policy can be deleted by issuing a delete request with the policy id. Note
Expand Down
169 changes: 169 additions & 0 deletions docs/reference/ilm/getting-started-slm.asciidoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,169 @@
[role="xpack"]
[testenv="basic"]
[[getting-started-snapshot-lifecycle-management]]
== Getting started with snapshot lifecycle management

Let's get started with snapshot lifecycle management (SLM) by working through a
hands-on scenario. The goal of this example is to automatically back up {es}
indices using the <<modules-snapshots,snapshots>> every day at a particular
time.

[float]
[[slm-gs-create-policy]]
=== Setting up a repository

Before we can set up an SLM policy, we'll need to set up a
<<snapshots-repositories,snapshot repository>> where the snapshots will be
stored. Repositories can use {plugins}/repository.html[many different backends],
including cloud storage providers. You'll probably want to use one of these in
production, but for this example we'll use a shared file system repository:

[source,js]
-----------------------------------
PUT /_snapshot/my_repository
{
"type": "fs",
"settings": {
"location": "my_backup_location"
}
}
-----------------------------------
// CONSOLE
// TEST

[float]
=== Setting up a policy

Now that we have a repository in place, we can create a policy to automatically
take snapshots. Policies are written in JSON and will define when to take
snapshots, what the snapshots should be named, and which indices should be
included, among other things. We'll use the <<slm-api-put,Put Policy>> API
to create the policy.

[source,js]
--------------------------------------------------
PUT /_slm/policy/nightly-snapshots
{
"schedule": "0 30 1 * * ?", <1>
"name": "<nightly-snap-{now/d}>", <2>
"repository": "my_repository", <3>
"config": { <4>
"indices": ["*"] <5>
}
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
<1> when the snapshot should be taken, using
{xpack-ref}/trigger-schedule.html#schedule-cron[Cron syntax], in this
case at 1:30AM each day
<2> whe name each snapshot should be given, using
<<date-math-index-names,date math>> to include the current date in the name
of the snapshot
<3> the repository the snapshot should be stored in
<4> the configuration to be used for the snapshot requests (see below)
<5> which indices should be included in the snapshot, in this case, every index

This policy will take a snapshot of every index each day at 1:30AM UTC.
Snapshots are incremental, allowing frequent snapshots to be stored efficiently,
so don't be afraid to configure a policy to take frequent snapshots.

In addition to specifying the indices that should be included in the snapshot,
the `config` field can be used to customize other aspects of the snapshot. You
can use any option allowed in <<snapshots-take-snapshot,a regular snapshot
request>>, so you can specify, for example, whether the snapshot should fail in
special cases, such as if one of the specified indices cannot be found.

[float]
=== Making sure the policy works

While snapshots taken by SLM policies can be viewed through the standard snapshot
API, SLM also keeps track of policy successes and failures in ways that are a bit
easier to use to make sure the policy is working. Once a policy has executed at
least once, when you view the policy using the <<slm-api-get,Get Policy API>>,
some metadata will be returned indicating whether the snapshot was sucessfully
initiated or not.

Instead of waiting for our policy to run, let's tell SLM to take a snapshot
as using the configuration from our policy right now instead of waiting for
1:30AM.

[source,js]
--------------------------------------------------
PUT /_slm/policy/nightly-snapshots/_execute
--------------------------------------------------
// CONSOLE
// TEST[skip:we can't easily handle snapshots from docs tests]

This request will kick off a snapshot for our policy right now, regardless of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps include what the output of the execute API is here? Up to you.

the schedule in the policy. This is useful for taking snapshots before making
a configuration change, upgrading, or for our purposes, making sure our policy
is going to work successfully. The policy will continue to run on its configured
schedule after this execution of the policy.

[source,js]
--------------------------------------------------
GET /_slm/policy/nightly-snapshots?human
--------------------------------------------------
// CONSOLE
// TEST[continued]

This request will return a response that includes the policy, as well as
information about the last time the policy succeeded and failed, as well as the
next time the policy will be executed.

[source,js]
--------------------------------------------------
{
"nightly-snapshots" : {
"version": 1,
"modified_date": "2019-04-23T01:30:00.000Z",
"modified_date_millis": 1556048137314,
"policy" : {
"schedule": "0 30 1 * * ?",
"name": "<nightly-snap-{now/d}>",
"repository": "my_repository",
"config": {
"indices": ["*"],
}
},
"last_success": { <1>
"snapshot_name": "nightly-snap-2019.04.24-tmtnyjtrsxkhbrrdcgg18a", <2>
"time_string": "2019-04-24T16:43:49.316Z",
"time": 1556124229316
} ,
"last_failure": { <3>
"snapshot_name": "nightly-snap-2019.04.02-lohisb5ith2n8hxacaq3mw",
"time_string": "2019-04-02T01:30:00.000Z",
"time": 1556042030000,
"details": "{\"type\":\"index_not_found_exception\",\"reason\":\"no such index [important]\",\"resource.type\":\"index_or_alias\",\"resource.id\":\"important\",\"index_uuid\":\"_na_\",\"index\":\"important\",\"stack_trace\":\"[important] IndexNotFoundException[no such index [important]]\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:762)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:714)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:670)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:163)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:142)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:102)\\n\\tat org.elasticsearch.snapshots.SnapshotsService$1.execute(SnapshotsService.java:280)\\n\\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\\n\\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:687)\\n\\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:310)\\n\\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:210)\\n\\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:142)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\\n\\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\\n\\tat java.base/java.lang.Thread.run(Thread.java:834)\\n\"}"
} ,
"next_execution": "2019-04-24T01:30:00.000Z", <4>
"next_execution_millis": 1556048160000
}
}
--------------------------------------------------
// TESTRESPONSE[skip:the presence of last_failure and last_success is asynchronous and will be present for users, but is untestable]
<1> information about the last time the policy successfully initated a snapshot
<2> the name of the snapshot that was successfully initiated
<3> information about the last time the policy failed to initiate a snapshot
<4> the is the next time the policy will execute

NOTE: This metadata only indicates whether the request to initiate the snapshot was
made successfully or not - after the snapshot has been successfully started, it
is possible for the snapshot to fail if, for example, the connection to a remote
repository is lost while copying files.

If you're following along, the returned SLM policy shouldn't have a `last_failure`
field - it's included above only as an example. You should, however, see a
`last_success` field and a snapshot name. If you do, you've successfully taken
your first snapshot using SLM!

While only the most recent sucess and failure are available through the Get Policy
API, all policy executions are recorded to a history index, which may be queried
by searching the index pattern `.slm-history*`.

That's it! We have our first SLM policy set up to periodically take snapshots
so that our backups are always up to date. You can read more details in the
<<snapshot-lifecycle-management-api,SLM API documentation>> and the
<<modules-snapshots,general snapshot documentation.>>
4 changes: 4 additions & 0 deletions docs/reference/ilm/index.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,8 @@ separate APIs for managing snapshot lifecycles. Please see the
<<snapshot-lifecycle-management-api,Snapshot Lifecycle Management>>
documentation for information about configuring snapshots.

See <<getting-started-snapshot-lifecycle-management,getting started with SLM>>.

[IMPORTANT]
===========================
{ilm} does not support mixed-version cluster usage. Although it
Expand All @@ -81,3 +83,5 @@ include::error-handling.asciidoc[]
include::ilm-and-snapshots.asciidoc[]

include::start-stop-ilm.asciidoc[]

include::getting-started-slm.asciidoc[]
2 changes: 2 additions & 0 deletions docs/reference/modules/snapshots.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,7 @@ recommend testing the reindex from remote process with a subset of your data to
understand the time requirements before proceeding.

[float]
[[snapshots-repositories]]
=== Repositories

You must register a snapshot repository before you can perform snapshot and
Expand Down Expand Up @@ -322,6 +323,7 @@ POST /_snapshot/my_unverified_backup/_verify
It returns a list of nodes where repository was successfully verified or an error message if verification process failed.

[float]
[[snapshots-take-snapshot]]
=== Snapshot

A repository can contain multiple snapshots of the same cluster. Snapshots are identified by unique names within the
Expand Down