diff --git a/docs/reference/ilm/apis/slm-api.asciidoc b/docs/reference/ilm/apis/slm-api.asciidoc
index 7a2924bd4a21d..a27297593e9f5 100644
--- a/docs/reference/ilm/apis/slm-api.asciidoc
+++ b/docs/reference/ilm/apis/slm-api.asciidoc
@@ -16,6 +16,7 @@ well as a separate API for immediately invoking a snapshot based on a policy.
 Since SLM falls under the same category as ILM, it is stopped and started by
 using the <<start-stop-ilm,start and stop>> ILM APIs.
 
+[[slm-api-put]]
 === Put Snapshot Lifecycle Policy API
 
 Creates or updates a snapshot policy. If the policy already exists, the version
@@ -95,6 +96,7 @@ The top-level keys that the policy supports are described below:
 To update an existing policy, simply use the put snapshot lifecycle policy API
 with the same policy id as an existing policy.
 
+[[slm-api-get]]
 === Get Snapshot Lifecycle Policy API
 
 Once a policy is in place, you can retrieve one or more of the policies using
@@ -155,6 +157,7 @@ GET /_slm/policy
 // CONSOLE
 // TEST[continued]
 
+[[slm-api-execute]]
 === Execute Snapshot Lifecycle Policy API
 
 Sometimes it can be useful to immediately execute a snapshot based on policy,
@@ -325,6 +328,7 @@ Which now includes the successful snapshot information:
 
 It is a good idea to test policies using the execute API to ensure they work.
 
+[[slm-api-delete]]
 === Delete Snapshot Lifecycle Policy API
 
 A policy can be deleted by issuing a delete request with the policy id. Note
diff --git a/docs/reference/ilm/getting-started-slm.asciidoc b/docs/reference/ilm/getting-started-slm.asciidoc
new file mode 100644
index 0000000000000..d76164de56fc6
--- /dev/null
+++ b/docs/reference/ilm/getting-started-slm.asciidoc
@@ -0,0 +1,169 @@
+[role="xpack"]
+[testenv="basic"]
+[[getting-started-snapshot-lifecycle-management]]
+== Getting started with snapshot lifecycle management
+
+Let's get started with snapshot lifecycle management (SLM) by working through a
+hands-on scenario. The goal of this example is to automatically back up {es}
+indices using <<modules-snapshots,snapshots>> every day at a particular
+time.
+
+[float]
+[[slm-gs-create-policy]]
+=== Setting up a repository
+
+Before we can set up an SLM policy, we'll need to set up a
+<<snapshots-repositories,snapshot repository>> where the snapshots will be
+stored. Repositories can use {plugins}/repository.html[many different backends],
+including cloud storage providers. You'll probably want to use one of these in
+production, but for this example we'll use a shared file system repository:
+
+[source,js]
+-----------------------------------
+PUT /_snapshot/my_repository
+{
+  "type": "fs",
+  "settings": {
+    "location": "my_backup_location"
+  }
+}
+-----------------------------------
+// CONSOLE
+// TEST
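+
+If you'd like to confirm that every node can actually write to the new
+repository before SLM starts using it, you can optionally verify it first.
+This is a minimal check using the standard repository verification API; the
+walkthrough works without this step:
+
+[source,js]
+-----------------------------------
+POST /_snapshot/my_repository/_verify
+-----------------------------------
+// CONSOLE
+// TEST[continued]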
+
+[float]
+=== Setting up a policy
+
+Now that we have a repository in place, we can create a policy to automatically
+take snapshots. Policies are written in JSON and will define when to take
+snapshots, what the snapshots should be named, and which indices should be
+included, among other things. We'll use the
+<<slm-api-put,put snapshot lifecycle policy>> API to create the policy.
+
+[source,js]
+--------------------------------------------------
+PUT /_slm/policy/nightly-snapshots
+{
+  "schedule": "0 30 1 * * ?", <1>
+  "name": "<nightly-snap-{now/d}>", <2>
+  "repository": "my_repository", <3>
+  "config": { <4>
+    "indices": ["*"] <5>
+  }
+}
+--------------------------------------------------
+// CONSOLE
+// TEST[continued]
+<1> when the snapshot should be taken, using
+    {xpack-ref}/trigger-schedule.html#schedule-cron[Cron syntax], in this
+    case at 1:30AM each day
+<2> the name each snapshot should be given, using
+    <<date-math-index-names,date math>> to include the current date in the name
+    of the snapshot
+<3> the repository the snapshot should be stored in
+<4> the configuration to be used for the snapshot requests (see below)
+<5> which indices should be included in the snapshot, in this case, every index
+
+This policy will take a snapshot of every index each day at 1:30AM UTC.
+Snapshots are incremental, allowing frequent snapshots to be stored efficiently,
+so don't be afraid to configure a policy to take frequent snapshots.
+
+In addition to specifying the indices that should be included in the snapshot,
+the `config` field can be used to customize other aspects of the snapshot. You
+can use any option allowed in <<modules-snapshots,regular snapshot requests>>,
+so you can specify, for example, whether the snapshot should fail in special
+cases, such as if one of the specified indices cannot be found.
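+
+For example, here is a sketch of a stricter variant of the policy that pins
+down two of those options explicitly. This block is only an illustration, not
+part of the walkthrough: the policy id `nightly-snapshots-strict` is a
+hypothetical name, while `ignore_unavailable` and `include_global_state` are
+standard snapshot request options:
+
+[source,js]
+--------------------------------------------------
+PUT /_slm/policy/nightly-snapshots-strict
+{
+  "schedule": "0 30 1 * * ?",
+  "name": "<nightly-snap-{now/d}>",
+  "repository": "my_repository",
+  "config": {
+    "indices": ["*"],
+    "ignore_unavailable": false, <1>
+    "include_global_state": false <2>
+  }
+}
+--------------------------------------------------
+// CONSOLE
+// TEST[skip:illustration of config options only, not part of the walkthrough]
+<1> fail the snapshot if any specified index cannot be found
+<2> don't include the cluster's global state in the snapshot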
+
+[float]
+=== Making sure the policy works
+
+While snapshots taken by SLM policies can be viewed through the standard snapshot
+API, SLM also keeps track of policy successes and failures in ways that make it
+easier to confirm the policy is working. Once a policy has executed at least
+once, when you view the policy using the
+<<slm-api-get,get snapshot lifecycle policy API>>, some metadata will be
+returned indicating whether the snapshot was successfully initiated or not.
+
+Instead of waiting until 1:30AM for our policy to run, let's tell SLM to take a
+snapshot using the configuration from our policy right now.
+
+[source,js]
+--------------------------------------------------
+PUT /_slm/policy/nightly-snapshots/_execute
+--------------------------------------------------
+// CONSOLE
+// TEST[skip:we can't easily handle snapshots from docs tests]
+
+This request will kick off a snapshot for our policy right now, regardless of
+the schedule in the policy. This is useful for taking snapshots before making
+a configuration change, upgrading, or, for our purposes, making sure our policy
+is going to work successfully. The policy will continue to run on its configured
+schedule after this execution of the policy.
+
+[source,js]
+--------------------------------------------------
+GET /_slm/policy/nightly-snapshots?human
+--------------------------------------------------
+// CONSOLE
+// TEST[continued]
+
+This request will return a response that includes the policy, information about
+the last time the policy succeeded and failed, and the next time the policy
+will be executed.
+
+[source,js]
+--------------------------------------------------
+{
+  "nightly-snapshots" : {
+    "version": 1,
+    "modified_date": "2019-04-23T01:30:00.000Z",
+    "modified_date_millis": 1555983000000,
+    "policy" : {
+      "schedule": "0 30 1 * * ?",
+      "name": "<nightly-snap-{now/d}>",
+      "repository": "my_repository",
+      "config": {
+        "indices": ["*"]
+      }
+    },
+    "last_success": { <1>
+      "snapshot_name": "nightly-snap-2019.04.24-tmtnyjtrsxkhbrrdcgg18a", <2>
+      "time_string": "2019-04-24T16:43:49.316Z",
+      "time": 1556124229316
+    },
+    "last_failure": { <3>
+      "snapshot_name": "nightly-snap-2019.04.02-lohisb5ith2n8hxacaq3mw",
+      "time_string": "2019-04-02T01:30:00.000Z",
+      "time": 1554168600000,
+      "details": "{\"type\":\"index_not_found_exception\",\"reason\":\"no such index [important]\",\"resource.type\":\"index_or_alias\",\"resource.id\":\"important\",\"index_uuid\":\"_na_\",\"index\":\"important\",\"stack_trace\":\"[important] IndexNotFoundException[no such index [important]]\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:762)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:714)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:670)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:163)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:142)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:102)\\n\\tat org.elasticsearch.snapshots.SnapshotsService$1.execute(SnapshotsService.java:280)\\n\\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\\n\\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:687)\\n\\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:310)\\n\\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:210)\\n\\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:142)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\\n\\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\\n\\tat java.base/java.lang.Thread.run(Thread.java:834)\\n\"}"
+    },
+    "next_execution": "2019-04-24T01:30:00.000Z", <4>
+    "next_execution_millis": 1556069400000
+  }
+}
+--------------------------------------------------
+// TESTRESPONSE[skip:the presence of last_failure and last_success is asynchronous and will be present for users, but is untestable]
+<1> information about the last time the policy successfully initiated a snapshot
+<2> the name of the snapshot that was successfully initiated
+<3> information about the last time the policy failed to initiate a snapshot
+<4> the next time the policy will execute
+
+NOTE: This metadata only indicates whether the request to initiate the snapshot
+was made successfully or not. After the snapshot has been successfully started,
+it is possible for the snapshot to fail if, for example, the connection to a
+remote repository is lost while copying files.
+
+If you're following along, the returned SLM policy shouldn't have a
+`last_failure` field; it's included above only as an example. You should,
+however, see a `last_success` field and a snapshot name. If you do, you've
+successfully taken your first snapshot using SLM!
+
+While only the most recent success and failure are available through the Get
+Policy API, all policy executions are recorded to a history index, which may be
+queried by searching the index pattern `.slm-history*`.
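+
+For example, here is a minimal sketch of such a search. It assumes that each
+history document records its policy id in a `policy` field and a timestamp in
+an `@timestamp` field; you can confirm the exact field names by inspecting a
+document from the index first:
+
+[source,js]
+--------------------------------------------------
+GET /.slm-history*/_search
+{
+  "query": {
+    "term": { "policy": "nightly-snapshots" } <1>
+  },
+  "sort": [ { "@timestamp": { "order": "desc" } } ] <2>
+}
+--------------------------------------------------
+// CONSOLE
+// TEST[skip:the history index may not exist when the docs tests run]
+<1> only match executions of our `nightly-snapshots` policy
+<2> most recent executions first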
+
+That's it! We have our first SLM policy set up to periodically take snapshots
+so that our backups are always up to date. You can read more details in the
+<<snapshot-lifecycle-management-api,SLM API documentation>> and the
+<<modules-snapshots,snapshot documentation>>.
\ No newline at end of file
diff --git a/docs/reference/ilm/index.asciidoc b/docs/reference/ilm/index.asciidoc
index e9972a713dfb3..50d2e5f6dac22 100644
--- a/docs/reference/ilm/index.asciidoc
+++ b/docs/reference/ilm/index.asciidoc
@@ -55,6 +55,8 @@ separate APIs for managing snapshot lifecycles. Please see the
 <<modules-snapshots,snapshot>> documentation for information about
 configuring snapshots.
 
+See <<getting-started-snapshot-lifecycle-management>>.
+
 [IMPORTANT]
 ===========================
 {ilm} does not support mixed-version cluster usage. Although it
@@ -81,3 +83,5 @@ include::error-handling.asciidoc[]
 include::ilm-and-snapshots.asciidoc[]
 
 include::start-stop-ilm.asciidoc[]
+
+include::getting-started-slm.asciidoc[]
diff --git a/docs/reference/modules/snapshots.asciidoc b/docs/reference/modules/snapshots.asciidoc
index ec7916d5a3445..43d06e5e02adb 100644
--- a/docs/reference/modules/snapshots.asciidoc
+++ b/docs/reference/modules/snapshots.asciidoc
@@ -63,6 +63,7 @@ recommend testing the reindex from remote process with a subset of your data to
 understand the time requirements before proceeding.
 
 [float]
+[[snapshots-repositories]]
 === Repositories
 
 You must register a snapshot repository before you can perform snapshot and
@@ -322,6 +323,7 @@ POST /_snapshot/my_unverified_backup/_verify
 It returns a list of nodes where repository was successfully verified or an
 error message if verification process failed.
 
 [float]
+[[snapshots-take-snapshot]]
 === Snapshot
 
 A repository can contain multiple snapshots of the same cluster. Snapshots are
 identified by unique names within the