RS: Recover cluster - removed incorrect rladmin cluster join syntax, removed expands, added reference links (#3206)

* DOC-2950 Removed incorrect rladmin cluster join syntax, removed expands, added reference links

* Typo fix
rrelledge authored Mar 22, 2024
1 parent 8c2171c commit cd38413
Showing 1 changed file with 18 additions and 61 deletions.
content/rs/clusters/cluster-recovery.md (18 additions, 61 deletions)
@@ -16,23 +16,22 @@ When a Redis Enterprise Software cluster fails,
you must use the cluster configuration file and database data to recover the cluster.

{{< note >}}
For cluster recovery in a Kubernetes deployment, go to: [Recover a Redis Enterprise cluster on Kubernetes]({{< relref "/kubernetes/re-clusters/cluster-recovery.md" >}}).
For cluster recovery in a Kubernetes deployment, see [Recover a Redis Enterprise cluster on Kubernetes]({{< relref "/kubernetes/re-clusters/cluster-recovery" >}}).
{{< /note >}}

Cluster failure can be caused by:

- A hardware or software failure that causes the cluster to be unresponsive to client requests or administrative actions.
- More than half of the cluster nodes lose connection with the cluster, resulting in quorum loss.

To recover a cluster and re-create it as it was before the failure
you must restore the cluster configuration (ccs-redis.rdb) to the cluster nodes.
To restore the data that was in the databases to databases in the new cluster
you must restore the database persistence files (backup, AOF, or snapshot files) to the databases.
To recover a cluster and re-create it as it was before the failure,
you must restore the cluster configuration `ccs-redis.rdb` to the cluster nodes.
To recover databases in the new cluster, you must restore the databases from persistence files such as backup files, append-only files (AOF), or RDB snapshots.
These files are stored in the [persistent storage location]({{< relref "/rs/installing-upgrading/install/plan-deployment/persistent-ephemeral-storage" >}}).
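
For orientation, here is a minimal sketch of what those recovery files might look like on the mounted persistent storage. The mount point `/mnt/persistent` and the directory layout are assumptions; the exact paths depend on how the original cluster was deployed.

```sh
# Assumed mount point for the original cluster's persistent storage.
PERSIST=/mnt/persistent

# Cluster configuration file used to recover the cluster itself.
ls -l "$PERSIST/css/ccs-redis.rdb"

# Database persistence files (backup, AOF, or snapshot files) used to
# recover the data in the databases.
ls -lR "$PERSIST"
```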

The cluster recovery process includes:

1. Install RS on the nodes of the new cluster.
1. Install Redis Enterprise Software on the nodes of the new cluster.
1. Mount the persistent storage with the recovery files from the original cluster to the nodes of the new cluster.
1. Recover the cluster configuration on the first node in the new cluster.
1. Join the remaining nodes to the new cluster.
@@ -45,23 +44,21 @@ The cluster recovery process includes:
make sure there are no Redis processes running on any nodes in the new cluster.
- We recommend that you use clean persistent storage drives for the new cluster.
If you use the original storage drives,
make sure that you backup the files on the original storage drives to a safe location.
make sure you back up the files on the original storage drives to a safe location.
- Identify the cluster configuration file that you want to use as the configuration for the recovered cluster.
The cluster configuration file is `/css/ccs-redis.rdb` on the persistent storage for each node.
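
If each original node's storage is mounted separately, one way to compare the per-node copies before choosing one is sketched below; the `/mnt/recovery/node<N>` mount points are assumptions, not part of the documented procedure.

```sh
# List every copy of the cluster configuration file with its size and
# modification time so you can decide which copy to recover from.
ls -l /mnt/recovery/node*/css/ccs-redis.rdb
```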

## Recovering the cluster
## Recover the cluster

1. (Optional) If you want to recover the cluster to the original cluster nodes, uninstall RS from the nodes.
1. (Optional) If you want to recover the cluster to the original cluster nodes, uninstall Redis Enterprise Software from the nodes.

1. Install [RS]({{< relref "/rs/installing-upgrading" >}}) on the new cluster nodes.

Do not configure the cluster nodes (`rladmin cluster create` in the CLI or **Setup** in the admin console).
1. [Install Redis Enterprise Software]({{< relref "/rs/installing-upgrading/install/install-on-linux" >}}) on the new cluster nodes.

The new servers must have the same basic hardware and software configuration as the original servers, including:

- The same number of nodes
- At least the same amount of memory
- The same RS version
- The same Redis Enterprise Software version
- The same installation user and paths
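
The following quick checks are not part of the documented procedure, but they can help confirm a new node meets these requirements before installing; all commands are standard Linux tools.

```sh
nproc     # CPU core count should match the original node
free -h   # available RAM should be at least that of the original node
df -h     # storage layout and capacity should match the original servers
```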

{{< note >}}
@@ -78,31 +75,12 @@ of the configuration and persistence files on each of the nodes.

If you use local persistent storage, place all of the recovery files on each of the cluster nodes.
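
As an illustrative sketch only (host names, destination path, and the use of `rsync` are assumptions rather than part of the documented procedure), copying the recovery files to every node might look like this:

```sh
# Copy the recovery files from a backup location to the local persistent
# storage path on each new cluster node.
for node in node1 node2 node3; do
  rsync -a /backups/original-cluster/ "$node:/var/opt/redislabs/persist/"
done
```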

1. To recover the cluster configuration from the original cluster to the first node in the new cluster,
from the [`rladmin`]({{<relref "/rs/references/cli-utilities/rladmin">}}) command-line interface (CLI):
1. To recover the original cluster configuration, run [`rladmin cluster recover`]({{<relref "/rs/references/cli-utilities/rladmin/cluster/recover">}}) on the first node in the new cluster:

```sh
rladmin cluster recover filename [ <persistent_path> | <ephemeral_path> ]/<filename> node_uid <node_uid> rack_id <rack_id>
```

{{% expand "Command syntax" %}}
`<filename>` - The full path of the old cluster configuration file in the persistent storage.
The cluster configuration file is `/css/ccs-redis.rdb`.

`<node_uid>` - The id of the node, in this case `1`.

`<persistent_path>` (optional) - The location of the [persistent storage ]({{< relref "/rs/installing-upgrading/install/plan-deployment/persistent-ephemeral-storage" >}})
in the new node.

`<ephemeral_path>` (optional) - The location of the [ephemeral storage]({{< relref "/rs/installing-upgrading/install/plan-deployment/persistent-ephemeral-storage" >}})
in the new node.

`<rack_id>` (optional) - If [rack-zone awareness]({{< relref "/rs/clusters/configure/rack-zone-awareness.md" >}})
was enabled in the cluster,
you can use this parameter to override the rack ID value that was set for the node with ID 1 with a new rack ID.
Otherwise, the node gets the same rack ID as the original node.
{{% /expand %}}

For example:

```sh
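# The concrete example is collapsed in this diff view. A minimal sketch,
# assuming the original persistent storage is mounted at /mnt/persistent
# and this node takes over node ID 1:
rladmin cluster recover filename /mnt/persistent/css/ccs-redis.rdb node_uid 1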
```

@@ -112,44 +90,23 @@ Otherwise, the node gets the same rack ID as the original node.
When the recovery command succeeds,
this node is configured as the node from the old cluster that has ID 1.

1. To join the remaining servers to the new cluster, run `rladmin cluster join` from each new node:
1. To join the remaining servers to the new cluster, run [`rladmin cluster join`]({{<relref "/rs/references/cli-utilities/rladmin/cluster/join">}}) from each new node:

```sh
rladmin cluster join [ nodes <cluster_member_ip_address> | name <cluster_FQDN> ] username <username> password <password> replace_node <node_id>
rladmin cluster join nodes <cluster_member_ip_address> username <username> password <password> replace_node <node_id>
```

{{% expand "Command syntax" %}}
`nodes` - The IP address of a node in the cluster that this node is joining.

`name` - The [FQDN name]({{< relref "/rs/networking/cluster-dns" >}})
of the cluster this node is joining.

`username` - The email address of the cluster administrator.

`password` - The password of the cluster administrator.

`replace_node` - The ID of the node that this node replaces from the old cluster.

`persistent_path` (optional) - The location of the [persistent storage]({{< relref "/rs/installing-upgrading/install/plan-deployment/persistent-ephemeral-storage" >}})
in the new node.

`ephemeral_path` (optional) - The location of the [ephemeral storage]({{< relref "/rs/installing-upgrading/install/plan-deployment/persistent-ephemeral-storage" >}})
in the new node.

`rack_id` (optional) - If [rack-zone awareness]({{< relref "/rs/clusters/configure/rack-zone-awareness.md" >}}) was enabled in the cluster,
use this parameter to set the rack ID to be the same as the rack ID
of the old node. You can also change the value of the rack ID by
providing a different value and using the `override_rack_id` flag.
{{% /expand %}}

For example:

```sh
rladmin cluster join nodes 10.142.0.4 username admin@example.com password mysecret replace_node 2
```
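
Each remaining node is joined the same way with its own `replace_node` value. For instance, a hypothetical third node replacing original node 3 would run the following (IP address and credentials repeated from the example above):

```sh
rladmin cluster join nodes 10.142.0.4 username admin@example.com password mysecret replace_node 3
```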

You can run the `rladmin status` command to verify that the recovered nodes are now active,
and that the databases are pending recovery.
1. Run [`rladmin status`]({{<relref "/rs/references/cli-utilities/rladmin/status">}}) to verify the recovered nodes are now active and the databases are pending recovery:

```sh
rladmin status
```
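
To focus on database state only, a sketch using the databases view of the same command; databases whose data has not yet been restored are expected to appear in a recovery-pending state.

```sh
rladmin status databases
```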

{{< note >}}
Make sure that you update your [DNS records]({{< relref "/rs/networking/cluster-dns" >}})
