44 changes: 42 additions & 2 deletions doc/source/operations/control-plane-operation.rst
@@ -174,8 +174,30 @@ is advisable to migrate all of the instances to another machine. See
Ceph
----

The following guide provides a good overview:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/8/html/director_installation_and_usage/sect-rebooting-ceph
#. Check that the cluster is healthy, for example with ``ceph -s``. Where
   possible, resolve or isolate any issues before the shutdown, e.g. by
   marking unhealthy OSDs as 'out' in the cluster.

#. Stop all clients. This includes:

   * **All** OpenStack VMs (if their storage is RBD-backed).

   * CephFS mounts.

   * Ceph-backed OpenStack services such as Glance, Cinder, Manila, and
     RGW/S3/Swift.

#. Set the ``noout`` flag, so that the cluster does not attempt to redistribute
   data when OSDs go down. Use the following command on a MON node:

   .. code-block:: console

      sudo cephadm shell -- ceph osd set noout

#. Shut down all the nodes, leaving those that host MON services until last.

If Ceph services should not start automatically when the operating system
next boots, extra steps are required. These steps are not described here.
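
The health check and flag handling described in the steps above can be
sketched as the following commands, run on a MON node. This is a minimal
sketch rather than a definitive procedure: it assumes a cephadm-managed
cluster, as in the ``noout`` step, and the OSD ID ``12`` is a hypothetical
placeholder for an OSD that has been identified as unhealthy.

.. code-block:: console

   # Check overall cluster health and list any current problems
   sudo cephadm shell -- ceph -s
   sudo cephadm shell -- ceph health detail

   # Mark an unhealthy OSD as out (12 is a hypothetical OSD ID)
   sudo cephadm shell -- ceph osd out 12

   # After setting noout, confirm that the flag is listed
   sudo cephadm shell -- ceph osd dump | grep flags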

Shutting down the seed VM
-------------------------
@@ -201,6 +223,24 @@ following order:
* Shut down seed VM
* Shut down Ansible control host

Full startup
-------------

If the entire control plane is powered down, it is best to bring the nodes up
in the reverse order of shutdown:

* Power on Ansible control host
* Power on seed VM (and other service VMs)
* Power on Ceph nodes (if applicable)

  * Where possible, start the nodes running MON services first.
  * Make sure that all OSD services are back up and running. At this point
    it is safe to unset the ``noout`` cluster flag.

* Power on controllers
* Power on network nodes (if separate from controllers)
* Power on monitoring node (if separate from controllers)
* Power on compute nodes
* Power on virtual machines
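
Once the Ceph nodes are back up, checking the OSDs and removing the ``noout``
flag might look like the following, run on a MON node. This is a sketch that
assumes a cephadm-managed cluster, mirroring the command used to set the flag
during shutdown.

.. code-block:: console

   # Confirm that all OSDs are reported as up and in
   sudo cephadm shell -- ceph osd stat

   # Allow the cluster to redistribute data again
   sudo cephadm shell -- ceph osd unset noout

   # Verify overall cluster health
   sudo cephadm shell -- ceph -s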

Rebooting a node
----------------
