3.0 config: update replication tutorials #3862

Merged
andreyaksenov merged 6 commits into 3.0 from 3.0-config-replication-tutorials on Dec 7, 2023

Contributor

@andreyaksenov andreyaksenov commented Nov 17, 2023

Created 3 new tutorials for each failover mode:

The Configuring synchronous replication section is removed because it doesn't demonstrate best practices. Information about enabling synchronous replication is added to the main topic and the API docs:

Managing leader elections is moved to Concepts as it doesn't fit into the Tutorials section:

Added information about a new box.info.election.leader field:

Mentioned new tutorials in configuration reference for the replication.failover option:

@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-tutorials branch 9 times, most recently from e15a38f to 2b08f1d Compare December 1, 2023 08:50
@andreyaksenov andreyaksenov marked this pull request as ready for review December 5, 2023 06:53
@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-tutorials branch 2 times, most recently from 2c8e3b6 to 8aedebc Compare December 5, 2023 07:25
Member

@Totktonada Totktonada left a comment

I saw several drafts of this patchset and it was generally OK for me. I'm going to approve without looking over one more time.

I would only note that it is easy to become entangled with all these replication reconfiguration steps. I would illustrate them with some pictures, if possible (of course, it is not a blocker for this pull request).

Contributor

@p7nov p7nov left a comment

I've checked:

They're fine by me, just some minor improvement thoughts.

replication source(s), and
* :ref:`read_only <cfg_basic-read_only>` which is ``true`` for a
replica and ``false`` for a master.
Prerequisites
Contributor

I feel a lack of some general description of what we'll do. Something like "We'll start a cluster with a manual failover, check how the replication works, and switch master manually."

Contributor Author

Totally forgot about intros, added.

Reloading configuration
~~~~~~~~~~~~~~~~~~~~~~~

After adding ``instance003`` to the configuration and starting it, configurations on all instances should be reloaded to allow ``instance001`` and ``instance002`` to get data from the new instance in case it becomes a master:
Contributor

This is exactly the case where passive voice confuses readers :)
I couldn't understand by whom the configuration should be reloaded until I read step 2 below.

Contributor Author

Thanks, will fix :)

Removing an instance from the configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Remove ``instance003`` from the ``instances.yml`` file:
Contributor

Minor: it would be natural to remove instance001, since we've already taken leadership from it.


.. _replication-automated-failover-tt-env:

Prerequisites
Contributor

Same about the description.

...

4. Execute ``box.info.replication`` to check a replica set status.
Make sure that ``upstream.status`` and ``downstream.status`` are ``follow`` for ``instance002`` and ``instance003``.
Contributor

Suggested change
- Make sure that ``upstream.status`` and ``downstream.status`` are ``follow`` for ``instance002`` and ``instance003``.
+ Make sure that ``upstream.status`` and ``downstream.status`` are ``follow`` for ``instance001`` and ``instance003``.

Contributor Author

@andreyaksenov andreyaksenov Dec 5, 2023

Make sure that upstream.status and downstream.status are follow for instance002 and instance003.

Looks like the old description is correct, so I restored it:

image
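
For reference, the check discussed in this thread can be sketched in Python. This is only an illustration: the dictionary below is a hypothetical, trimmed-down stand-in for ``box.info.replication`` output as seen from ``instance001``, and the helper name ``peers_not_following`` is made up for this example. Real output carries more fields.

```python
# Hypothetical, trimmed-down snapshot of box.info.replication taken on instance001.
# The layout (name, upstream.status, downstream.status) is an assumption for
# illustration only; real output contains more fields.
replication_info = {
    1: {"name": "instance001"},  # the local instance has no upstream/downstream
    2: {
        "name": "instance002",
        "upstream": {"status": "follow"},
        "downstream": {"status": "follow"},
    },
    3: {
        "name": "instance003",
        "upstream": {"status": "follow"},
        "downstream": {"status": "follow"},
    },
}


def peers_not_following(info, local_name):
    """Return sorted names of peers whose upstream or downstream status is not 'follow'."""
    lagging = []
    for entry in info.values():
        if entry["name"] == local_name:
            continue  # skip the instance the snapshot was taken on
        up = entry.get("upstream", {}).get("status")
        down = entry.get("downstream", {}).get("status")
        if up != "follow" or down != "follow":
            lagging.append(entry["name"])
    return sorted(lagging)


print(peers_not_following(replication_info, "instance001"))  # prints []
```

An empty list means every peer reports ``follow`` in both directions, which is the condition the tutorial step asks the reader to verify.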

Adding data
~~~~~~~~~~~

To check that replicas (``instance001`` and ``instance003``) get all updates from the master(``instance002``), follow the steps below:
Contributor

Suggested change
- To check that replicas (``instance001`` and ``instance003``) get all updates from the master(``instance002``), follow the steps below:
+ To check that replicas (``instance001`` and ``instance003``) get all updates from the master (``instance002``), follow the steps below:


3. Use the ``select`` operation on ``instance001`` and ``instance003`` to make sure data is replicated.

4. Check that the 1-st component of :ref:`box.info.vclock <box_introspection-box_info>` values are the same on all instances:
Contributor

1-st looks a bit weird.
I get that it's not the first component but the one corresponding to key 1. Maybe call it the "1 component" (with the digit "1" in code style)?

Contributor Author

Agreed, I took this from the old changelog. Will fix.

Choosing a leader manually
--------------------------

1. Make sure that :ref:`box.info.vclock <box_introspection-box_info>` values (excluding the 0-th components) are the same on all instances:
Contributor

Maybe replace "excluding" with "except"?
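
To make the vclock comparison in this step concrete, here is a minimal Python sketch. It assumes each instance's ``box.info.vclock`` has been copied into a plain dictionary keyed by component number; the sample values and the helper name ``vclocks_match`` are hypothetical. Component 0 is skipped because the tutorial step excludes it from the comparison.

```python
def vclocks_match(vclocks, ignore=(0,)):
    """Return True if all vclocks agree on every component not listed in `ignore`."""
    stripped = [
        {component: lsn for component, lsn in vclock.items() if component not in ignore}
        for vclock in vclocks.values()
    ]
    return all(vclock == stripped[0] for vclock in stripped)


# Hypothetical values: component 0 differs per instance, component 1 is identical.
sample = {
    "instance001": {0: 1, 1: 32},
    "instance002": {0: 2, 1: 32},
    "instance003": {0: 5, 1: 32},
}

print(vclocks_match(sample))  # prints True
```

If any instance lags behind on a compared component, the function returns ``False``, which signals that the replica set is not yet ready for a manual leader switch.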

@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-tutorials branch 2 times, most recently from b189d5f to 9af33f9 Compare December 5, 2023 12:07
Comment on lines 228 to 230
Leader election doesn't work correctly if the election quorum is set to less or equal
than ``<cluster size> / 2`` because in that case, a split vote can lead to
a state when two leaders are elected at once.
Contributor

Suggested change
- Leader election doesn't work correctly if the election quorum is set to less or equal
- than ``<cluster size> / 2`` because in that case, a split vote can lead to
- a state when two leaders are elected at once.
+ Leader election doesn't work correctly if the election quorum is set to less or equal
+ than ``<cluster size> / 2``. In that case, a split vote can lead to
+ a state when two leaders are elected at once.
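
The arithmetic behind this warning can be sketched in a few lines of Python. The function names are illustrative and not part of Tarantool. The idea is that two disjoint groups of voters can each reach the quorum exactly when ``2 * quorum <= cluster size``, which is the same as the quorum being less than or equal to half the cluster size.

```python
def quorum_is_safe(cluster_size, quorum):
    """A quorum is safe when it is strictly greater than half of the cluster size."""
    return quorum > cluster_size / 2


def two_leaders_possible(cluster_size, quorum):
    """Two disjoint groups of voters can both reach quorum iff 2 * quorum <= cluster_size."""
    return 2 * quorum <= cluster_size


# Four instances with quorum 2: a 2/2 split vote can elect two leaders.
print(quorum_is_safe(4, 2), two_leaders_possible(4, 2))  # prints False True

# Three instances with quorum 2: only one group can ever collect two votes.
print(quorum_is_safe(3, 2), two_leaders_possible(3, 2))  # prints True False
```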

Step 1: Configuring a failover mode
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

First, set the :ref:`replication.failover <configuration_reference_replication_failover>` option to ``manual``:
Contributor

Suggested change
- First, set the :ref:`replication.failover <configuration_reference_replication_failover>` option to ``manual``:
+ Set the :ref:`replication.failover <configuration_reference_replication_failover>` option to ``manual``:

...

3. Execute ``box.info.replication`` to check a replica set status.
For ``instance002``, ``upstream.status`` and ``downstream.status`` should be ``follow``.
Contributor

Suggested change
- For ``instance002``, ``upstream.status`` and ``downstream.status`` should be ``follow``.
+ For ``instance001``, ``upstream.status`` and ``downstream.status`` should be ``follow``.

Contributor Author

Looks like the current description is correct:

image

@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-tutorials branch from b195ffe to 1a0aa6f Compare December 7, 2023 07:10
@andreyaksenov andreyaksenov merged commit 3eab90f into 3.0 Dec 7, 2023
@andreyaksenov andreyaksenov deleted the 3.0-config-replication-tutorials branch December 7, 2023 07:19
Labels
None yet
Projects
None yet
4 participants