3.0 configuration: replication administration #3890

andreyaksenov · 2023-12-01T13:59:33Z

Updated topics related to replication administration. It looks like all these topics could be structured better to avoid duplication. I will create a separate issue for this task.

Replication administration:

Monitoring a replica set (includes new diagrams for box.info.replication)
Recovering from a degraded state
Reseeding a replica
Resolving replication conflicts (added a link to the Resolving replication conflicts section at the end of the Preventing duplicate insert section)

Misc:

Added diagrams to replication tutorials:

Totktonada · 2023-12-16T23:47:18Z

doc/book/admin/disaster_recovery.rst


-      .. code-block:: console
+5.  Remove a crashed master from a replica set as described in :ref:`Removing instances <replication-remove_instances>`.


Shouldn't we link to how-to/replication/repl_bootstrap/#removing-an-instance-from-the-cluster-space instead?

I mean, if the replacement instance from the next (sixth) step is supposed to have the same name as the old one, we need to just drop an entry from _cluster and start the instance somewhere again. (It also means that Adding instances section is not very relevant.)

So the minimal working steps, replacing steps 5-6, are the following:

Remove the instance from the _cluster space.

Start the instance again on a spare host.
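As a rough illustration only, the two steps above might be documented along these lines. The instance id ``3`` is a made-up value, and the sketch assumes the deletion is run on the current read-write leader:

```rst
1.  On the leader, find the crashed instance in the ``_cluster`` space and delete its tuple:

    ..  code-block:: lua

        -- List registered instances to find the id of the crashed one.
        box.space._cluster:select{}
        -- Assumption: the crashed instance is registered with id 3.
        box.space._cluster:delete(3)

2.  Start the instance again on a spare host.
```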

Thanks, fixed:

Totktonada · 2023-12-16T23:48:50Z

doc/book/admin/disaster_recovery.rst

-You lose the few transactions in the master
-:ref:`write ahead log file <index-box_persistence>`, which it may have not
-transferred to the replica before crash. If you were able to salvage the master
-.xlog file, you may be able to recover these. In order to do it:


The note about losing the transactions is important.

I also like the recipe for repairing it.

(Well, I don't like that it is not very user friendly, but it is a way to solve the problem.)

I see now, it is described later. Maybe a 'see also' note would be helpful.

Thanks for the suggestion. I believe that having this info in a separate subsection (compared to the description in old docs) improves the visibility of this info. Not sure that we should add separate notes inside the Master crash: manual failover and Master crash: automated failover subsections because the Data loss subsection is already visible in the Table of contents at the right.

Totktonada · 2023-12-16T23:58:18Z

doc/book/admin/disaster_recovery.rst


 .. _admin-disaster_recovery-data_loss:

--------------------------------------------------------------------------------
 Data loss


It is confusing to have two 'data loss' sections on different levels.

Renamed to Master-replica/master-master: data loss for consistency. Still not the best variant but I feel that the entire Administration section will be restructured gradually when migrating other topics to a new config.

Totktonada · 2023-12-17T00:03:45Z

doc/book/admin/disaster_recovery.rst

+    Depending on the :ref:`replication.failover <configuration_reference_replication_failover>` mode, this can be done as follows:
+
+    -   ``manual``: change a replica set leader to ``null``.
+    -   ``election``: switch from the ``election`` failover mode to ``manual`` and change a replica set leader to ``null``.


I wouldn't change the replication.failover mode if there is a way to solve the problem within the mode. Here we can keep replication.failover: election and set replication.election_mode to voter or off.

See details here.
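A minimal sketch of what this suggestion could look like in the docs, not the wording that was actually merged. The names ``group001``, ``replicaset001``, and ``instance001`` are hypothetical placeholders:

```rst
-   ``election``: keep the ``election`` failover mode and set
    ``replication.election_mode`` to ``voter`` or ``off`` for the instance:

    ..  code-block:: yaml

        replication:
          failover: election

        groups:
          group001:
            replicasets:
              replicaset001:
                instances:
                  instance001:
                    replication:
                      # 'voter' or 'off' keeps this instance from being elected a leader
                      election_mode: voter
```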

Thanks, fixed.

sergepetrenko

Thanks for working on this!
Please find my comments below.

doc/book/admin/replication/repl_monitoring.rst

doc/book/admin/replication/repl_recover.rst

doc/book/admin/replication/repl_reseed.rst

doc/book/admin/replication/repl_problem_solving.rst

doc/book/admin/disaster_recovery.rst

doc/book/admin/troubleshoot.rst

sergepetrenko

Thanks for the patch!

xuniq · 2023-12-21T14:15:56Z

doc/book/admin/replication/repl_reseed.rst

@@ -4,25 +4,19 @@
 Reseeding a replica
 ================================================================================

-If any of a replica's .xlog/.snap/.run files are corrupted or deleted, you can
-"re-seed" the replica:
+If any of a replica's write-ahead log or snapshot files are corrupted or deleted, you can "re-seed" the replica.


Suggested change

If any of a replica's write-ahead log or snapshot files are corrupted or deleted, you can "re-seed" the replica.

If any of a replica's write-ahead log or snapshot files are corrupted or deleted, you can reseed the replica.

xuniq · 2023-12-21T14:30:36Z

doc/book/admin/replication/repl_recover.rst

+..  include:: /how-to/replication/repl_bootstrap_auto.rst
+    :start-after: box_info_replication_auto_leader_disconnected_start
+    :end-before: box_info_replication_auto_leader_disconnected_end


I like how a small part of the page is reused here, with :start-after: and :end-before:. Will definitely use this in the future.
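For readers unfamiliar with the technique: the ``include`` directive copies only the text found between the two marker strings. A sketch of both sides follows. The marker names come from the diff above, while writing the markers as RST comments in the source file is an assumption:

```rst
..  Source file, /how-to/replication/repl_bootstrap_auto.rst (marker placement assumed):

..  box_info_replication_auto_leader_disconnected_start

Text and diagrams to be reused go here.

..  box_info_replication_auto_leader_disconnected_end

..  Consuming file, repl_recover.rst, as in the diff above:

..  include:: /how-to/replication/repl_bootstrap_auto.rst
    :start-after: box_info_replication_auto_leader_disconnected_start
    :end-before: box_info_replication_auto_leader_disconnected_end
```

Because the markers are comments, they do not show up in the rendered source page, and the reused fragment stays in a single place.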

andreyaksenov linked an issue Dec 1, 2023 that may be closed by this pull request

[Config] Update the 'Replication administration' section to using a new config #3879

Closed

andreyaksenov force-pushed the 3.0-config-replication-administration branch from 12a1899 to 0e7b52a Compare December 4, 2023 12:06

andreyaksenov force-pushed the 3.0-config-replication-tutorials branch 2 times, most recently from 2c8e3b6 to 8aedebc Compare December 5, 2023 07:25

andreyaksenov force-pushed the 3.0-config-replication-administration branch 4 times, most recently from 870fd8d to 684dc1e Compare December 5, 2023 09:38

andreyaksenov force-pushed the 3.0-config-replication-tutorials branch from b189d5f to 9af33f9 Compare December 5, 2023 12:07

andreyaksenov force-pushed the 3.0-config-replication-administration branch 10 times, most recently from 4860379 to 1f7cd89 Compare December 6, 2023 12:11

andreyaksenov force-pushed the 3.0-config-replication-tutorials branch from b195ffe to 1a0aa6f Compare December 7, 2023 07:10

Base automatically changed from 3.0-config-replication-tutorials to 3.0 December 7, 2023 07:19

andreyaksenov force-pushed the 3.0-config-replication-administration branch 2 times, most recently from 6404238 to 65ce588 Compare December 7, 2023 10:09

andreyaksenov marked this pull request as ready for review December 7, 2023 10:28

andreyaksenov requested a review from Totktonada December 7, 2023 10:30

andreyaksenov force-pushed the 3.0-config-replication-administration branch from 65ce588 to c67bd0f Compare December 7, 2023 11:56

Totktonada reviewed Dec 16, 2023

Totktonada reviewed Dec 17, 2023

andreyaksenov force-pushed the 3.0-config-replication-administration branch 3 times, most recently from a99749b to c654c92 Compare December 18, 2023 07:49

sergepetrenko reviewed Dec 19, 2023

andreyaksenov force-pushed the 3.0-config-replication-administration branch 2 times, most recently from 86a1feb to d118e0c Compare December 19, 2023 15:55

sergepetrenko approved these changes Dec 20, 2023

andreyaksenov requested a review from xuniq December 20, 2023 07:13

andreyaksenov added 4 commits December 21, 2023 17:43

3.0 config: replication administration

f17b634

3.0 config: replication administration - add diagrams

ea86d1e

3.0 config: replication administration - update per review

437c266

3.0 config: replication administration - update per review 2

f8e0131

xuniq approved these changes Dec 21, 2023

3.0 config: replication administration: update per review 3

eba14ef

andreyaksenov force-pushed the 3.0-config-replication-administration branch from d118e0c to eba14ef Compare December 21, 2023 17:18

andreyaksenov merged commit 3afd11e into 3.0 Dec 21, 2023

andreyaksenov deleted the 3.0-config-replication-administration branch December 21, 2023 17:19
