3.0 configuration: replication administration #3890

Merged
merged 5 commits into from
Dec 21, 2023

Conversation

Contributor

@andreyaksenov andreyaksenov commented Dec 1, 2023

Updated topics related to replication administration. It looks like all these topics might be structured better to avoid duplication. I will create a separate issue for this task.

Replication administration:

Misc:

Added diagrams to replication tutorials:

@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-administration branch from 12a1899 to 0e7b52a Compare December 4, 2023 12:06
@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-tutorials branch 2 times, most recently from 2c8e3b6 to 8aedebc Compare December 5, 2023 07:25
@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-administration branch 4 times, most recently from 870fd8d to 684dc1e Compare December 5, 2023 09:38
@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-tutorials branch from b189d5f to 9af33f9 Compare December 5, 2023 12:07
@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-administration branch 10 times, most recently from 4860379 to 1f7cd89 Compare December 6, 2023 12:11
@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-tutorials branch from b195ffe to 1a0aa6f Compare December 7, 2023 07:10
Base automatically changed from 3.0-config-replication-tutorials to 3.0 December 7, 2023 07:19
@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-administration branch 2 times, most recently from 6404238 to 65ce588 Compare December 7, 2023 10:09
@andreyaksenov andreyaksenov marked this pull request as ready for review December 7, 2023 10:28
@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-administration branch from 65ce588 to c67bd0f Compare December 7, 2023 11:56

.. code-block:: console
5. Remove a crashed master from a replica set as described in :ref:`Removing instances <replication-remove_instances>`.
Member

Shouldn't we link how-to/replication/repl_bootstrap/#removing-an-instance-from-the-cluster-space instead?

I mean, if the replacement instance from the next (sixth) step is supposed to have the same name as the old one, we just need to drop an entry from _cluster and start the instance somewhere again. (It also means that the Adding instances section is not very relevant.)

So, the minimal working steps instead of these 5-6 ones are the following:

  1. Remove the instance from the _cluster space.
  2. Start the instance again on a spare host.

Contributor Author

Thanks, fixed.

Comment on lines -41 to -44
You lose the few transactions in the master
:ref:`write ahead log file <index-box_persistence>`, which it may have not
transferred to the replica before crash. If you were able to salvage the master
.xlog file, you may be able to recover these. In order to do it:
Member

The note about losing the transactions is important.

I also like the recipe for repairing it.

(Well, I don't like that it is not very user friendly, but it is a way to solve the problem.)

Member

I see now, it is described later. Maybe a 'see also' note would be helpful.

Contributor Author

@andreyaksenov andreyaksenov Dec 18, 2023

Thanks for the suggestion. I believe that having this info in a separate subsection (compared to the description in the old docs) improves its visibility. I'm not sure that we should add separate notes inside the Master crash: manual failover and Master crash: automated failover subsections, because the Data loss subsection is already visible in the table of contents on the right.


.. _admin-disaster_recovery-data_loss:

--------------------------------------------------------------------------------
Data loss
Member

It is confusing to have two 'data loss' sections on different levels.

Contributor Author

@andreyaksenov andreyaksenov Dec 18, 2023

Renamed to Master-replica/master-master: data loss for consistency. Still not the best variant but I feel that the entire Administration section will be restructured gradually when migrating other topics to a new config.


Depending on the :ref:`replication.failover <configuration_reference_replication_failover>` mode, this can be done as follows:

- ``manual``: change a replica set leader to ``null``.
- ``election``: switch from the ``election`` failover mode to ``manual`` and change a replica set leader to ``null``.
Member

I wouldn't change the replication.failover mode if there is a way to solve the problem within the mode. Here we can keep replication.failover: election and set replication.election_mode to voter or off.

See details here.
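
A minimal sketch of what this suggestion could look like in a Tarantool 3.0 YAML configuration. The group, replica set, and instance names are hypothetical, and placing `election_mode` at the replica set level is an assumption rather than something stated in this thread:

```yaml
# Hypothetical topology names: group001, replicaset001, instance00N.
groups:
  group001:
    replicasets:
      replicaset001:
        replication:
          # The failover mode stays unchanged.
          failover: election
          # Assumption: applied at the replica set level, so none of the
          # instances below can be elected as a leader.
          election_mode: voter   # or 'off'
        instances:
          # Per-instance settings (iproto and so on) are omitted here.
          instance001: {}
          instance002: {}
          instance003: {}
```

The point of this shape is that only `election_mode` changes, so reverting it later should not require switching the replica set between failover modes.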

Contributor Author

@andreyaksenov andreyaksenov Dec 18, 2023

Thanks, fixed.

@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-administration branch 3 times, most recently from a99749b to c654c92 Compare December 18, 2023 07:49
Contributor

@sergepetrenko sergepetrenko left a comment

Thanks for working on this!
Please find my comments below.

@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-administration branch 2 times, most recently from 86a1feb to d118e0c Compare December 19, 2023 15:55
Contributor

@sergepetrenko sergepetrenko left a comment

Thanks for the patch!

@andreyaksenov andreyaksenov requested a review from xuniq December 20, 2023 07:13
@@ -4,25 +4,19 @@
Reseeding a replica
================================================================================

-If any of a replica's .xlog/.snap/.run files are corrupted or deleted, you can
-"re-seed" the replica:
+If any of a replica's write-ahead log or snapshot files are corrupted or deleted, you can "re-seed" the replica.
Contributor

Suggested change
-If any of a replica's write-ahead log or snapshot files are corrupted or deleted, you can "re-seed" the replica.
+If any of a replica's write-ahead log or snapshot files are corrupted or deleted, you can reseed the replica.

Comment on lines +46 to +48
.. include:: /how-to/replication/repl_bootstrap_auto.rst
    :start-after: box_info_replication_auto_leader_disconnected_start
    :end-before: box_info_replication_auto_leader_disconnected_end
Contributor

I like how a small part of the page is reused here with :start-after: and :end-before:. I will definitely use this in the future.

@andreyaksenov andreyaksenov force-pushed the 3.0-config-replication-administration branch from d118e0c to eba14ef Compare December 21, 2023 17:18
@andreyaksenov andreyaksenov merged commit 3afd11e into 3.0 Dec 21, 2023
@andreyaksenov andreyaksenov deleted the 3.0-config-replication-administration branch December 21, 2023 17:19
Successfully merging this pull request may close these issues.

[Config] Update the 'Replication administration' section to using a new config
4 participants