3.0 configuration: replication administration #3890
doc/book/admin/disaster_recovery.rst
Outdated
.. code-block:: console
5. Remove a crashed master from a replica set as described in :ref:`Removing instances <replication-remove_instances>`.
Shouldn't we link how-to/replication/repl_bootstrap/#removing-an-instance-from-the-cluster-space instead?
I mean, if the replacement instance from the next (sixth) step is supposed to have the same name as the old one, we just need to drop an entry from `_cluster` and start the instance somewhere again. (It also means that the Adding instances section is not very relevant.)
So, the minimal working steps that replace steps 5-6 are the following:
- Remove the instance from the `_cluster` space.
- Start the instance again on a spare host.
You lose the few transactions in the master
:ref:`write ahead log file <index-box_persistence>`, which it may have not
transferred to the replica before crash. If you were able to salvage the master
.xlog file, you may be able to recover these. In order to do it:
The note about losing the transactions is important.
I also like the recipe for repairing it.
(Well, I don't like that it is not very user friendly, but it is a way to solve the problem.)
I see now, it is described later. Maybe a 'see also' note would be helpful.
Thanks for the suggestion. I believe that having this info in a separate subsection (compared to the description in the old docs) improves its visibility. I'm not sure that we should add separate notes inside the Master crash: manual failover and Master crash: automated failover subsections, because the Data loss subsection is already visible in the table of contents on the right.
doc/book/admin/disaster_recovery.rst
Outdated
.. _admin-disaster_recovery-data_loss:

--------------------------------------------------------------------------------
Data loss
It is confusing to have two 'data loss' sections on different levels.
doc/book/admin/disaster_recovery.rst
Outdated
Depending on the :ref:`replication.failover <configuration_reference_replication_failover>` mode, this can be done as follows:

- ``manual``: change a replica set leader to ``null``.
- ``election``: switch from the ``election`` failover mode to ``manual`` and change a replica set leader to ``null``.
I wouldn't change the `replication.failover` mode if there is a way to solve the problem within the mode. Here we can keep `replication.failover: election` and set `replication.election_mode` to `voter` or `off`.
See details here.
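A minimal sketch of what this could look like in the 3.0 YAML configuration. The group, replica set, and instance names (`group001`, `replicaset001`, `instance001` and so on) are hypothetical placeholders; only the `replication.failover` and `replication.election_mode` options come from the comment above, and the fragment omits everything else a working configuration needs (credentials, `iproto` settings, and so on):

```yaml
# Sketch only: all group/replicaset/instance names below are hypothetical.
replication:
  failover: election            # the failover mode itself stays unchanged

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            replication:
              # 'voter' lets the instance vote but not become a leader;
              # 'off' takes it out of the election process entirely.
              election_mode: voter
          instance002: {}
          instance003: {}
```

The idea is that the override is scoped to a single instance, so the other instances of the replica set keep taking part in leader election as before.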
Thanks for working on this!
Please find my comments below.
Thanks for the patch!
@@ -4,25 +4,19 @@
 Reseeding a replica
 ================================================================================

-If any of a replica's .xlog/.snap/.run files are corrupted or deleted, you can
-"re-seed" the replica:
+If any of a replica's write-ahead log or snapshot files are corrupted or deleted, you can "re-seed" the replica.
-If any of a replica's write-ahead log or snapshot files are corrupted or deleted, you can "re-seed" the replica.
+If any of a replica's write-ahead log or snapshot files are corrupted or deleted, you can reseed the replica.
.. include:: /how-to/replication/repl_bootstrap_auto.rst
    :start-after: box_info_replication_auto_leader_disconnected_start
    :end-before: box_info_replication_auto_leader_disconnected_end
I like how a small part of the page is reused here with `:start-after:` and `:end-before:`. Will definitely use this in the future.
Updated topics related to replication administration. It looks like all these topics could be structured better to avoid duplication. I will create a separate issue for this task.
Replication administration:
box.info.replication)
Resolving replication conflicts section at the end of the Preventing duplicate insert section)
Misc:
Added diagrams to replication tutorials: