Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change: Replace loosen-follower-log-revert feature flag with Config::allow_log_reversion #1268

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 0 additions & 3 deletions .github/workflows/ci.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -313,9 +313,6 @@ jobs:
- toolchain: "nightly"
features: "single-term-leader"

- toolchain: "nightly"
features: "loosen-follower-log-revert"


steps:
- name: Setup | Checkout
Expand Down
3 changes: 2 additions & 1 deletion openraft/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -93,6 +93,8 @@ singlethreaded = ["openraft-macros/singlethreaded"]
# useful for testing or other special scenarios.
# For instance, in an even number nodes cluster, erasing a node's data and then
# rebooting it(log reverts to empty) will not result in data loss.
#
# This feature is removed since `0.10.0`. Use `Config::allow_log_reversion` instead.
loosen-follower-log-revert = []


Expand All @@ -110,7 +112,6 @@ tracing-log = [ "tracing/log" ]
features = [
"bt",
"compat",
"loosen-follower-log-revert",
"serde",
"tracing-log",
]
Expand Down
28 changes: 28 additions & 0 deletions openraft/src/config/config.rs
Original file line number Diff line number Diff line change
Expand Up @@ -228,6 +228,26 @@ pub struct Config {
default_missing_value = "true"
)]
pub enable_elect: bool,

/// Whether to allow to reset the replication progress to `None`, when the
/// follower's log is found reverted to an early state. **Do not enable this in production**
/// unless you know what you are doing.
///
/// Although log state reversion is typically seen as a bug, enabling it can be
/// useful for testing or other special scenarios.
/// For instance, in an even number nodes cluster, erasing a node's data and then
/// rebooting it(log reverts to empty) will not result in data loss.
///
/// For one-shot log reversion, use
/// [`Raft::trigger().allow_next_revert()`](crate::raft::trigger::Trigger::allow_next_revert).
///
/// Since: 0.10.0
#[clap(long,
action = clap::ArgAction::Set,
num_args = 0..=1,
default_missing_value = "true"
)]
pub allow_log_reversion: Option<bool>,
}

/// Updatable config for a raft runtime.
Expand Down Expand Up @@ -276,6 +296,14 @@ impl Config {
}
}

/// Whether to allow the replication to reset the state to `None` when a log state reversion is
/// detected.
///
/// By default, it does not allow log reversion, because it might indicate a bug in the system.
pub(crate) fn get_allow_log_reversion(&self) -> bool {
self.allow_log_reversion.unwrap_or(false)
}

/// Build a `Config` instance from a series of command line arguments.
///
/// The first element in `args` must be the application name.
Expand Down
28 changes: 28 additions & 0 deletions openraft/src/config/config_test.rs
Original file line number Diff line number Diff line change
Expand Up @@ -162,3 +162,31 @@ fn test_config_enable_elect() -> anyhow::Result<()> {

Ok(())
}

#[test]
fn test_config_allow_log_reversion() -> anyhow::Result<()> {
let config = Config::build(&["foo", "--allow-log-reversion=false"])?;
assert_eq!(Some(false), config.allow_log_reversion);

let config = Config::build(&["foo", "--allow-log-reversion=true"])?;
assert_eq!(Some(true), config.allow_log_reversion);

let config = Config::build(&["foo", "--allow-log-reversion"])?;
assert_eq!(Some(true), config.allow_log_reversion);

let mut config = Config::build(&["foo"])?;
assert_eq!(None, config.allow_log_reversion);

// test allow_log_reversion method

config.allow_log_reversion = None;
assert_eq!(false, config.get_allow_log_reversion());

config.allow_log_reversion = Some(true);
assert_eq!(true, config.get_allow_log_reversion());

config.allow_log_reversion = Some(false);
assert_eq!(false, config.get_allow_log_reversion());

Ok(())
}
7 changes: 4 additions & 3 deletions openraft/src/docs/faq/faq.md
Original file line number Diff line number Diff line change
Expand Up @@ -184,7 +184,7 @@ behavior is **undefined**.
### Can I wipe out the data of ONE node and wait for the leader to replicate all data to it again?

Avoid doing this. Doing so will panic the leader. But it is permitted
if [`loosen-follower-log-revert`] feature flag is enabled.
with config [`Config::allow_log_reversion`] enabled.

In a raft cluster, although logs are replicated to multiple nodes,
wiping out a node and restarting it is still possible to cause data loss.
Expand All @@ -211,7 +211,7 @@ N5 | elect L2

But for even number nodes cluster, Erasing **exactly one** node won't cause data loss.
Thus, in a special scenario like this, or for testing purpose, you can use
`--feature loosen-follower-log-revert` to permit erasing a node.
[`Config::allow_log_reversion`] to permit erasing a node.


### Is Openraft resilient to incorrectly configured clusters?
Expand All @@ -237,7 +237,6 @@ pub(crate) fn following_handler(&mut self) -> FollowingHandler<C> {
```


[`loosen-follower-log-revert`]: `crate::docs::feature_flags#loosen_follower_log_revert`
[`single-term-leader`]: `crate::docs::feature_flags#single_term_leader`

[`Linearizable Read`]: `crate::docs::protocol::read`
Expand All @@ -246,6 +245,8 @@ pub(crate) fn following_handler(&mut self) -> FollowingHandler<C> {
[`BasicNode`]: `crate::node::BasicNode`
[`RaftTypeConfig`]: `crate::RaftTypeConfig`

[`Config::allow_log_reversion`]: `crate::config::Config::allow_log_reversion`

[`RaftLogStorage::save_committed()`]: `crate::storage::RaftLogStorage::save_committed`

[`RaftNetwork`]: `crate::network::RaftNetwork`
Expand Down
1 change: 0 additions & 1 deletion openraft/src/docs/feature_flags/feature-flags-toc.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
- [feature-flag `bench`](#feature-flag-bench)
- [feature-flag `bt`](#feature-flag-bt)
- [feature-flag `compat`](#feature-flag-compat)
- [feature-flag `loosen-follower-log-revert`](#feature-flag-loosen-follower-log-revert)
- [feature-flag `serde`](#feature-flag-serde)
- [feature-flag `single-term-leader`](#feature-flag-single-term-leader)
- [feature-flag `singlethreaded`](#feature-flag-singlethreaded)
Expand Down
9 changes: 0 additions & 9 deletions openraft/src/docs/feature_flags/feature-flags.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,15 +17,6 @@ This feature works ONLY with nightly rust, because it requires unstable feature

Enables compatibility supporting types.

## feature-flag `loosen-follower-log-revert`

Permit the follower's log to roll back to an earlier state without causing the leader to panic.
Although log state reversion is typically seen as a bug, enabling it can be useful for testing or other special scenarios.
For instance, in an even number nodes cluster,
erasing a node's data and then rebooting it(log reverts to empty) will not result in data loss.

**Do not use it unless you know what you are doing**.

## feature-flag `serde`

Derives `serde::Serialize, serde::Deserialize` for type that are used
Expand Down
5 changes: 5 additions & 0 deletions openraft/src/engine/engine_config.rs
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,8 @@ pub(crate) struct EngineConfig<C: RaftTypeConfig> {
/// The maximum number of entries per payload allowed to be transmitted during replication
pub(crate) max_payload_entries: u64,

pub(crate) allow_log_reversion: bool,

pub(crate) timer_config: time_state::Config,
}

Expand All @@ -39,6 +41,8 @@ where C: RaftTypeConfig
max_in_snapshot_log_to_keep: config.max_in_snapshot_log_to_keep,
purge_batch_size: config.purge_batch_size,
max_payload_entries: config.max_payload_entries,
allow_log_reversion: config.get_allow_log_reversion(),

timer_config: time_state::Config {
election_timeout,
smaller_log_timeout: Duration::from_millis(config.election_timeout_max * 2),
Expand All @@ -55,6 +59,7 @@ where C: RaftTypeConfig
max_in_snapshot_log_to_keep: 1000,
purge_batch_size: 256,
max_payload_entries: 300,
allow_log_reversion: false,
timer_config: time_state::Config::default(),
}
}
Expand Down
11 changes: 8 additions & 3 deletions openraft/src/engine/handler/replication_handler/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ use crate::engine::EngineOutput;
use crate::engine::ReplicationProgress;
use crate::error::NodeNotFound;
use crate::error::Operation;
use crate::progress;
use crate::progress::entry::ProgressEntry;
use crate::progress::Inflight;
use crate::progress::Progress;
Expand Down Expand Up @@ -163,7 +164,9 @@ where C: RaftTypeConfig
let quorum_accepted = self
.leader
.progress
.update_with(&node_id, |prog_entry| prog_entry.update_matching(log_id))
.update_with(&node_id, |prog_entry| {
prog_entry.new_updater(&*self.config).update_matching(log_id)
})
.expect("it should always update existing progress")
.clone();

Expand Down Expand Up @@ -215,7 +218,9 @@ where C: RaftTypeConfig

let prog_entry = self.leader.progress.get_mut(&target).unwrap();

prog_entry.update_conflicting(conflict.index);
let mut updater = progress::entry::update::Updater::new(self.config, prog_entry);

updater.update_conflicting(conflict.index);
}

/// Enable one-time replication reset for a specific node upon log reversion detection.
Expand All @@ -241,7 +246,7 @@ where C: RaftTypeConfig
return Err(NodeNotFound::new(target, Operation::AllowNextRevert));
};

prog_entry.reset_on_reversion = allow;
prog_entry.allow_log_reversion = allow;

Ok(())
}
Expand Down
4 changes: 2 additions & 2 deletions openraft/src/engine/tests/startup_test.rs
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,7 @@ fn test_startup_as_leader_without_logs() -> anyhow::Result<()> {
matching: None,
inflight: Inflight::None,
searching_end: 4,
reset_on_reversion: false,
allow_log_reversion: false,
})]
},
Command::AppendInputEntries {
Expand Down Expand Up @@ -130,7 +130,7 @@ fn test_startup_as_leader_with_proposed_logs() -> anyhow::Result<()> {
matching: None,
inflight: Inflight::None,
searching_end: 7,
reset_on_reversion: false,
allow_log_reversion: false,
})]
},
Command::Replicate {
Expand Down
6 changes: 6 additions & 0 deletions openraft/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,12 @@ macro_rules! func_name {
}};
}

#[cfg(feature = "loosen-follower-log-revert")]
compile_error!(
"The feature flag `loosen-follower-log-revert` is removed since `0.10.0`. \
Use `Config::allow_log_reversion` instead."
);

pub extern crate openraft_macros;

mod change_members;
Expand Down
84 changes: 11 additions & 73 deletions openraft/src/progress/entry/mod.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
pub(crate) mod update;

use std::borrow::Borrow;
use std::error::Error;
use std::fmt::Debug;
Expand All @@ -7,6 +9,8 @@ use std::fmt::Formatter;
use validit::Validate;

use crate::display_ext::DisplayOptionExt;
use crate::engine::EngineConfig;
use crate::progress::entry::update::Updater;
use crate::progress::inflight::Inflight;
use crate::raft_state::LogStateReader;
use crate::LogId;
Expand Down Expand Up @@ -37,7 +41,7 @@ where C: RaftTypeConfig
/// to it.
///
/// This flag will be cleared after the progress entry is reset.
pub(crate) reset_on_reversion: bool,
pub(crate) allow_log_reversion: bool,
}

impl<C> ProgressEntry<C>
Expand All @@ -49,7 +53,7 @@ where C: RaftTypeConfig
matching: matching.clone(),
inflight: Inflight::None,
searching_end: matching.next_index(),
reset_on_reversion: false,
allow_log_reversion: false,
}
}

Expand All @@ -61,7 +65,7 @@ where C: RaftTypeConfig
matching: None,
inflight: Inflight::None,
searching_end: end,
reset_on_reversion: false,
allow_log_reversion: false,
}
}

Expand All @@ -74,6 +78,10 @@ where C: RaftTypeConfig
self
}

pub(crate) fn new_updater<'a>(&'a mut self, engine_config: &'a EngineConfig<C>) -> Updater<'a, C> {
Updater::new(engine_config, self)
}

/// Return if a range of log id `..=log_id` is inflight sending.
///
/// `prev_log_id` is never inflight.
Expand All @@ -88,76 +96,6 @@ where C: RaftTypeConfig
}
}

pub(crate) fn update_matching(&mut self, matching: Option<LogId<C::NodeId>>) {
tracing::debug!(
self = display(&self),
matching = display(matching.display()),
"update_matching"
);

self.inflight.ack(matching.clone());

debug_assert!(matching >= self.matching);
self.matching = matching;

let matching_next = self.matching.next_index();
self.searching_end = std::cmp::max(self.searching_end, matching_next);
}

/// Update conflicting log index.
///
/// Conflicting log index is the last found log index on a follower that is not matching the
/// leader log.
///
/// Usually if follower's data is lost, `conflict` is always greater than or equal `matching`.
/// But for testing purpose, a follower is allowed to clean its data and wait for leader to
/// replicate all data to it.
///
/// To allow a follower to clean its data, enable feature flag [`loosen-follower-log-revert`] .
///
/// [`loosen-follower-log-revert`]: crate::docs::feature_flags#feature_flag_loosen_follower_log_revert
pub(crate) fn update_conflicting(&mut self, conflict: u64) {
tracing::debug!(self = debug(&self), conflict = display(conflict), "update_conflict");

self.inflight.conflict(conflict);

debug_assert!(conflict < self.searching_end);
self.searching_end = conflict;

// An already matching log id is found lost:
//
// - If log reversion is allowed, just restart the binary search from the beginning.
// - Otherwise, panic it.

let allow_reset = if cfg!(feature = "loosen-follower-log-revert") {
true
} else {
self.reset_on_reversion
};

if allow_reset {
if conflict < self.matching.next_index() {
tracing::warn!(
"conflict {} < last matching {}: follower log is reverted; with 'loosen-follower-log-revert' enabled, this is allowed.",
conflict,
self.matching.display(),
);

self.matching = None;
self.reset_on_reversion = false;
}
} else {
debug_assert!(
conflict >= self.matching.next_index(),
"follower log reversion is not allowed \
without `--features loosen-follower-log-revert`; \
matching: {}; conflict: {}",
self.matching.display(),
conflict
);
}
}

/// Initialize a replication action: sending log entries or sending snapshot.
///
/// If there is an action in progress, i.e., `inflight` is not None, it returns an `Err`
Expand Down
Loading
Loading