Implement PJR checker (#8160)

* Apply.
* get rid of glob import
* use meaningful generic type name
* pjr_check operates on `Supports` struct used elsewhere
* improve algorithmic complexity of `prepare_pjr_input`
* fix rustdoc warnings
* improve module docs
* typo
* simplify debug assertion
* add test finding the phase-change threshold value for a constructed scenario
* add more threshold scenarios to disambiguate plausible interpretations
* add link to npos paper reference
* docs: staked_assignment -> supports

  Co-authored-by: Kian Paimani <5588131+kianenigma@users.noreply.github.com>
* add utility method for generating npos inputs
* add a fuzzer which asserts that all unbalanced seq_phragmen are PJR

  Note that this currently fails. I hope that this can be rectified by calculating the threshold instead of choosing some arbitrary number.
* assert in all cases, not just debug
* leverage a native solution to choose candidates
* use existing helper methods
* add pjr-check and incorporate into the fuzzer

  We should probably have one of the W3F people look at this to ensure we're not misconstruing any definitions, but this seems like a fairly straightforward implementation.
* fix compilation errors
* Enable manually setting iteration parameters in single run.

  This gives us the ability to reproducibly extract cases where honggfuzz has discovered a panic. For example:

      $ cargo run --release --bin phragmen_pjr -- --candidates 569 --voters 100
      Tue 23 Feb 2021 11:23:39 AM CET
         Compiling bitflags v1.2.1
         Compiling unicode-width v0.1.8
         Compiling unicode-segmentation v1.7.1
         Compiling ansi_term v0.11.0
         Compiling strsim v0.8.0
         Compiling vec_map v0.8.2
         Compiling proc-macro-error-attr v1.0.4
         Compiling proc-macro-error v1.0.4
         Compiling textwrap v0.11.0
         Compiling atty v0.2.14
         Compiling heck v0.3.2
         Compiling clap v2.33.3
         Compiling structopt-derive v0.4.14
         Compiling structopt v0.3.21
         Compiling sp-npos-elections-fuzzer v2.0.0-alpha.5 (/home/coriolinus/Documents/Projects/paritytech/substrate/primitives/npos-elections/fuzzer)
          Finished release [optimized] target(s) in 6.15s
           Running `/home/coriolinus/Documents/Projects/paritytech/substrate/target/release/phragmen_pjr -c 569 -v 100`
      thread 'main' panicked at 'unbalanced sequential phragmen must satisfy PJR', primitives/npos-elections/fuzzer/src/phragmen_pjr.rs:133:5
      note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

  This is still not adequate proof that seq_phragmen is broken; it could very well be that our PJR checker is doing the wrong thing, or we've somehow missed a parameter of interest. Still, it's concerning.
* update comment verbiage for accuracy
* it is valid in PJR for an elected candidate to have 0 support
* Fix phragmen_pjr fuzzer

  It turns out that the fundamental problem causing previous implementations of the fuzzer to fail wasn't in `seq_phragmen` _or_ in `pjr_check`: it was in the rounding errors introduced in the various conversions between the internal data representation and the external one.

  Fixing the fuzzer is then simply an issue of using the internal representation and staying in that representation.

  However, that leaves the issue that `seq_phragmen` occasionally produces an output which is technically not PJR due to rounding errors. In the future we will need to add some kind of "close-enough" threshold. However, that is explicitly out of scope of this PR.
* restart ci; it appears to be stalled
* use necessary import for no-std
* use a more realistic distribution of voters and candidates

  This isn't ideal; more realistic numbers would be about twice these. However, either case generation or voting has nonlinear execution time, and doubling these values brings iteration time from ~20s to ~180s. Fuzzing 6x as fast should make up for fuzzing cases half the size.
* identify specifically which PJR check may fail
* move candidate collection comment into correct place
* standard_threshold: use a calculation method which cannot overflow
* Apply suggestions from code review (update comments)

  Co-authored-by: Kian Paimani <5588131+kianenigma@users.noreply.github.com>
* clarify the effectiveness bounds for t-pjr check
* how to spell "committee"
* reorganize: high -> low abstraction
* ensure standard threshold calc cannot panic

  Co-authored-by: Kian Paimani <5588131+kianenigma@users.noreply.github.com>
* Apply suggestions from code review

  Co-authored-by: Shawn Tabrizi <shawntabrizi@gmail.com>

Co-authored-by: kianenigma <kian.peymani@gmail.com>
Co-authored-by: Kian Paimani <5588131+kianenigma@users.noreply.github.com>
Co-authored-by: Shawn Tabrizi <shawntabrizi@gmail.com>
paritytech · Mar 11, 2021 · fcab5a3
1 parent b24c43a
Showing 8 changed files with 792 additions and 45 deletions.
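
The commit message attributes the earlier fuzzer failures to rounding errors in conversions between the internal and external data representations. As a rough illustration only, the sketch below assumes the external form stores each edge as a parts-per-billion ratio of the voter's budget (similar in spirit to `Perbill`); `ACCURACY`, `to_ratio`, `from_ratio`, and `round_trip` are hypothetical names, not part of `sp-npos-elections`.

```rust
/// Hypothetical fixed-point accuracy: parts per billion (an assumption for this sketch).
const ACCURACY: u128 = 1_000_000_000;

/// Convert an absolute stake into a ratio of `budget`, rounding down.
fn to_ratio(stake: u128, budget: u128) -> u128 {
    stake * ACCURACY / budget
}

/// Convert a ratio back into an absolute stake, rounding down.
fn from_ratio(ratio: u128, budget: u128) -> u128 {
    ratio * budget / ACCURACY
}

/// Send every edge of one voter through the ratio form and back again.
/// Each conversion rounds down, so the result can fall short of the input.
fn round_trip(stakes: &[u128], budget: u128) -> Vec<u128> {
    stakes
        .iter()
        .map(|&stake| from_ratio(to_ratio(stake, budget), budget))
        .collect()
}
```

With a budget of 3000 split evenly over three edges, each edge comes back as 999, so three units of stake vanish in the round trip. A checker comparing supports against an exact threshold could plausibly flag a deficit like that, which fits the message's decision to stay in the internal representation.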
diff --git a/Cargo.lock b/Cargo.lock
diff --git a/frame/staking/src/lib.rs b/frame/staking/src/lib.rs
@@ -2740,8 +2740,8 @@ impl<T: Config> Module<T> {
 		// write new results.
 		<QueuedElected<T>>::put(ElectionResult {
 			elected_stashes: winners,
-			compute,
 			exposures,
+			compute,
 		});
 		QueuedScore::put(submitted_score);
 

diff --git a/primitives/npos-elections/fuzzer/Cargo.toml b/primitives/npos-elections/fuzzer/Cargo.toml
@@ -14,12 +14,14 @@ publish = false
 targets = ["x86_64-unknown-linux-gnu"]
 
 [dependencies]
-sp-npos-elections = { version = "3.0.0", path = ".." }
-sp-std = { version = "3.0.0", path = "../../std" }
-sp-runtime = { version = "3.0.0", path = "../../runtime" }
+codec = { package = "parity-scale-codec", version = "2.0.0", default-features = false, features = ["derive"] }
 honggfuzz = "0.5"
 rand = { version = "0.7.3", features = ["std", "small_rng"] }
-codec = { package = "parity-scale-codec", version = "2.0.0", default-features = false, features = ["derive"] }
+sp-arithmetic = { version = "3.0.0", path = "../../arithmetic" }
+sp-npos-elections = { version = "3.0.0", path = ".." }
+sp-runtime = { version = "3.0.0", path = "../../runtime" }
+sp-std = { version = "3.0.0", path = "../../std" }
+structopt = "0.3.21"
 
 [[bin]]
 name = "reduce"
@@ -36,3 +38,7 @@ path = "src/phragmms_balancing.rs"
 [[bin]]
 name = "compact"
 path = "src/compact.rs"
+
+[[bin]]
+name = "phragmen_pjr"
+path = "src/phragmen_pjr.rs"
diff --git a/primitives/npos-elections/fuzzer/src/common.rs b/primitives/npos-elections/fuzzer/src/common.rs
@@ -20,10 +20,10 @@
 // Each function will be used based on which fuzzer binary is being used.
 #![allow(dead_code)]
 
-use sp_npos_elections::{ElectionResult, VoteWeight, phragmms, seq_phragmen};
-use sp_std::collections::btree_map::BTreeMap;
+use rand::{self, seq::SliceRandom, Rng, RngCore};
+use sp_npos_elections::{phragmms, seq_phragmen, ElectionResult, VoteWeight};
 use sp_runtime::Perbill;
-use rand::{self, Rng, RngCore};
+use std::collections::{BTreeMap, HashSet};
 
 /// converts x into the range [a, b] in a pseudo-fair way.
 pub fn to_range(x: usize, a: usize, b: usize) -> usize {
@@ -39,11 +39,81 @@ pub fn to_range(x: usize, a: usize, b: usize) -> usize {
 
 pub enum ElectionType {
 	Phragmen(Option<(usize, u128)>),
-	Phragmms(Option<(usize, u128)>)
+	Phragmms(Option<(usize, u128)>),
 }
 
 pub type AccountId = u64;
 
+/// Generate a set of inputs suitable for fuzzing an election algorithm
+///
+/// Given parameters governing how many candidates and voters should exist, generates a voting
+/// scenario suitable for fuzz-testing an election algorithm.
+///
+/// The returned candidate list is sorted. This sorting property should not affect the result of the
+/// calculation.
+///
+/// The returned voters list is sorted. This enables binary searching for a particular voter by
+/// account id. This sorting property should not affect the results of the calculation.
+///
+/// Each voter's selection of candidates to vote for is sorted.
+///
+/// Note that this does not generate balancing parameters.
+pub fn generate_random_npos_inputs(
+	candidate_count: usize,
+	voter_count: usize,
+	mut rng: impl Rng,
+) -> (
+	usize,
+	Vec<AccountId>,
+	Vec<(AccountId, VoteWeight, Vec<AccountId>)>,
+) {
+	// cache for fast generation of unique candidate and voter ids
+	let mut used_ids = HashSet::with_capacity(candidate_count + voter_count);
+
+	// always generate a sensible desired number of candidates: elections are uninteresting if we
+	// desire 0 candidates, or a number of candidates >= the actual number of candidates present
+	let rounds = rng.gen_range(1, candidate_count);
+
+	// candidates are easy: just a completely random set of IDs
+	let mut candidates: Vec<AccountId> = Vec::with_capacity(candidate_count);
+	for _ in 0..candidate_count {
+		let mut id = rng.gen();
+		// insert returns `false` when the value was already present
+		while !used_ids.insert(id) {
+			id = rng.gen();
+		}
+		candidates.push(id);
+	}
+	candidates.sort_unstable();
+	candidates.dedup();
+	assert_eq!(candidates.len(), candidate_count);
+
+	let mut voters = Vec::with_capacity(voter_count);
+	for _ in 0..voter_count {
+		let mut id = rng.gen();
+		// insert returns `false` when the value was already present
+		while !used_ids.insert(id) {
+			id = rng.gen();
+		}
+
+		let vote_weight = rng.gen();
+
+		// it's not interesting if a voter chooses 0 or all candidates, so rule those cases out.
+		let n_candidates_chosen = rng.gen_range(1, candidates.len());
+
+		let mut chosen_candidates = Vec::with_capacity(n_candidates_chosen);
+		chosen_candidates.extend(candidates.choose_multiple(&mut rng, n_candidates_chosen));
+		chosen_candidates.sort();
+		voters.push((id, vote_weight, chosen_candidates));
+	}
+
+	voters.sort_unstable();
+	voters.dedup_by_key(|(id, _weight, _chosen_candidates)| *id);
+	assert_eq!(voters.len(), voter_count);
+
+	(rounds, candidates, voters)
+}
+
 pub fn generate_random_npos_result(
 	voter_count: u64,
 	target_count: u64,
@@ -71,40 +141,41 @@ pub fn generate_random_npos_result(
 	});
 
 	let mut voters = Vec::with_capacity(voter_count as usize);
-	(prefix ..= (prefix + voter_count)).for_each(|acc| {
+	(prefix..=(prefix + voter_count)).for_each(|acc| {
 		let edge_per_this_voter = rng.gen_range(1, candidates.len());
 		// all possible targets
 		let mut all_targets = candidates.clone();
 		// we remove and pop into `targets` `edge_per_this_voter` times.
-		let targets = (0..edge_per_this_voter).map(|_| {
-			let upper = all_targets.len() - 1;
-			let idx = rng.gen_range(0, upper);
-			all_targets.remove(idx)
-		})
-		.collect::<Vec<AccountId>>();
-
-		let stake_var = rng.gen_range(ed, 100 * ed) ;
+		let targets = (0..edge_per_this_voter)
+			.map(|_| {
+				let upper = all_targets.len() - 1;
+				let idx = rng.gen_range(0, upper);
+				all_targets.remove(idx)
+			})
+			.collect::<Vec<AccountId>>();
+
+		let stake_var = rng.gen_range(ed, 100 * ed);
 		let stake = base_stake + stake_var;
 		stake_of.insert(acc, stake);
 		voters.push((acc, stake, targets));
 	});
 
 	(
 		match election_type {
-			ElectionType::Phragmen(conf) =>
-				seq_phragmen::<AccountId, sp_runtime::Perbill>(
-					to_elect,
-					candidates.clone(),
-					voters.clone(),
-					conf,
-				).unwrap(),
-			ElectionType::Phragmms(conf) =>
-				phragmms::<AccountId, sp_runtime::Perbill>(
-					to_elect,
-					candidates.clone(),
-					voters.clone(),
-					conf,
-				).unwrap(),
+			ElectionType::Phragmen(conf) => seq_phragmen::<AccountId, sp_runtime::Perbill>(
+				to_elect,
+				candidates.clone(),
+				voters.clone(),
+				conf,
+			)
+			.unwrap(),
+			ElectionType::Phragmms(conf) => phragmms::<AccountId, sp_runtime::Perbill>(
+				to_elect,
+				candidates.clone(),
+				voters.clone(),
+				conf,
+			)
+			.unwrap(),
 		},
 		candidates,
 		voters,
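
The id generation in `generate_random_npos_inputs` relies on `HashSet::insert` returning `false` for a value that is already present, and redraws until the id is fresh. A minimal standalone sketch of that pattern follows; `XorShift64` is a hypothetical stand-in for `rand::Rng` so the example needs only the standard library, and `unique_ids` is an illustrative helper rather than a function from the fuzzer.

```rust
use std::collections::HashSet;

/// Tiny xorshift64 generator; a stand-in for `rand::Rng`, not part of the fuzzer.
struct XorShift64(u64);

impl XorShift64 {
    /// Force the state to be non-zero, since xorshift never leaves zero.
    fn new(seed: u64) -> Self {
        XorShift64(seed | 1)
    }

    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Draw `count` ids that collide neither with each other nor with anything in `used`.
fn unique_ids(count: usize, used: &mut HashSet<u64>, rng: &mut XorShift64) -> Vec<u64> {
    let mut ids = Vec::with_capacity(count);
    while ids.len() < count {
        let id = rng.next();
        // `insert` returns `false` when the value was already present, so redraw.
        if used.insert(id) {
            ids.push(id);
        }
    }
    // Sorted output allows binary search by account id.
    ids.sort_unstable();
    ids
}
```

Sharing one `used` set between the candidate and voter draws keeps the two id spaces disjoint, so the `dedup` and `assert_eq!` calls in the diff act as sanity checks rather than doing real work.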

diff --git a/primitives/npos-elections/fuzzer/src/phragmen_pjr.rs b/primitives/npos-elections/fuzzer/src/phragmen_pjr.rs
@@ -0,0 +1,118 @@
+// This file is part of Substrate.
+
+// Copyright (C) 2020-2021 Parity Technologies (UK) Ltd.
+// SPDX-License-Identifier: Apache-2.0
+
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// 	http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+//! Fuzzing which ensures that running unbalanced sequential phragmen always produces a result
+//! which satisfies our PJR checker.
+//!
+//! ## Running a single iteration
+//!
+//! Honggfuzz shuts down each individual loop iteration after a configurable time limit.
+//! It can be helpful to run a single iteration on your hardware to help benchmark how long that time
+//! limit should reasonably be. Simply run the program without the `fuzzing` configuration to run a
+//! single iteration: `cargo run --bin phragmen_pjr`.
+//!
+//! ## Running
+//!
+//! Run with `HFUZZ_RUN_ARGS="-t 10" cargo hfuzz run phragmen_pjr`.
+//!
+//! Note the environment variable: by default, `cargo hfuzz` shuts down each iteration after 1 second
+//! of runtime. We significantly increase that to ensure that the fuzzing gets a chance to complete.
+//! Running a single iteration can help determine an appropriate value for this parameter.
+//!
+//! ## Debugging a panic
+//!
+//! Once a panic is found, it can be debugged with
+//! `HFUZZ_RUN_ARGS="-t 10" cargo hfuzz run-debug phragmen_pjr hfuzz_workspace/phragmen_pjr/*.fuzz`.
+//!
+
+#[cfg(fuzzing)]
+use honggfuzz::fuzz;
+
+#[cfg(not(fuzzing))]
+use structopt::StructOpt;
+
+mod common;
+use common::{generate_random_npos_inputs, to_range};
+use rand::{self, SeedableRng};
+use sp_npos_elections::{pjr_check_core, seq_phragmen_core, setup_inputs, standard_threshold};
+
+type AccountId = u64;
+
+const MIN_CANDIDATES: usize = 250;
+const MAX_CANDIDATES: usize = 1000;
+const MIN_VOTERS: usize = 500;
+const MAX_VOTERS: usize = 2500;
+
+#[cfg(fuzzing)]
+fn main() {
+	loop {
+		fuzz!(|data: (usize, usize, u64)| {
+			let (candidate_count, voter_count, seed) = data;
+			iteration(candidate_count, voter_count, seed);
+		});
+	}
+}
+
+#[cfg(not(fuzzing))]
+#[derive(Debug, StructOpt)]
+struct Opt {
+	/// How many candidates participate in this election
+	#[structopt(short, long)]
+	candidates: Option<usize>,
+
+	/// How many voters participate in this election
+	#[structopt(short, long)]
+	voters: Option<usize>,
+
+	/// Random seed to use in this election
+	#[structopt(long)]
+	seed: Option<u64>,
+}
+
+#[cfg(not(fuzzing))]
+fn main() {
+	let opt = Opt::from_args();
+	// candidates and voters by default use the maxima, which turn out to be one less than
+	// the constant.
+	iteration(
+		opt.candidates.unwrap_or(MAX_CANDIDATES - 1),
+		opt.voters.unwrap_or(MAX_VOTERS - 1),
+		opt.seed.unwrap_or_default(),
+	);
+}
+
+fn iteration(mut candidate_count: usize, mut voter_count: usize, seed: u64) {
+	let rng = rand::rngs::SmallRng::seed_from_u64(seed);
+	candidate_count = to_range(candidate_count, MIN_CANDIDATES, MAX_CANDIDATES);
+	voter_count = to_range(voter_count, MIN_VOTERS, MAX_VOTERS);
+
+	let (rounds, candidates, voters) =
+		generate_random_npos_inputs(candidate_count, voter_count, rng);
+
+	let (candidates, voters) = setup_inputs(candidates, voters);
+
+	// Run seq-phragmen
+	let (candidates, voters) = seq_phragmen_core::<AccountId>(rounds, candidates, voters)
+		.expect("seq_phragmen must succeed");
+
+	let threshold = standard_threshold(rounds, voters.iter().map(|voter| voter.budget()));
+
+	assert!(
+		pjr_check_core(&candidates, &voters, threshold),
+		"unbalanced sequential phragmen must satisfy PJR",
+	);
+}
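
Two pieces of arithmetic in `iteration` are only visible here as calls: folding the fuzzer's raw counts into a range, and deriving the PJR threshold from the voters' budgets. The sketch below is an assumption about how each could work, not the code in `common.rs` or `sp-npos-elections`. `fold_into_range` uses one folding rule consistent with the comment that the default maxima are one less than the constants. `threshold_sketch` assumes the threshold is the total budget divided by the committee size, computed so that it can neither overflow nor divide by zero, as the commit message requires of `standard_threshold`.

```rust
/// Assumed folding rule for a `to_range`-style helper: map `x` into `[a, b)`.
/// Requires `b >= 2 * a` so that the shifted branch stays below `b`.
fn fold_into_range(x: usize, a: usize, b: usize) -> usize {
    assert!(b >= 2 * a, "folding rule needs b >= 2 * a");
    let collapsed = x % b;
    if collapsed >= a {
        collapsed
    } else {
        collapsed + a
    }
}

/// Assumed threshold rule: total voter budget divided by committee size.
/// Saturating addition avoids overflow; `max(1)` avoids division by zero.
fn threshold_sketch(committee_size: usize, budgets: impl IntoIterator<Item = u128>) -> u128 {
    let total = budgets
        .into_iter()
        .fold(0u128, |acc, budget| acc.saturating_add(budget));
    total / (committee_size.max(1) as u128)
}
```

Under this folding rule, `MAX_CANDIDATES - 1` and `MAX_VOTERS - 1` map to themselves, which matches the defaults passed in the non-fuzzing `main`.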