PVF: Add test instructions (#2058)

paritytech · Nov 28, 2023 · dbd8d20 · dbd8d20
1 parent cd8741c
commit dbd8d20
Show file tree

Hide file tree

Showing 18 changed files with 146 additions and 60 deletions.
diff --git a/polkadot/doc/testing.md b/polkadot/doc/testing.md
@@ -1,34 +1,65 @@
 # Testing
 
-Automated testing is an essential tool to assure correctness.
+Testing is an essential tool to assure correctness. This document describes how we test the Polkadot code, whether
+locally, at scale, and/or automatically in CI.
 
 ## Scopes
 
 The testing strategy for Polkadot is 4-fold:
 
 ### Unit testing (1)
 
-Boring, small scale correctness tests of individual functions.
+Boring, small scale correctness tests of individual functions. It is usually
+enough to run `cargo test` in the crate you are testing.
+
+For full coverage you may have to pass some additional features. For example:
+
+```sh
+cargo test --features ci-only-tests
+```
 
 ### Integration tests
 
-There are two variants of integration tests:
+There are the following variants of integration tests:
 
 #### Subsystem tests (2)
 
 One particular subsystem (subsystem under test) interacts with a mocked overseer that is made to assert incoming and
-outgoing messages of the subsystem under test. This is largely present today, but has some fragmentation in the evolved
-integration test implementation. A `proc-macro`/`macro_rules` would allow for more consistent implementation and
-structure.
+outgoing messages of the subsystem under test. See e.g. the `statement-distribution` tests.
 
 #### Behavior tests (3)
 
-Launching small scale networks, with multiple adversarial nodes without any further tooling required. This should
-include tests around the thresholds in order to evaluate the error handling once certain assumed invariants fail.
+Launching small scale networks, with multiple adversarial nodes. This should include tests around the thresholds in
+order to evaluate the error handling once certain assumed invariants fail.
+
+Currently, we commonly use **zombienet** to run mini test-networks, whether locally or in CI. To run on your machine:
+
+- First, make sure you have [zombienet][zombienet] installed.
+
+- Now, all the required binaries must be installed in your $PATH. You must run the following from the `polkadot/`
+directory in order to test your changes. (Not `zombienet setup`, or you will get the released binaries without your
+local changes!)
 
-For this purpose based on `AllSubsystems` and `proc-macro` `AllSubsystemsGen`.
+```sh
+cargo install --path . --locked
+```
+
+- You will also need to install whatever binaries are required for your specific tests. For example, to install
+`undying-collator`, from `polkadot/`, run:
+
+```sh
+cargo install --path ./parachain/test-parachains/undying/collator --locked
+```
 
-This assumes a simplistic test runtime.
+- Finally, run the zombienet test from the `polkadot` directory:
+
+```sh
+RUST_LOG=parachain::pvf=trace zombienet --provider=native spawn zombienet_tests/functional/0001-parachains-pvf.toml
+```
+
+- You can pick a validator node like `alice` from the output and view its logs
+(`tail -f <log_file>`) or metrics. Make sure there is nothing funny in the logs
+(try `grep WARN <log_file>`).
 
 #### Testing at scale (4)
 
@@ -41,13 +72,27 @@ addition prometheus avoiding additional Polkadot source changes.
 _Behavior tests_ and _testing at scale_ have naturally soft boundary. The most significant difference is the presence of
 a real network and the number of nodes, since a single host often not capable to run multiple nodes at once.
 
----
+## Observing Logs
+
+To verify expected behavior it's often useful to observe logs. To avoid too many
+logs at once, you can run one test at a time:
+
+1. Add `sp_tracing::try_init_simple();` to the beginning of a test
+2. Specify `RUST_LOG=<target>::<subtarget>=trace` before the cargo command.
+
+For example:
+
+```sh
+RUST_LOG=parachain::pvf=trace cargo test execute_can_run_serially
+```
+
+For more info on how our logs work, check [the docs][logs].
 
 ## Coverage
 
 Coverage gives a _hint_ of the actually covered source lines by tests and test applications.
 
-The state of the art is currently [tarpaulin][tarpaulin] which unfortunately yields a lot of false negatives. Lines that
+The state of the art is currently tarpaulin which unfortunately yields a lot of false negatives. Lines that
 are in fact covered, marked as uncovered due to a mere linebreak in a statement can cause these artifacts. This leads to
 lower coverage percentages than there actually is.
 
@@ -102,7 +147,7 @@ Fuzzing is an approach to verify correctness against arbitrary or partially stru
 
 Currently implemented fuzzing targets:
 
-* `erasure-coding`
+- `erasure-coding`
 
 The tooling of choice here is `honggfuzz-rs` as it allows _fastest_ coverage according to "some paper" which is a
 positive feature when run as part of PRs.
@@ -113,16 +158,16 @@ hence simply not feasible due to the amount of state that is required.
 
 Other candidates to implement fuzzing are:
 
-* `rpc`
-* ...
+- `rpc`
+- ...
 
 ## Performance metrics
 
 There are various ways of performance metrics.
 
-* timing with `criterion`
-* cache hits/misses w/ `iai` harness or `criterion-perf`
-* `coz` a performance based compiler
+- timing with `criterion`
+- cache hits/misses w/ `iai` harness or `criterion-perf`
+- `coz` a performance based compiler
 
 Most of them are standard tools to aid in the creation of statistical tests regarding change in time of certain unit
 tests.
@@ -140,10 +185,10 @@ pursued at the current time.
 
 Requirements:
 
-* spawn nodes with preconfigured behaviors
-* allow multiple types of configuration to be specified
-* allow extendability via external crates
-* ...
+- spawn nodes with preconfigured behaviors
+- allow multiple types of configuration to be specified
+- allow extendability via external crates
+- ...
 
 ---
 
@@ -251,5 +296,7 @@ behavior_testcase!{
 }
 ```
 
+[zombienet]: https://github.com/paritytech/zombienet
 [Gurke]: https://github.com/paritytech/gurke
 [simnet]: https://github.com/paritytech/simnet_scripts
+[logs]: https://github.com/paritytech/polkadot-sdk/blob/master/polkadot/node/gum/src/lib.rs
diff --git a/polkadot/node/core/pvf/README.md b/polkadot/node/core/pvf/README.md
@@ -0,0 +1,47 @@
+# PVF Host
+
+This is the PVF host, responsible for responding to requests from Candidate
+Validation and spawning worker tasks to fulfill those requests.
+
+See also:
+
+- for more information: [the Implementer's Guide][impl-guide]
+- for an explanation of terminology: [the Glossary][glossary]
+
+## Running basic tests
+
+Running `cargo test` in the `pvf/` directory will run unit and integration
+tests.
+
+**Note:** some tests run only under Linux, amd64, and/or with the
+`ci-only-tests` feature enabled.
+
+See the general [Testing][testing] instructions for more information on
+**running tests** and **observing logs**.
+
+## Running a test-network with zombienet
+
+Since this crate is consensus-critical, for major changes it is highly
+recommended to run a test-network. See the "Behavior tests" section of the
+[Testing][testing] docs for full instructions.
+
+To run the PVF-specific zombienet test:
+
+```sh
+RUST_LOG=parachain::pvf=trace zombienet --provider=native spawn zombienet_tests/functional/0001-parachains-pvf.toml
+```
+
+## Testing on Linux
+
+Some of the PVF functionality, especially related to security, is Linux-only,
+and some is amd64-only. If you touch anything security-related, make sure to
+test on Linux amd64! If you're on a Mac, you can either run a VM or you can hire
+a VPS and use the open-source tool [EternalTerminal][et] to connect to it.[^et]
+
+[^et]: Unlike ssh, ET preserves your session across disconnects, and unlike
+another popular persistent shell, mosh, it allows scrollback.
+
+[impl-guide]: https://paritytech.github.io/polkadot-sdk/book/pvf-prechecking.html#summary
+[glossary]: https://paritytech.github.io/polkadot-sdk/book/glossary.html
+[testing]: https://github.com/paritytech/polkadot-sdk/blob/master/polkadot/doc/testing.md
+[et]: https://github.com/MisterTea/EternalTerminal
diff --git a/...node/core/pvf/common/src/executor_intf.rs → ...core/pvf/common/src/executor_interface.rs b/...node/core/pvf/common/src/executor_intf.rs → ...core/pvf/common/src/executor_interface.rs
@@ -175,11 +175,9 @@ pub fn params_to_wasmtime_semantics(par: &ExecutorParams) -> Semantics {
 
 /// Runs the prevalidation on the given code. Returns a [`RuntimeBlob`] if it succeeds.
 pub fn prevalidate(code: &[u8]) -> Result<RuntimeBlob, sc_executor_common::error::WasmError> {
+ // Construct the runtime blob and do some basic checks for consistency.
  let blob = RuntimeBlob::new(code)?;
- // It's assumed this function will take care of any prevalidation logic
- // that needs to be done.
- //
- // Do nothing for now.
+ // In the future this function should take care of any further prevalidation logic.
  Ok(blob)
 }
 

diff --git a/polkadot/node/core/pvf/common/src/lib.rs b/polkadot/node/core/pvf/common/src/lib.rs
@@ -18,7 +18,7 @@
 
 pub mod error;
 pub mod execute;
-pub mod executor_intf;
+pub mod executor_interface;
 pub mod prepare;
 pub mod pvf;
 pub mod worker;

diff --git a/polkadot/node/core/pvf/execute-worker/src/lib.rs b/polkadot/node/core/pvf/execute-worker/src/lib.rs
@@ -17,7 +17,7 @@
 //! Contains the logic for executing PVFs. Used by the polkadot-execute-worker binary.
 
 pub use polkadot_node_core_pvf_common::{
- executor_intf::execute_artifact, worker_dir, SecurityStatus,
+ executor_interface::execute_artifact, worker_dir, SecurityStatus,
 };
 
 // NOTE: Initializing logging in e.g. tests will not have an effect in the workers, as they are
@@ -236,7 +236,7 @@ fn validate_using_artifact(
  let descriptor_bytes = match unsafe {
  // SAFETY: this should be safe since the compiled artifact passed here comes from the
  // file created by the prepare workers. These files are obtained by calling
- // [`executor_intf::prepare`].
+ // [`executor_interface::prepare`].
  execute_artifact(compiled_artifact_blob, executor_params, params)
  } {
  Err(err) => return JobResponse::format_invalid("execute", &err),

diff --git a/polkadot/node/core/pvf/prepare-worker/benches/prepare_rococo_runtime.rs b/polkadot/node/core/pvf/prepare-worker/benches/prepare_rococo_runtime.rs
@@ -16,7 +16,7 @@
 
 use criterion::{criterion_group, criterion_main, Criterion, SamplingMode};
 use polkadot_node_core_pvf_common::{
- executor_intf::{prepare, prevalidate},
+ executor_interface::{prepare, prevalidate},
  prepare::PrepareJobKind,
  pvf::PvfPrepData,
 };

diff --git a/polkadot/node/core/pvf/prepare-worker/src/lib.rs b/polkadot/node/core/pvf/prepare-worker/src/lib.rs
@@ -18,7 +18,7 @@
 
 mod memory_stats;
 
-use polkadot_node_core_pvf_common::executor_intf::{prepare, prevalidate};
+use polkadot_node_core_pvf_common::executor_interface::{prepare, prevalidate};
 
 // NOTE: Initializing logging in e.g. tests will not have an effect in the workers, as they are
 // separate spawned processes. Run with e.g. `RUST_LOG=parachain::pvf-prepare-worker=trace`.
@@ -41,7 +41,7 @@ use os_pipe::{self, PipeReader, PipeWriter};
 use parity_scale_codec::{Decode, Encode};
 use polkadot_node_core_pvf_common::{
  error::{PrepareError, PrepareWorkerResult},
- executor_intf::create_runtime_from_artifact_bytes,
+ executor_interface::create_runtime_from_artifact_bytes,
  framed_recv_blocking, framed_send_blocking,
  prepare::{MemoryStats, PrepareJobKind, PrepareStats, PrepareWorkerSuccess},
  pvf::PvfPrepData,

diff --git a/polkadot/node/core/pvf/src/execute/mod.rs b/polkadot/node/core/pvf/src/execute/mod.rs
@@ -21,6 +21,6 @@
 //! `polkadot_node_core_pvf_worker::execute_worker_entrypoint`.
 
 mod queue;
-mod worker_intf;
+mod worker_interface;
 
 pub use queue::{start, PendingExecutionRequest, ToQueue};
diff --git a/polkadot/node/core/pvf/src/execute/queue.rs b/polkadot/node/core/pvf/src/execute/queue.rs
@@ -16,12 +16,12 @@
 
 //! A queue that handles requests for PVF execution.
 
-use super::worker_intf::Outcome;
+use super::worker_interface::Outcome;
 use crate::{
  artifacts::{ArtifactId, ArtifactPathId},
  host::ResultSender,
  metrics::Metrics,
- worker_intf::{IdleWorker, WorkerHandle},
+ worker_interface::{IdleWorker, WorkerHandle},
  InvalidCandidate, PossiblyInvalidError, ValidationError, LOG_TARGET,
 };
 use futures::{
@@ -448,7 +448,7 @@ async fn spawn_worker_task(
  use futures_timer::Delay;
 
  loop {
- match super::worker_intf::spawn(
+ match super::worker_interface::spawn(
  &program_path,
  &cache_path,
  job.executor_params.clone(),
@@ -500,7 +500,7 @@ fn assign(queue: &mut Queue, worker: Worker, job: ExecuteJob) {
  queue.mux.push(
  async move {
  let _timer = execution_timer;
- let outcome = super::worker_intf::start_work(
+ let outcome = super::worker_interface::start_work(
  idle,
  job.artifact.clone(),
  job.exec_timeout,

diff --git a/.../node/core/pvf/src/execute/worker_intf.rs → .../core/pvf/src/execute/worker_interface.rs b/.../node/core/pvf/src/execute/worker_intf.rs → .../core/pvf/src/execute/worker_interface.rs
@@ -18,7 +18,7 @@
 
 use crate::{
  artifacts::ArtifactPathId,
- worker_intf::{
+ worker_interface::{
  clear_worker_dir_path, framed_recv, framed_send, spawn_with_program_path, IdleWorker,
  SpawnErr, WorkerDir, WorkerHandle, JOB_TIMEOUT_WALL_CLOCK_FACTOR,
  },

diff --git a/polkadot/node/core/pvf/src/lib.rs b/polkadot/node/core/pvf/src/lib.rs
@@ -98,7 +98,7 @@ mod metrics;
 mod prepare;
 mod priority;
 mod security;
-mod worker_intf;
+mod worker_interface;
 
 #[cfg(feature = "test-utils")]
 pub mod testing;
@@ -107,7 +107,7 @@ pub use error::{InvalidCandidate, PossiblyInvalidError, ValidationError};
 pub use host::{start, Config, ValidationHost, EXECUTE_BINARY_NAME, PREPARE_BINARY_NAME};
 pub use metrics::Metrics;
 pub use priority::Priority;
-pub use worker_intf::{framed_recv, framed_send, JOB_TIMEOUT_WALL_CLOCK_FACTOR};
+pub use worker_interface::{framed_recv, framed_send, JOB_TIMEOUT_WALL_CLOCK_FACTOR};
 
 // Re-export some common types.
 pub use polkadot_node_core_pvf_common::{

diff --git a/polkadot/node/core/pvf/src/prepare/mod.rs b/polkadot/node/core/pvf/src/prepare/mod.rs
@@ -24,7 +24,7 @@
 
 mod pool;
 mod queue;
-mod worker_intf;
+mod worker_interface;
 
 pub use pool::start as start_pool;
 pub use queue::{start as start_queue, FromQueue, ToQueue};
diff --git a/polkadot/node/core/pvf/src/prepare/pool.rs b/polkadot/node/core/pvf/src/prepare/pool.rs
@@ -14,10 +14,10 @@
 // You should have received a copy of the GNU General Public License
 // along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
 
-use super::worker_intf::{self, Outcome};
+use super::worker_interface::{self, Outcome};
 use crate::{
  metrics::Metrics,
- worker_intf::{IdleWorker, WorkerHandle},
+ worker_interface::{IdleWorker, WorkerHandle},
  LOG_TARGET,
 };
 use always_assert::never;
@@ -278,7 +278,7 @@ async fn spawn_worker_task(
  use futures_timer::Delay;
 
  loop {
- match worker_intf::spawn(
+ match worker_interface::spawn(
  &program_path,
  &cache_path,
  spawn_timeout,
@@ -306,7 +306,7 @@ async fn start_work_task<Timer>(
  cache_path: PathBuf,
  _preparation_timer: Option<Timer>,
 ) -> PoolEvent {
- let outcome = worker_intf::start_work(&metrics, idle, pvf, cache_path).await;
+ let outcome = worker_interface::start_work(&metrics, idle, pvf, cache_path).await;
  PoolEvent::StartWork(worker, outcome)
 }
 

diff --git a/.../node/core/pvf/src/prepare/worker_intf.rs → .../core/pvf/src/prepare/worker_interface.rs b/.../node/core/pvf/src/prepare/worker_intf.rs → .../core/pvf/src/prepare/worker_interface.rs
@@ -19,7 +19,7 @@
 use crate::{
  artifacts::ArtifactId,
  metrics::Metrics,
- worker_intf::{
+ worker_interface::{
  clear_worker_dir_path, framed_recv, framed_send, spawn_with_program_path, IdleWorker,
  SpawnErr, WorkerDir, WorkerHandle, JOB_TIMEOUT_WALL_CLOCK_FACTOR,
  },