Add infrastructure for writing unit tests on state machine components #1664

hdevalence · 2022-11-25T21:26:24Z

Is your feature request related to a problem? Please describe.

We want to be able to write unit tests that check parts of the state machine in isolation. For instance, we'd like to be able to check individual consensus rules (does this consensus rule reject this input?), or to be able to check pre/post conditions for specific parts of the state machine (do we maintain the desired invariants after processing this input? do we write the data we expect?).

Currently, we don't have any meaningful scaffolding for writing these kinds of tests, which means that we rely solely on end-to-end integration tests. But while end-to-end integration tests are useful to have, they're more complicated and expensive to write and run, and shouldn't be the only option for writing tests. We should aim that the majority of the test suite is runnable by cargo test, so that most assurance can be accessed through existing Rust tooling. Not only is this a simpler and easier development workflow -- it only requires cargo, and not other tooling -- it can also run many tests in parallel, and faster.

As a prototypical example, consider #1631, which added a test to check the consensus rule that validator definitions with duplicate consensus keys are rejected. This is a good test to have, but we don't get any assurance benefit out of having it be a full, end-to-end network integration test that:

runs pcli in a subprocess
makes network requests to a running pd instance
parses the JSON output of pcli
edits a new validator definition
assembles it into a full, valid transaction
sends that transaction over the network to a running tendermint instance
has that transaction sent over ABCI to pd
has the mempool check every other consensus rule on the transaction
has the mempool check the desired consensus rule (if it didn't already reject it)
checks that the pcli subprocess that sent the transaction errored out

instead of just checking the specific consensus rule we care about directly. Not only is this process more expensive, if what we care about is checking that duplicate consensus keys are rejected, it's actually worse for assurance than a unit test, because we don't know that the reason the transaction was rejected was the one we're trying to test, rather than some other reason.

This isn't meant as a criticism of #1631, to be clear, and doing it as an integration test was my suggestion in #1210; the point is that at the moment, we don't have any other alternative structure to use to write this kind of test, but now that we've completed #1454, we can add one. This also isn't meant as a claim that we shouldn't have integration tests, either. Doing the full, end-to-end sequence of steps is important to ensure that none of the plumbing connecting the different parts of the system is broken. But the goal of the network integration tests should be to test the plumbing, not all of the nooks and crannies of the state machine.

So, how should we test the state machine?

First, let's consider how it's actually structured. The Penumbra application itself is driven by Tendermint via ABCI, and (should) have its state fully and completely determined only by the ABCI messages it receives. This has important implications for testing, because we can deterministically drive the state of the application in Rust test code just by sending a sequence of ABCI messages, without having to run Tendermint at all.

ABCI messages are sent to the penumbra_component::App, which handles InitChain, BeginBlock, and EndBlock messages by calling the methods on each component, and handles DeliverTx messages by calling the ActionHandler implementation for Transaction, which performs transaction-wide checks before dispatching to the ActionHandler implementation for each action in the transaction.

At this point, it's useful to distinguish between two types of state machine tests:

state machine integration tests, testing top-level behavior of all component logic working together;
state machine unit tests, testing individual parts of the state machine.

By passing a sequence of ABCI messages to the application, we can perform state machine integration tests (and we can potentially do so much more efficiently inside Rust test code than by having Tendermint generate and send them), but this doesn't help with the goal of being able to write state machine unit tests, as in the prototypical example in #1631. The reason is that the top-down processing will check all consensus rules together, not specific ones in isolation.

Because our state machine is written as a series of traits that compose additional behavior onto StateRead/StateWrite to ensure that all execution can be done transactionally, while we can exercise different parts of the state machine in isolation, we'll always be doing so on a State instance, as that's the only sense in which any part of the state machine can be instantiated.

One example of a state machine unit test in the existing codebase is in the IBC code's test_create_and_update_light_client. This test exercises the ICS2 client handshake implementation, by:

creates a tempdir to use as a backing store;
initializes a new Storage instance using it;
manually writes two pieces of data into the Storage that would be written as part of a full InitChain execution;
calls Ics2Client::init_chain to do the ICS2-specific state initialization;
constructs an invalid dummy transaction with IBC actions with handshake messages pulled from the cosmos hub;
pulls the dummy transaction apart to check and execute just those actions (to avoid other transaction checks);
checks that the execution succeeds.

Ideally, we'd be able to simplify this process by factoring out the setup code into common test code, with a helper method that:

creates a tempdir to use as a backing store;
initializes a new Storage instance using it;
performs full InitChain execution using default (or provided) genesis state;
hands the correctly-initialized test Storage to the caller.

We want to perform full initialization on the test storage, because our implementation assumes that the state was always correctly initialized, and we don't want our test code to violate that assumption.

Once we have a correctly-initialized test storage, we can use it for either state machine unit tests or state machine integration tests.

Describe the solution you'd like

To start, we should try to factor out the setup code from the ICS2 client unit test, and use that example as a way to check the ergonomics of the API.

The text was updated successfully, but these errors were encountered:

hdevalence · 2022-11-27T19:32:01Z

A further thought: why do we need to create a full Storage instance in a tempdir? Since all our code is written in terms of StateRead+StateWrite, we could have a TestState + TestStateTransaction backed by a BTreeMap for purely in-memory testing.

hdevalence · 2022-11-28T19:59:27Z

xref #853 , which we created a while ago and then forgot about.

Work towards #1664 Closes #853 This commit cleans up the IBC client test as an example case of the new API. Along the way, we also clean up some other ergonomic pain points, notably adding an `Arc<State>::try_begin_transaction()` method. Co-Authored-By: Conor Schaefer <conor@penumbra.zone>

Work towards #1664 Closes #853 This commit cleans up the IBC client test as an example case of the new API. Along the way, we also clean up some other ergonomic pain points, notably adding an `Arc<State>::try_begin_transaction()` method. Co-Authored-By: "Conor Schaefer" <conor@penumbra.zone>

redshiftzero · 2022-12-09T17:21:52Z

This was done and two new tests were added in #1675, closing

hdevalence mentioned this issue Nov 28, 2022

Create a Storage::ephemeral_test_instance() method #853

Closed

hdevalence mentioned this issue Nov 28, 2022

Component-based test structuring #591

Closed

hdevalence mentioned this issue Nov 29, 2022

component: better test infrastructure #1675

Merged

hdevalence added this to Testnets Dec 2, 2022

conorsch mentioned this issue Dec 2, 2022

Add tombstoning to smoke test #1209

Closed

redshiftzero closed this as completed Dec 9, 2022

redshiftzero moved this to Testnet 32: Chaldene in Testnets Dec 9, 2022

redshiftzero moved this from Testnet 32: Chaldene to Testnet 38: Kalyke in Testnets Dec 9, 2022

This was referenced Jan 30, 2024

Tracking issue: 🤝 mock consensus engine for App tests #3588

Closed

app: 🥞 dynamic application Components #3740

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add infrastructure for writing unit tests on state machine components #1664

Add infrastructure for writing unit tests on state machine components #1664

hdevalence commented Nov 25, 2022 •

edited

Loading

hdevalence commented Nov 27, 2022

hdevalence commented Nov 28, 2022

redshiftzero commented Dec 9, 2022

Add infrastructure for writing unit tests on state machine components #1664

Add infrastructure for writing unit tests on state machine components #1664

Comments

hdevalence commented Nov 25, 2022 • edited Loading

hdevalence commented Nov 27, 2022

hdevalence commented Nov 28, 2022

redshiftzero commented Dec 9, 2022

hdevalence commented Nov 25, 2022 •

edited

Loading