-
Notifications
You must be signed in to change notification settings - Fork 316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add infrastructure for writing unit tests on state machine components #1664
Comments
A further thought: why do we need to create a full Storage instance in a tempdir? Since all our code is written in terms of StateRead+StateWrite, we could have a TestState + TestStateTransaction backed by a BTreeMap for purely in-memory testing. |
xref #853 , which we created a while ago and then forgot about. |
This was done and two new tests were added in #1675, closing |
This was referenced Jan 30, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Is your feature request related to a problem? Please describe.
We want to be able to write unit tests that check parts of the state machine in isolation. For instance, we'd like to be able to check individual consensus rules (does this consensus rule reject this input?), or to be able to check pre/post conditions for specific parts of the state machine (do we maintain the desired invariants after processing this input? do we write the data we expect?).
Currently, we don't have any meaningful scaffolding for writing these kinds of tests, which means that we rely solely on end-to-end integration tests. But while end-to-end integration tests are useful to have, they're more complicated and expensive to write and run, and shouldn't be the only option for writing tests. We should aim that the majority of the test suite is runnable by
cargo test
, so that most assurance can be accessed through existing Rust tooling. Not only is this a simpler and easier development workflow -- it only requirescargo
, and not other tooling -- it can also run many tests in parallel, and faster.As a prototypical example, consider #1631, which added a test to check the consensus rule that validator definitions with duplicate consensus keys are rejected. This is a good test to have, but we don't get any assurance benefit out of having it be a full, end-to-end network integration test that:
pcli
in a subprocesspd
instancepcli
tendermint
instancepd
pcli
subprocess that sent the transaction errored outinstead of just checking the specific consensus rule we care about directly. Not only is this process more expensive, if what we care about is checking that duplicate consensus keys are rejected, it's actually worse for assurance than a unit test, because we don't know that the reason the transaction was rejected was the one we're trying to test, rather than some other reason.
This isn't meant as a criticism of #1631, to be clear, and doing it as an integration test was my suggestion in #1210; the point is that at the moment, we don't have any other alternative structure to use to write this kind of test, but now that we've completed #1454, we can add one. This also isn't meant as a claim that we shouldn't have integration tests, either. Doing the full, end-to-end sequence of steps is important to ensure that none of the plumbing connecting the different parts of the system is broken. But the goal of the network integration tests should be to test the plumbing, not all of the nooks and crannies of the state machine.
So, how should we test the state machine?
First, let's consider how it's actually structured. The Penumbra application itself is driven by Tendermint via ABCI, and (should) have its state fully and completely determined only by the ABCI messages it receives. This has important implications for testing, because we can deterministically drive the state of the application in Rust test code just by sending a sequence of ABCI messages, without having to run Tendermint at all.
ABCI messages are sent to the
penumbra_component::App
, which handlesInitChain
,BeginBlock
, andEndBlock
messages by calling the methods on each component, and handlesDeliverTx
messages by calling theActionHandler
implementation forTransaction
, which performs transaction-wide checks before dispatching to theActionHandler
implementation for each action in the transaction.At this point, it's useful to distinguish between two types of state machine tests:
By passing a sequence of ABCI messages to the application, we can perform state machine integration tests (and we can potentially do so much more efficiently inside Rust test code than by having Tendermint generate and send them), but this doesn't help with the goal of being able to write state machine unit tests, as in the prototypical example in #1631. The reason is that the top-down processing will check all consensus rules together, not specific ones in isolation.
Because our state machine is written as a series of traits that compose additional behavior onto
StateRead
/StateWrite
to ensure that all execution can be done transactionally, while we can exercise different parts of the state machine in isolation, we'll always be doing so on aState
instance, as that's the only sense in which any part of the state machine can be instantiated.One example of a state machine unit test in the existing codebase is in the IBC code's
test_create_and_update_light_client
. This test exercises the ICS2 client handshake implementation, by:Storage
instance using it;Storage
that would be written as part of a fullInitChain
execution;Ics2Client::init_chain
to do the ICS2-specific state initialization;Ideally, we'd be able to simplify this process by factoring out the setup code into common test code, with a helper method that:
Storage
instance using it;InitChain
execution using default (or provided) genesis state;Storage
to the caller.We want to perform full initialization on the test storage, because our implementation assumes that the state was always correctly initialized, and we don't want our test code to violate that assumption.
Once we have a correctly-initialized test storage, we can use it for either state machine unit tests or state machine integration tests.
Describe the solution you'd like
To start, we should try to factor out the setup code from the ICS2 client unit test, and use that example as a way to check the ergonomics of the API.
The text was updated successfully, but these errors were encountered: