Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add infrastructure for writing unit tests on state machine components #1664

Closed
hdevalence opened this issue Nov 25, 2022 · 3 comments
Closed

Comments

@hdevalence
Copy link
Member

hdevalence commented Nov 25, 2022

Is your feature request related to a problem? Please describe.

We want to be able to write unit tests that check parts of the state machine in isolation. For instance, we'd like to be able to check individual consensus rules (does this consensus rule reject this input?), or to be able to check pre/post conditions for specific parts of the state machine (do we maintain the desired invariants after processing this input? do we write the data we expect?).

Currently, we don't have any meaningful scaffolding for writing these kinds of tests, which means that we rely solely on end-to-end integration tests. But while end-to-end integration tests are useful to have, they're more complicated and expensive to write and run, and shouldn't be the only option for writing tests. We should aim that the majority of the test suite is runnable by cargo test, so that most assurance can be accessed through existing Rust tooling. Not only is this a simpler and easier development workflow -- it only requires cargo, and not other tooling -- it can also run many tests in parallel, and faster.

As a prototypical example, consider #1631, which added a test to check the consensus rule that validator definitions with duplicate consensus keys are rejected. This is a good test to have, but we don't get any assurance benefit out of having it be a full, end-to-end network integration test that:

  • runs pcli in a subprocess
  • makes network requests to a running pd instance
  • parses the JSON output of pcli
  • edits a new validator definition
  • assembles it into a full, valid transaction
  • sends that transaction over the network to a running tendermint instance
  • has that transaction sent over ABCI to pd
  • has the mempool check every other consensus rule on the transaction
  • has the mempool check the desired consensus rule (if it didn't already reject it)
  • checks that the pcli subprocess that sent the transaction errored out

instead of just checking the specific consensus rule we care about directly. Not only is this process more expensive, if what we care about is checking that duplicate consensus keys are rejected, it's actually worse for assurance than a unit test, because we don't know that the reason the transaction was rejected was the one we're trying to test, rather than some other reason.

This isn't meant as a criticism of #1631, to be clear, and doing it as an integration test was my suggestion in #1210; the point is that at the moment, we don't have any other alternative structure to use to write this kind of test, but now that we've completed #1454, we can add one. This also isn't meant as a claim that we shouldn't have integration tests, either. Doing the full, end-to-end sequence of steps is important to ensure that none of the plumbing connecting the different parts of the system is broken. But the goal of the network integration tests should be to test the plumbing, not all of the nooks and crannies of the state machine.

So, how should we test the state machine?

First, let's consider how it's actually structured. The Penumbra application itself is driven by Tendermint via ABCI, and (should) have its state fully and completely determined only by the ABCI messages it receives. This has important implications for testing, because we can deterministically drive the state of the application in Rust test code just by sending a sequence of ABCI messages, without having to run Tendermint at all.

ABCI messages are sent to the penumbra_component::App, which handles InitChain, BeginBlock, and EndBlock messages by calling the methods on each component, and handles DeliverTx messages by calling the ActionHandler implementation for Transaction, which performs transaction-wide checks before dispatching to the ActionHandler implementation for each action in the transaction.

At this point, it's useful to distinguish between two types of state machine tests:

  • state machine integration tests, testing top-level behavior of all component logic working together;
  • state machine unit tests, testing individual parts of the state machine.

By passing a sequence of ABCI messages to the application, we can perform state machine integration tests (and we can potentially do so much more efficiently inside Rust test code than by having Tendermint generate and send them), but this doesn't help with the goal of being able to write state machine unit tests, as in the prototypical example in #1631. The reason is that the top-down processing will check all consensus rules together, not specific ones in isolation.

Because our state machine is written as a series of traits that compose additional behavior onto StateRead/StateWrite to ensure that all execution can be done transactionally, while we can exercise different parts of the state machine in isolation, we'll always be doing so on a State instance, as that's the only sense in which any part of the state machine can be instantiated.

One example of a state machine unit test in the existing codebase is in the IBC code's test_create_and_update_light_client. This test exercises the ICS2 client handshake implementation, by:

  • creates a tempdir to use as a backing store;
  • initializes a new Storage instance using it;
  • manually writes two pieces of data into the Storage that would be written as part of a full InitChain execution;
  • calls Ics2Client::init_chain to do the ICS2-specific state initialization;
  • constructs an invalid dummy transaction with IBC actions with handshake messages pulled from the cosmos hub;
  • pulls the dummy transaction apart to check and execute just those actions (to avoid other transaction checks);
  • checks that the execution succeeds.

Ideally, we'd be able to simplify this process by factoring out the setup code into common test code, with a helper method that:

  • creates a tempdir to use as a backing store;
  • initializes a new Storage instance using it;
  • performs full InitChain execution using default (or provided) genesis state;
  • hands the correctly-initialized test Storage to the caller.

We want to perform full initialization on the test storage, because our implementation assumes that the state was always correctly initialized, and we don't want our test code to violate that assumption.

Once we have a correctly-initialized test storage, we can use it for either state machine unit tests or state machine integration tests.

Describe the solution you'd like

To start, we should try to factor out the setup code from the ICS2 client unit test, and use that example as a way to check the ergonomics of the API.

@hdevalence
Copy link
Member Author

A further thought: why do we need to create a full Storage instance in a tempdir? Since all our code is written in terms of StateRead+StateWrite, we could have a TestState + TestStateTransaction backed by a BTreeMap for purely in-memory testing.

@hdevalence
Copy link
Member Author

xref #853 , which we created a while ago and then forgot about.

hdevalence added a commit that referenced this issue Nov 29, 2022
Work towards #1664

Closes #853

This commit cleans up the IBC client test as an example case of the new API.

Along the way, we also clean up some other ergonomic pain points, notably
adding an `Arc<State>::try_begin_transaction()` method.

Co-Authored-By: Conor Schaefer <conor@penumbra.zone>
hdevalence added a commit that referenced this issue Nov 29, 2022
Work towards #1664

Closes #853

This commit cleans up the IBC client test as an example case of the new API.

Along the way, we also clean up some other ergonomic pain points, notably
adding an `Arc<State>::try_begin_transaction()` method.

Co-Authored-By: "Conor Schaefer" <conor@penumbra.zone>
hdevalence added a commit that referenced this issue Nov 29, 2022
Work towards #1664

Closes #853

This commit cleans up the IBC client test as an example case of the new API.

Along the way, we also clean up some other ergonomic pain points, notably
adding an `Arc<State>::try_begin_transaction()` method.

Co-Authored-By: "Conor Schaefer" <conor@penumbra.zone>
@redshiftzero
Copy link
Member

This was done and two new tests were added in #1675, closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: Testnet 38: Kalyke
Development

No branches or pull requests

2 participants