Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

blockchain: Harden and optimize utxo cache. #2995

Merged
merged 19 commits into from
Sep 28, 2022

Conversation

davecgh
Copy link
Member

@davecgh davecgh commented Sep 13, 2022

This requires #2994.

This reworks several portions of the utxo cache to improve its robustness, optimize it, and correct some hard to hit corner cases that involve a mix of manual block invalidation, conditional flushing, and successive unclean shutdowns along with reworking several of the tests and significantly increasing the associated scenarios covered by the tests.

Note that since this is consensus critical code, significant effort has been put into making these changes easier to reason about and review by carefully crafting a series of individual commits such that every commit message thoroughly describes its purpose, maintains consensus, and therefore passes all tests.

The following is a high-level overview of the changes:

  • Miscellaneous consistency cleanup
  • Pre-allocates the map used to track in-flight transactions in utxo views
  • Removes exported utxo cache SpendEntry method
  • Remove exported utxo cache AddEntry method
  • Removes unused utxo cache add entry error
  • Removes the unnecessary tip parameter from the utxo cache Initialize method
  • Modifies the utxo cache tests to allow overriding cache flushing
  • Improves utxo cache initialize tests to ensure initialize code itself flushes the expected data to the backend
  • Corrects an extremely hard to hit corner case involving a mix of manual block invalidation, conditional flushing, and successive unclean shutdowns
    • Adds tests for this condition to ensure proper behavior
  • Modifies the code that deals with fetching input utxos to avoid needlessly looking up inputs for treasurybase and treasury spend transactions
  • Updates the code when connecting transactions in a utxo view to avoid adding treasurybase utxos since they are never directly spendable
  • Simplifies the various utxo cache tests and improves their legibility
  • Reworks utxo cache spend entry tests to simplify them and make them more flexible
  • Reworks utxo cache commit tests to simplify them, make them more flexible, and test a lot more conditions
  • Improves the robustness of Commit in regards to unmodified spent and unspent fresh view entries to help defend against code changes that could easily inadvertently violate the non-obvious, and heretofore undocumented, assumptions
  • Reworks the utxo cache add entry tests to simplify them and make them more flexible
  • Separates utxo cache and view state by updating the addEntry and fetchEntry methods to ensure the modified and spent flags are cleared on cloned entries loaded from the cache and and always set correctly when adding entries to the cache
  • Modifies the utxo cache commit logic to ensure more robust handling of spent outputs including checking the backend for entries not currently in the cache and conditionally loading them if needed
    • Includes the addition of a double spend assertion
    • Adds tests to ensure the new behavior works as intended
  • Splits the logic for handling the connection of transactions from the regular tree and stake tree to a utxo view to provide a clean path for optimizing zero confirmation spends
  • Updates the utxo view and cache handling to detect zero confirmation spends in the regular transaction tree and bypass the cache when committing the view to avoid cache misses given they are guaranteed to never have any cache entries since they never exist beyond the scope of the view

@davecgh davecgh added non-forking consensus Changes that involve modifying consensus code without causing any forking changes. optimization test coverage Discussion and pull requests for improving test coverage. labels Sep 13, 2022
internal/blockchain/utxoviewpoint.go Outdated Show resolved Hide resolved
internal/blockchain/utxocache.go Outdated Show resolved Hide resolved
internal/blockchain/utxoviewpoint.go Outdated Show resolved Hide resolved
internal/blockchain/utxoviewpoint.go Outdated Show resolved Hide resolved
@dnldd
Copy link
Member

dnldd commented Sep 13, 2022

Looks good overall, will be running a mainnet sync with this soon.

internal/blockchain/utxocache.go Outdated Show resolved Hide resolved
internal/blockchain/utxoviewpoint.go Outdated Show resolved Hide resolved
internal/blockchain/utxoviewpoint.go Outdated Show resolved Hide resolved
internal/blockchain/utxoviewpoint.go Show resolved Hide resolved
@davecgh davecgh force-pushed the blockchain_utxocache_coherence branch from a9ccf4c to 816699b Compare September 13, 2022 23:56
internal/blockchain/utxoentry.go Outdated Show resolved Hide resolved
internal/blockchain/utxoviewpoint.go Outdated Show resolved Hide resolved
@davecgh davecgh force-pushed the blockchain_utxocache_coherence branch 2 times, most recently from 8624e18 to fc851cc Compare September 14, 2022 03:40
@dnldd
Copy link
Member

dnldd commented Sep 14, 2022

Synced mainnet with the changes made here without issue.

internal/blockchain/utxocache.go Outdated Show resolved Hide resolved
internal/blockchain/utxocache.go Show resolved Hide resolved
internal/blockchain/utxoviewpoint.go Outdated Show resolved Hide resolved
This makes some minor and non-functional misc changes to improve code
consistency.
This modifies the code that tracks the in-flight transactions when
determining the input utxos that need to be loaded to pre-allocate the
map to avoid some additional overhead associated with growing the map
multiple times when a lot of transactions are involved.
This removes the exported SpendEntry method from the UtxoCache type to
help ensure cache coherency by forcing all external updates to the cache
to happen via Commit which the cache fully controls.
This removes the exported AddEntry method from the UtxoCache type to
help ensure cache coherency by forcing all external updates to the cache
to happen via Commit which the cache fully controls.
This removes the error return from the UtxoCache.addEntry method since
it does not return any errors.
This removes the tip parameter from the utxo cache Initialize method
since it is unneccesary given it takes the BlockChain as a paramter
which has the tip available on it.
This modifies the utxo cache tests to ensure the overridden MaybeFlush
method is called for all utxo cache methods as opposed to only when
accessed via the UtxoCacher interface.

Perhaps most notably, the overridden method is now called when the cache
is being initialized which allows its logic to be better tested.

It accomplishes this by adding a new closure to the UtxoCache struct
that defaults to the exported UtxoCache.MaybeAccept, updates all
internal calls to make use of it, and then overrides it when creating
the test harness.
Currently, the utxo cache initialize tests don't actually test that the
initialize code itself flushes the expected data to the backend rather
they do a flush from the tests themselves using the best chain tip which
may or may not match whatever initialize actually ordinarily
conditionally flushed.

This modifies the utxo cache initialize tests force flushing internally
when the cache is being initialized accordingly to better ensure it is
flushing the expected data.
@davecgh davecgh force-pushed the blockchain_utxocache_coherence branch from d7c1018 to a619295 Compare September 28, 2022 08:25
This corrects an extremely hard to hit corner case involving a mix of
manual block invalidation, conditional flushing, and successive unclean
shutdowns and adds a test to ensure proper behavior.

Specifically, the recovery code from an unclean shutdown that unwinds
the cache potentially periodically flushes the state to the backend was
incorrectly using the block being disconnected as opposed to its parent
which is now the best tip.

In practice, this is exceedingly difficult to hit without intentionally
trying because it requires a mix of highly unlikely events and manual
invalidation:

- Two unclean shutdowns at precise inopportune times
- Manual invalidation of blocks by the operator such that blocks are
  only disconnected without reconnecting a side chain
- The recovery taking long enough or involving enough entries to trigger
  a conditional flush, again, at exactly the right time prior to a
  second unclean shutdown
This modifies the code that deal with fetching input utxos to avoid
needlessly looking up inputs for treasurybase and treasury spend
transactions since they are both similar to coinbase transactions in
that they do not reference any real previous outputs and therefore are
guaranteed to result in a cache miss.
This updates the code when connecting transactions in a utxo view to
check for treasurybases prior to the coinbase since a treasurybase is
also identified as a coinbase, but unlike coinbases, treasurybase
outputs are never spendable and thus should not be added to the utxo
view.
This simplifies the various utxo cache tests and improves their
legibility by consolidating the utxo test entry state creation logic and
updating the tests to make use of it accordingly.

In particular, it introduces a struct to house entries with the various
states along with a method to create them.
This reworks the utxo cache spend entry tests to simplify them and make
them more flexible by moving the primary logic for all of the expected
results into the test definitions themselves as oppossed to special
casing depending on flags in the test body and also adds an additional
test case.

It also updates the test cache creation method to allow nil entries in
the initial cache.
This reworks the utxo cache commit tests to simplify them a bit, make
them more flexible, and test a lot more conditions.  It accomplishes
this by introducing an expected error so the tests themselves can
specify the expected result as opposed to assuming it should be nil.  It
also clones all of the view entries to ensure none of the shared entries
are modified when the cache takes ownership of them during the tests.

Finally, it improves the robustness of Commit by correcting the logic
regarding unmodified spent and unspent fresh view entries discovered
while adding the tests since it was technically incorrect even though
the relevant cases are never hit in practice in the current
implementation.  This helps defend against code changes that could
easily inadvertently violate the non-obvious, and heretofore
undocumented, assumptions that avoided hitting the incorrect behavior.
This reworks the utxo cache add entry tests to simplify them and make
them more flexible by moving the primary logic for all of the expected
results into the test definitions themselves as oppossed to special
casing depending on flags in the test body.
The code prior to this change clones entries from the cache directly
into a view which means any cache entries that are marked modified
and/or fresh end up in the view with the flags set.  This leads to
slightly less efficient code and several non-obvious assumptions that
are easy to violate when making updates to views or the cache.

For example, modified cache entries end up in the view marked as
modified which means that even if they aren't actually modified, once
the view is committed, those entries end up needlessly updating the
cache with the new entry that is actually the exact same as the existing
one.  Another example is that cache entries that are marked fresh have
to retain that flag in the view in order to ensure they remain fresh in
the cache once that entry is replaced per the previous point.

It is perhaps worth noting that, in practice, there is not typically a
lot extra work due to the modified entries since almost all utxos loaded
into a view from the cache are ultimately modified in some way which
necessarily marks them modified anyway.  However, it is still
technically incorrect for them to be marked modified before they are
actually modified and does lead to some extra work that is not
necessary.

To address the aforementioned, this modifies the utxo cache to ensure
the cache state and view states are separate by updating the addEntry
and fetchEntry methods to ensure the modified and spent flags are
cleared on cloned entries loaded from the cache and and always set
correctly when adding entries to the cache.

The result is slightly more efficient code and removal of potential
footguns due to non-obvious assumptions about the internals of the view
and cache logic.
This modifies the utxo cache commit logic to ensure more robust handling
of spent outputs including the addition of a double spend assertion and
adds tests to ensure the new behavior works as intended.

In particular, it updates the spendEntry method to attempt to load any
entries that are not already in the cache from backend which is
necessary since lack of an entry does not always guarantee the entry is
not in the backend in the case of a hard to hit corner case involving a
mix of reorganizations and unclean shutdowns.  It also adds a new double
spend assertion while here since it's good practice to be paranoid and
double check assumptions when it comes to potential double spends.

While it is certainly possible to tackle the specific aforementioned
case in other ways, this approach was chosen because it is an overall
robust solution that ensures proper behavior regardless of other
independent assumptions that might be introduced in future code updates.
This splits the logic for handling the connection of transactions from
the regular tree and stake tree to a utxo view by introducing individual
methods for them and updating the callers accordingly.

The primary motivation for this change is that the regular and stake
transaction trees do not have identical logic in terms of the ability to
spend outputs earlier in the block and it is more efficient to avoid
continuously checking for special conditions across all transactions
when only certain types can possibly be in each respective tree.

Splitting this logic also paves the way for future commits to take
advantage of the aforementioned spending semantics to optimize the utxo
cache.
This updates the utxo view and cache handling to detect zero
confirmation spends in the regular transaction tree (outputs that are
spent by the inputs of transactions later in the same tree) and bypass
the cache when committing the view.

This is desirable because such outputs never exist beyond the scope of
the view and therefore are guaranteed to never have any cache entries.
In other words, any attempts to spend them in cache will necessarily
result in a cache miss.

In order to implement the logic, a new state bit flag to track zero
confirmation spends is introduced along with code to appropriately set
and clear the flag when connecting and disconnection transactions.  The
bit flag is then used to bypass the cache spend when set.

Finally, cache commit tests are included to ensure proper functionality.
@davecgh davecgh force-pushed the blockchain_utxocache_coherence branch from a619295 to 8cfdcad Compare September 28, 2022 09:25
@davecgh davecgh merged commit 8cfdcad into decred:master Sep 28, 2022
@davecgh davecgh deleted the blockchain_utxocache_coherence branch September 28, 2022 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
non-forking consensus Changes that involve modifying consensus code without causing any forking changes. optimization test coverage Discussion and pull requests for improving test coverage.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants