Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvdb wrapper for etcd #4015

Merged
merged 15 commits into from
May 22, 2020
Merged

kvdb wrapper for etcd #4015

merged 15 commits into from
May 22, 2020

Conversation

bhandras
Copy link
Collaborator

@bhandras bhandras commented Feb 18, 2020

This PR builds on top of channeldb/kvdb: introduce new KV-store database abstraction and adds a kvdb wrapper for etcd.

The PR is still somewhat WIP, but mostly ready for initial reviews.

Missing:

  • Batch
  • Inline comments from most places
  • Test coverage is ~90%
  • Configurable integration to LND/tests.
  • Handle errors in manual transactions
  • Migration tool to mirror bbolt dump to etcd.
  • A lot of additional testing.

@bhandras bhandras added boltdb incomplete WIP PR, not fully finalized, but light review possible database Related to the database/storage of LND and removed boltdb labels Feb 18, 2020
@Roasbeef Roasbeef added this to the 0.10.0 milestone Feb 26, 2020
Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work! Once this lands, lnd will take a large step towards a greater degree of robustness and reliability for more critical operations.

I've completed an initial pass through this draft PR, doing only a shallow pass over the unit tests as some of the behavior/API may change throughput the early review process. With that said though, upon an initial pass this PR appears to have rather solid unit test coverage! In the future we may want to extend the current default walletdb interface-level tests to encompass the scenarios covered in this PR.

After my initial pass, I have (so far) one core question:

  • How will errors be surfaced to the caller? By errors here, I mean terminal errors, so us running into errors that mean we _cannot) proceed at all. An example of such an error would be something like etcd crashing mid transaction.
    • Along the way to settling on a definite design/policy for this aspect of the interface, we may want to re-examine the way we handle database errors globally within lnd.
    • As is right now, in a few areas, we'll actually proceed with normal operations of the database if a persistence operation fails. This means that we may risk entering a dangerous or invalid state if say the disk fills up as we go to commit a new invoice.
    • Ideally, we should halt all operations if we encounters a persistence failure. By halt here I mean enter into a clean shutdown procedure, and fail to start if the persistence error persists (pun intended ;)). We can possibly implement this via a panic at the db-call site, with a recover at the top-level goroutine.

One aspect of this PR I had trouble grokking was the logic implementing the various range scan operations. It's likely the case that I need to revisit the etcd client documentation to properly understand the defined API and various options.

channeldb/kvdb/etcd/db.go Outdated Show resolved Hide resolved
channeldb/kvdb/etcd/fixture_test.go Show resolved Hide resolved
channeldb/kvdb/etcd/fixture_test.go Outdated Show resolved Hide resolved
channeldb/kvdb/etcd/stm.go Outdated Show resolved Hide resolved
channeldb/kvdb/etcd/stm.go Outdated Show resolved Hide resolved
channeldb/kvdb/etcd/bucket.go Outdated Show resolved Hide resolved
channeldb/kvdb/etcd/db.go Outdated Show resolved Hide resolved

kv := b.tx.stm.First(prefix)
for kv != nil {
if err := cb([]byte(kv.key[prefixLen:]), []byte(kv.val)); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bounds check key before accessing prefixLen, both here and below?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure we need prefix check here? stm.First will either return nil or a match in which case it contains prefix.

channeldb/kvdb/etcd/readwrite_bucket.go Outdated Show resolved Hide resolved
channeldb/kvdb/etcd/readwrite_cursor.go Show resolved Hide resolved
Copy link
Contributor

@cfromknecht cfromknecht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bhandras completed an initial pass!

my main concerns right now are about the path/key format. iiuc, i have the following observations:

  • storing binary data directly can collide with the control characters. this means there is some overhead in storage, e.g. 2x for hex, 4/3x for base64
  • the path prefix is serialized for every leaf value, which incurs a blow up esp. for highly nested data since we store path data linear in the number of leaves.

i don't have a great solution for these issues, perhaps this is what you were referring when saying it is awkward to implement the kvdb interface using etcd? it would be interesting to take a real-world lnd database and compute how much space it would occupy using this format and how much overhead we are grappling with.

// First returns the first k/v that begins with prefix or nil if there's
// no such k/v pair. If the key is found it is inserted to the txn's
// read set.
First(prefix string) *KV
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason the interface operates on strings and not byte slices?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to use byte slices but we either need to convert everything to string due to etcd's interface or etcd does this anyway. One other thing that makes this hard to implement nicely is that we use two maps for storing gets/puts before commit and we need string keys for them and also need to alloc/copy anyway (both for inserting and retrieval where user gets a value).
I believe for these reasons staying with good old immutable strings is a better choice (for now). Open to suggestions if you have a better approach.

channeldb/kvdb/etcd/stm.go Show resolved Hide resolved
channeldb/kvdb/etcd/db.go Outdated Show resolved Hide resolved
channeldb/kvdb/etcd/db.go Outdated Show resolved Hide resolved
channeldb/kvdb/etcd/db.go Outdated Show resolved Hide resolved
channeldb/kvdb/etcd/bucket.go Outdated Show resolved Hide resolved
channeldb/kvdb/etcd/readwrite_bucket.go Outdated Show resolved Hide resolved
channeldb/kvdb/etcd/readwrite_bucket.go Outdated Show resolved Hide resolved
return nil
}

return []byte(val)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly we may have to decode hex/base64 back into binary if we needed to convert it when storing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand what you meant here. If you referred to binary keys, that's done.

@Roasbeef Roasbeef modified the milestones: 0.10.0, 0.11.0 Mar 17, 2020
@Roasbeef
Copy link
Member

Can be rebased now that the kvdb PR has been merged.

@bhandras bhandras force-pushed the kv-etcd branch 6 times, most recently from 79d3cb6 to 03e8d72 Compare March 25, 2020 16:59
@bhandras bhandras force-pushed the kv-etcd branch 10 times, most recently from a0ea0c6 to a92b90a Compare March 31, 2020 20:41
Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is getting much closer! Completed a new pass with the latest design of components like buckets and the rest. I'll be switching over to an active testing phase now, where I may modify the last commit slightly to be closer to how we intend things to be run (graph stored locally as an example). This modified structure should work for now (for a new node), but we'll need to make sure the eventual migration tool is able to handle the switch over.

channeldb/kvdb/etcd/embed.go Show resolved Hide resolved
out <- err
}()

return <-out
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this will likely block a daemon shutdown until the transaction commits or it runs into a fatal error, which I exactly what we want I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: in testing i was unable to shut down the daemon without kill -9 due to the large read requests. we can follow up with passing cancel contexts into the queries so we can break out if needed

channeldb/kvdb/etcd/stm.go Outdated Show resolved Hide resolved
channeldb/kvdb/etcd/stm.go Outdated Show resolved Hide resolved
channeldb/kvdb/etcd/stm.go Outdated Show resolved Hide resolved
channeldb/kvdb/etcd/bucket.go Show resolved Hide resolved
channeldb/kvdb/etcd/db.go Show resolved Hide resolved
channeldb/channel_test.go Show resolved Hide resolved
channeldb/kvdb/backend.go Outdated Show resolved Hide resolved
lncfg/db.go Show resolved Hide resolved
Copy link
Contributor

@cfromknecht cfromknecht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great idea and we should do it, but imho this is perhaps a bit out of scope for this PR as to achieve this we need to refactor the init order to be able to derive the private key before the channeldb is constructed.

Ah yes, good point. Indeed sounds like something for a follow up.

In that case, I think we should proceed! LGTM 💰

Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🌊

Needs a rebase! Excited for what comes after this 😈

This commit adds an extended STM, similar to what available in etcd's
clientv3 module. This incarnation of said STM supports additional
features, like positioning in key intervals while taking into account
deletes and writes as well. This is a preliminary work to support all
features of the kvdb interface.
This commit adds a full interface implementation of the walletdb/kvdb
interface with detailed tests.
This commit adds walletdb interface test suite to the existing test set.
The commit itslef adds stats to the transactions such that we can see
how LND behaves with and etcd backend.
This commit extends lncfg to support user specified database backend.
This supports configuration for both bolt and etcd (while only allowing
one or the other).
This commit adds support for user configured channeldb backend.
This commit reduces the compare set size the STM will submit in
transactions by adding only the bucket keys along the bucket path to a
specific lock set. This lock set then used to filter the read set,
effectively removing all read only keys from the transaction predicate
that are not bucket keys.
By tracking if a read-write tx actually changes something, we can also
"bump" the mod revision of the bucket keys.
With this trick we essentially implement a read-write lock for our
bucket structure greatly reducing transaction processing time.
This commit adds the ExtendedBackend interface which is an extension to
the walletdb.DB interface. This paves the way to using etcd.db.View and
etcd.db.Update in the global View and Update functions without much code
rewrite.
This commit extends lncfg with etcd TLS config and passes these
parameters to the etcd client upon construction.
This commit separates all etcd related sources (sans a few stubs and
config) from the rest of the source tree and makes compilation conditional
depending on whether the kvdb_etcd build tag is specified.
This commit extends etcd db with namespaces without additional storage
space requirements. This is simply done by instead of using an all zero
root bucket id, we use the sha256 hash of the name space as our root
bucket id.
@bhandras bhandras merged commit 7f1a450 into lightningnetwork:master May 22, 2020
@githorray
Copy link
Contributor

Is the db time_to_open reported by LND under-reported after this merge?

[INF] LTND: Opening the main database, this might take a few minutes...
[INF] CHDB: Checking for schema update: latest_version=13, db_version=13
[INF] LTND: Database now open (time_to_open=19.680575ms)!

I'm used to a few seconds, but not milliseconds, especially after a reboot.

Before merge:

[INF] LTND: Database now open (time_to_open=14.952646033s)!
[INF] LTND: Database now open (time_to_open=19.449286406s)!
[INF] LTND: Database now open (time_to_open=14.656393946s)!
[INF] LTND: Database now open (time_to_open=15.319765911s)!
[INF] LTND: Database now open (time_to_open=14.79603768s)!
[INF] LTND: Database now open (time_to_open=14.659297753s)!

After merge:

[INF] LTND: Database now open (time_to_open=6.325834ms)!
[INF] LTND: Database now open (time_to_open=19.680575ms)!

Roasbeef added a commit to Roasbeef/lnd that referenced this pull request May 22, 2020
In this commit, we fix a regression in our DB open time logging that was
introduced in lightningnetwork#4015. Obtaining the target backend from the configuration
will actually also open the database, so we need to include that in the time
delta as well.
@Roasbeef
Copy link
Member

@githorray nice catch, should be fixed in this PR: #4309

matheusd pushed a commit to matheusd/dcrlnd that referenced this pull request Oct 12, 2020
In this commit, we fix a regression in our DB open time logging that was
introduced in lightningnetwork#4015. Obtaining the target backend from the configuration
will actually also open the database, so we need to include that in the time
delta as well.
@bhandras bhandras deleted the kv-etcd branch September 12, 2023 15:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
database Related to the database/storage of LND incomplete WIP PR, not fully finalized, but light review possible
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants