Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

routing+channeldb: mpp bucket structure #4000

Merged
merged 16 commits into from
Mar 10, 2020

Conversation

joostjager
Copy link
Contributor

@joostjager joostjager commented Feb 13, 2020

Update database structure to accommodate for multiple htlcs per payment. This is a preparation for mpp multi-shard sending (#3970 and #3967).

Additionally this pr adds a raw failure field to the payment htlc bucket. This is required to make SendToRoute reliable across restarts. That work will be done in a separate pr, but the db change itself is in this pr to prevent an additional migration later on. Also having the per-htlc failure reasons available helps with debugging issues in general.

Both changes are visible in the output of ListPayments. That will now show all the attempted htlcs including the failed ones and their failure reasons.

@joostjager joostjager force-pushed the payment-raw-failure branch 3 times, most recently from 2376822 to e375741 Compare February 19, 2020 16:40
@joostjager joostjager force-pushed the payment-raw-failure branch 3 times, most recently from 7a195eb to df57625 Compare February 20, 2020 09:14
@joostjager joostjager changed the title routing+channeldb: store htlc failure routing+channeldb: mpp bucket structure Feb 20, 2020
@joostjager joostjager force-pushed the payment-raw-failure branch 16 times, most recently from 21b708a to ff91bc0 Compare February 21, 2020 10:03
@joostjager joostjager marked this pull request as ready for review February 21, 2020 12:48
@joostjager joostjager removed the request for review from Roasbeef February 21, 2020 12:49
@joostjager
Copy link
Contributor Author

PR is ready for first pass. Migration commit needs to be squashed with db changes commit

@joostjager joostjager force-pushed the payment-raw-failure branch 2 times, most recently from c98c58d to a95bf0d Compare March 9, 2020 10:36
@joostjager
Copy link
Contributor Author

Removed the "0x"+ part from the migration. Replaced by migtest.Hex helper function

joostjager and others added 11 commits March 9, 2020 11:43
MPPaymentCreationInfo and PaymentCreationInfo are identical except for
the naming of CreationTime/CreationDate.
To better distinguish payments from HTLCs, we rename the attempt info
struct to HTLCAttemptInfo. We also embed it into the HTLCAttempt struct,
to avoid having to duplicate this information.

The paymentID term is renamed to attemptID.
Previously this was tested as a white box. Database access methods were
duplicated as test code and compared to the return value of the code
under test. This approaches leads to brittle test because it relies
heavily on implementation details. This commit changes this and prepares
for additional test coverage being added in later commits.
Duplicate payments is legacy that we keep alive for accounting purposes.
This commit isolates the deserialization logic for duplicate payments in
its own file, so that regular payment logic and db structure can evolve
without needing to handle/migrate the legacy data.
In a later commit, htlc raw failure messages will be exposed through the
main rpc. This is a preparation for that.
These functions will (indirectly) be called by the main rpc server and
can no longer stay conditionally compiled.
Preparation for marshalling wire errors as part of rpc payment lookups.
Preparation for when we need to return errors in a next commit.
Removes this unnecessary dependency allowing migration code to use
utility functions from channeldb/migtest.
@joostjager joostjager force-pushed the payment-raw-failure branch 2 times, most recently from 4428cac to 2dfb7dc Compare March 9, 2020 10:47
Copy link
Contributor

@halseth halseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No major comments, PR looks solid 👍

channeldb/payment_control.go Outdated Show resolved Hide resolved

// Final sanity check to see if there are no in-flight htlcs.
for _, htlc := range payment.HTLCs {
if htlc.Settle == nil && htlc.Failure == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If htlc.Settle != nil we will never allow failing the payment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that correct? We have the preimage, so how can the payment be failed?

if err != nil {
return err
}
attemptInfo.AttemptID = aid
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cfromknecht Thoughts?

// Write failure. If there is no failure message, write an empty
// byte slice.
var messageBytes bytes.Buffer
if f.Message != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the message have to come before the reason in the serialized code?

channeldb/migration13/migration.go Outdated Show resolved Hide resolved
timeBytes := newCreationInfo[32+8 : 32+8+8]
timeBytes[0] = 0
time := byteOrder.Uint64(timeBytes)
timeNs := time * 1000000000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably easier to create a time struct from the bytes, then write out as UnixNano.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that easier? Conversion from secs to nanosecs isn't more than this multiplication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you commented on this because of the stray timeBytes[0]=0 statement?

joostjager and others added 5 commits March 9, 2020 18:31
This commit converts the database structure of a payment so that it can
not just store the last htlc attempt, but all attempts that have been
made. This is a preparation for mpp sending.

In addition to that, we now also persist the fail time of an htlc. In a
later commit, the full failure reason will be added as well.

A key change is made to the control tower interface. Previously the
control tower wasn't aware of individual htlc outcomes. The payment
remained in-flight with the latest attempt recorded, but an outcome was
only set when the payment finished. With this commit, the outcome of
every htlc is expected by the control tower and recorded in the
database.

Co-authored-by: Johan T. Halseth <johanth@gmail.com>
This commit extends the htlc fail info with the full failure reason that
was received over the wire. In a later commit, this info will also be
exposed on the rpc interface. Furthermore it serves as a building block
to make SendToRoute reliable across restarts.
This commit adds test helper code to dump, restore and verify the
low-level bbolt database structure.
This commit migrates the payments in the database to a new structure
that allows for multiple htlcs per payments. The migration introduces a
new sub-bucket that contains a list of htlcs and moves the old single
htlc into that.
Adds a new rpc field to the payment htlc proto message and populates it
with data that is now stored in the db.
Copy link
Contributor

@cfromknecht cfromknecht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM this looks ready to land ⚡️

@joostjager joostjager requested a review from halseth March 9, 2020 20:39
Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🔰

Ran the migration on a few nodes (including those known to have duplicate payments), and the resulting diff in the ListPayments output is as expected (addition of the new fields).

Left some additional comments, but these are non-blocking and can be resolved in follow up PRs.

@@ -3098,3 +3098,163 @@ message BakeMacaroonResponse {
/// The hex encoded macaroon, serialized in binary format.
string macaroon = 1 [json_name = "macaroon"];
}


message Failure {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking, but I'd prefer to just copy over the protos compared to breaking users of the existing routerrpc package (some wallets now use this exclusively).

copy(newCreationInfo, creationInfo)

// Convert to nano seconds.
timeBytes := newCreationInfo[32+8 : 32+8+8]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking but this section could be made more readable by getting ride of the magic numbers and using types from the time package.

return nil, errors.New("unknown htlc failure reason")
}

return rpcFailure, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There're a few fields that we appear to not properly be setting (might as well give all the information we have):

                    "failure": {
                        "code": "TEMPORARY_CHANNEL_FAILURE",
                        "channel_update": {
                            "signature": "2bc91eb05e6fe9e286e0e09372efdaafec4a62d224bdb7db7d23ea431722ac801ca39a9dd6827fe6d8913e806a1e6ed43ec7cae13a4e3bb8ffb40efbec6fdc11",
                            "chain_hash": "43497fd7f826957108f4a30fd9cec3aeba79972084e90ead01ea330900000000",
                            "chan_id": "1832711061194407937",
                            "timestamp": 1583766633,
                            "message_flags": 1,
                            "channel_flags": 1,
                            "time_lock_delta": 15,
                            "htlc_minimum_msat": "1000",
                            "base_fee": 0,
                            "fee_rate": 0,
                            "htlc_maximum_msat": "16609154000",
                            "extra_opaque_data": null
                        },
                        "htlc_msat": "0",
                        "onion_sha_256": null,
                        "cltv_expiry": 0,
                        "flags": 0,
                        "failure_source_index": 1,
                        "height": 0

@Roasbeef Roasbeef merged commit f29169e into lightningnetwork:master Mar 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants