
peer: send reestablish, shutdown messages before starting writeHandler #8186

Merged — 1 commit merged into lightningnetwork:master on Nov 16, 2023

Conversation

@Crypt-iQ Crypt-iQ commented Nov 16, 2023

This is to avoid a race on WriteMessage and Flush internals. Because there is no locking on WriteMessage or Flush, if we allow writeMessage calls in Start after the writeHandler has started, those writeMessage calls may invoke WriteMessage/Flush at the same time that writeMessage calls from the writeHandler do. Since there is no locking, internals like b.nextHeaderSend can race and cause panics.

Fixes #8184

peer/brontide.go — outdated review thread, resolved
@morehouse morehouse left a comment


It would be great to have a test for this, to ensure we don't cause a similar breakage in the future.

If that's too difficult, second-best is probably some comments on writeMessage explaining that it should only be used by writeHandler.

peer/brontide.go — outdated review thread, resolved

@yyforyongyu yyforyongyu left a comment


LGTM🙏 and +1 for detailed documentation.

This is to avoid a potential race on WriteMessage and Flush internals.
Because there is no locking on WriteMessage and Flush, if we allow
writeMessage calls in Start after the writeHandler has started,
those writeMessage calls may invoke WriteMessage/Flush at the same
time that writeMessage calls from the writeHandler do. Since there is
no locking, internals like b.nextHeaderSend can race and cause
panics.
@morehouse morehouse left a comment


LGTM

@@ -703,6 +703,23 @@ func (p *Brontide) Start() error {

p.startTime = time.Now()

// Before launching the writeHandler goroutine, we send any channel
Member

If we abstract out loadActiveChannels, or split it into: load chans and load messages, then we can write a unit test here to try to trigger the panic. IIUC, it relies on concurrent access to the brontide state machine, so the two goroutines need to line up directly. This may be easier to trigger with the -race flag on.

Collaborator

I have a unit test that detects the race. Will send a PR once I clean it up.

Collaborator

@Roasbeef
Member

Will make the release branch, merge this in, and add release notes. The goal here is a fast-follow minor release with just this fix; if we can tighten up unit tests here, then that's also ideal. Hopefully we're finally putting this saga to bed and can devote resources to refactoring/rewriting this portion to get away from the mutex issues that started the saga in the first place.

@Roasbeef Roasbeef merged commit 64753b0 into lightningnetwork:master Nov 16, 2023
21 of 25 checks passed
Successfully merging this pull request closes:

[bug]: v0.17.1-beta: panic: runtime error: slice bounds out of range [18:0]