fix(dot/network): fix discovery between gossamer nodes #1594

Merged: 45 commits, May 20, 2021
Changes from 42 commits
5a38145
update README.md
noot May 13, 2021
117ebb8
update docs
noot May 13, 2021
c834e99
update gssmr genesis file to have 6 auths
noot May 13, 2021
2fca3d8
Merge branch 'development' into noot/update-readme
noot May 13, 2021
70b794e
Merge branch 'development' of github.com:ChainSafe/gossamer into noot…
noot May 13, 2021
b5ba601
add peercount to network logs
noot May 13, 2021
02faa96
update log order
noot May 13, 2021
09e8ca9
Merge branch 'development' into noot/update-readme
dutterbutter May 14, 2021
2943b15
Merge branch 'noot/devnet' of github.com:ChainSafe/gossamer into noot…
noot May 14, 2021
5d09224
update logs
noot May 14, 2021
d6e13f8
add log
noot May 14, 2021
332ff24
change DHT mode to ModeServer, add peersToTry map
noot May 14, 2021
c2175f5
lint
noot May 14, 2021
4f35500
fix loop logic in beginDiscovery
noot May 14, 2021
e0dce6a
add advertising logic
noot May 14, 2021
6246ce1
lint
noot May 14, 2021
65a021b
increase intial advertise ttl
noot May 14, 2021
e4417e4
bootstrap dht before advertise
noot May 14, 2021
f444e14
fix
noot May 14, 2021
d78eaa8
add discovery submodule; wait for peers before starting DHT
noot May 14, 2021
f67be8a
add logs
noot May 14, 2021
2578344
add package to get public IP
noot May 14, 2021
bc808a5
add routing table refresh
noot May 14, 2021
2233927
remove routing table refresh
noot May 14, 2021
1b87f0d
fix some tests
noot May 14, 2021
0525703
remove advertisement
noot May 17, 2021
63709f6
Merge branch 'development' of github.com:ChainSafe/gossamer into noot…
noot May 18, 2021
1f91b09
readd advertisement
noot May 18, 2021
acb5df1
cleanup
noot May 18, 2021
0960e47
revert genesis files
noot May 18, 2021
d984743
re-enable test
noot May 18, 2021
d3e47c0
update log levels
noot May 18, 2021
46bdf5c
decrease time before advertising
noot May 18, 2021
453660b
restore bootstrap order
noot May 18, 2021
47899b2
decrease failed to advertise ttl
noot May 18, 2021
62d9f28
cleanup, add constants
noot May 18, 2021
92bac0c
Merge branch 'development' of github.com:ChainSafe/gossamer into noot…
noot May 18, 2021
9dc4efb
log cleanup
noot May 18, 2021
d86369b
only append externalAddr to listen addrs if not nil
noot May 19, 2021
7824614
Merge branch 'development' of github.com:ChainSafe/gossamer into noot…
noot May 19, 2021
8da41c1
remove comment
noot May 19, 2021
8276fff
Merge branch 'development' into noot/fix-discovery
noot May 20, 2021
06d1965
Merge branch 'development' of github.com:ChainSafe/gossamer into noot…
noot May 20, 2021
ab89d69
address comments
noot May 20, 2021
d1cd6de
Merge branch 'noot/fix-discovery' of github.com:ChainSafe/gossamer in…
noot May 20, 2021
203 changes: 203 additions & 0 deletions dot/network/discovery.go
@@ -0,0 +1,203 @@
// Copyright 2019 ChainSafe Systems (ON) Corp.
// This file is part of gossamer.
//
// The gossamer library is free software: you can redistribute it and/or modify
// it under the terms of the GNU Lesser General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// The gossamer library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public License
// along with the gossamer library. If not, see <http://www.gnu.org/licenses/>.

package network

import (
	"context"
	"fmt"
	"time"

	badger "github.com/ipfs/go-ds-badger2"
	libp2phost "github.com/libp2p/go-libp2p-core/host"
	"github.com/libp2p/go-libp2p-core/peer"
	"github.com/libp2p/go-libp2p-core/peerstore"
	"github.com/libp2p/go-libp2p-core/protocol"
	libp2pdiscovery "github.com/libp2p/go-libp2p-discovery"
	kaddht "github.com/libp2p/go-libp2p-kad-dht"
	"github.com/libp2p/go-libp2p-kad-dht/dual"
)

var (
	startDHTTimeout             = time.Second * 10
	initialAdvertisementTimeout = time.Millisecond
	tryAdvertiseTimeout         = time.Second * 30
	connectToPeersTimeout       = time.Minute
)

// discovery handles discovery of new peers via the kademlia DHT
type discovery struct {
	ctx                context.Context
	dht                *dual.DHT
	h                  libp2phost.Host
	bootnodes          []peer.AddrInfo
	ds                 *badger.Datastore
	pid                protocol.ID
	minPeers, maxPeers int
}

func newDiscovery(ctx context.Context, h libp2phost.Host, bootnodes []peer.AddrInfo, ds *badger.Datastore, pid protocol.ID, min, max int) *discovery {
	return &discovery{
		ctx:       ctx,
		h:         h,
		bootnodes: bootnodes,
		ds:        ds,
		pid:       pid,
		minPeers:  min,
		maxPeers:  max,
	}
}

// start creates the DHT.
func (d *discovery) start() error {
	if len(d.bootnodes) == 0 {
		// get all currently connected peers and use them to bootstrap the DHT
		peers := d.h.Network().Peers()

		for {
			if len(peers) > 0 {
				break
			}

			select {
			case <-time.After(startDHTTimeout):
				logger.Debug("no peers yet, waiting to start DHT...")
				// wait for peers to connect before starting DHT, otherwise DHT bootstrap nodes
				// will be empty and we will fail to fill the routing table
			case <-d.ctx.Done():
				return nil
			}

			peers = d.h.Network().Peers()
		}

		for _, p := range peers {
			d.bootnodes = append(d.bootnodes, d.h.Peerstore().PeerInfo(p))
		}
	}

	logger.Debug("starting DHT...", "bootnodes", d.bootnodes)

	dhtOpts := []dual.Option{
		dual.DHTOption(kaddht.Datastore(d.ds)),
		dual.DHTOption(kaddht.BootstrapPeers(d.bootnodes...)),
		dual.DHTOption(kaddht.V1ProtocolOverride(d.pid + "/kad")),
		dual.DHTOption(kaddht.Mode(kaddht.ModeAutoServer)),
	}

	// create DHT service
	dht, err := dual.New(d.ctx, d.h, dhtOpts...)
	if err != nil {
		return err
	}

	d.dht = dht
	return d.discoverAndAdvertise()
}

func (d *discovery) stop() error {
	if d.dht == nil {
		return nil
	}

	return d.dht.Close()
}

func (d *discovery) discoverAndAdvertise() error {
	rd := libp2pdiscovery.NewRoutingDiscovery(d.dht)

	err := d.dht.Bootstrap(d.ctx)
	if err != nil {
		return fmt.Errorf("failed to bootstrap DHT: %w", err)
	}

	// wait to connect to bootstrap peers
	time.Sleep(time.Second)
	peersToTry := make(map[*peer.AddrInfo]struct{})
Review comment (Member):

Couldn't peersToTry be a slice of *peer.AddrInfo?

Review comment (Contributor Author):

It's a map so that there are no duplicates.

Review comment (Contributor):

Move this inside the goroutine.
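
Note on the duplicate question in this thread: Go compares pointer map keys by address, not by the value they point to. As far as can be told from the diff, each `case peer := <-peerCh` declares a fresh variable, so `&peer` would be a new key every time, even for a peer that was already seen. The sketch below illustrates the difference with a hypothetical addrInfo type standing in for peer.AddrInfo. It is not code from this PR, and keying by ID is only one possible alternative.

```go
package main

// addrInfo is a hypothetical stand-in for libp2p's peer.AddrInfo,
// reduced to the fields needed for this illustration.
type addrInfo struct {
	ID    string
	Addrs []string
}

// countByPointer mimics map[*peer.AddrInfo]struct{}: each found peer is
// copied into its own variable, so every key is a distinct address and
// repeated IDs are not collapsed.
func countByPointer(found []addrInfo) int {
	seen := make(map[*addrInfo]struct{})
	for i := range found {
		p := found[i] // fresh variable per iteration, like `case peer := <-peerCh`
		seen[&p] = struct{}{}
	}
	return len(seen)
}

// countByID keys the map on the peer ID, so a peer that is reported
// twice occupies a single entry.
func countByID(found []addrInfo) int {
	seen := make(map[string]addrInfo)
	for _, p := range found {
		seen[p.ID] = p
	}
	return len(seen)
}
```

For a sequence in which one ID appears twice, countByPointer counts both occurrences while countByID counts the ID once.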


	go func() {
		ttl := initialAdvertisementTimeout

		for {
			select {
			case <-time.After(ttl):
				logger.Debug("advertising ourselves in the DHT...")
				err := d.dht.Bootstrap(d.ctx)
				if err != nil {
					logger.Warn("failed to bootstrap DHT", "error", err)
					continue
				}

				ttl, err = rd.Advertise(d.ctx, string(d.pid))
				if err != nil {
					logger.Debug("failed to advertise in the DHT", "error", err)
					ttl = tryAdvertiseTimeout
				}
			case <-d.ctx.Done():
				return
			}
		}
	}()

	go func() {
		logger.Debug("attempting to find DHT peers...")
		peerCh, err := rd.FindPeers(d.ctx, string(d.pid))
		if err != nil {
			logger.Warn("failed to begin finding peers via DHT", "err", err)
			return
		}

		for {
			select {
			case <-d.ctx.Done():
				return
			case <-time.After(connectToPeersTimeout):
				if len(d.h.Network().Peers()) > d.minPeers {
					continue
				}

				// reconnect to peers if peer count is low
				for p := range peersToTry {
					err = d.h.Connect(d.ctx, *p)
					if err != nil {
						logger.Trace("failed to connect to discovered peer", "peer", p.ID, "err", err)
						delete(peersToTry, p)
					}
				}
			case peer := <-peerCh:
				if peer.ID == d.h.ID() || peer.ID == "" {
					continue
				}

				logger.Trace("found new peer via DHT", "peer", peer.ID)

				// found a peer, try to connect if we need more peers
				if len(d.h.Network().Peers()) < d.maxPeers {
					err = d.h.Connect(d.ctx, peer)
					if err != nil {
						logger.Trace("failed to connect to discovered peer", "peer", peer.ID, "err", err)
					}
				} else {
					d.h.Peerstore().AddAddrs(peer.ID, peer.Addrs, peerstore.PermanentAddrTTL)
					peersToTry[&peer] = struct{}{}
Review comment (Member):

Maybe we need to check whether the peer that arrived is already in the list?

Review comment (Member):

Would it be a good idea to limit how many peers can be added to the peersToTry list? That would avoid a long list. When the list is already full, we would test the connection to the first item (the oldest peer); if that connection fails, we remove it and add the new peer to the end of the list (as the newest peer).

Review comment (Contributor Author):

For the first comment, the map ensures there are no duplicates. And yes, I'm not 100% sure yet whether to use the peersToTry list or just the peerstore; either way, I'm going to clean it up before opening the PR.
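
Note on the bounded-list suggestion in this thread: one way it could look is sketched below using only the standard library. Everything here is hypothetical: the boundedPeers type, the connect callback, and the errUnreachable sentinel are illustrative names, and plain strings stand in for peer IDs. The reviewer did not say what should happen when the oldest peer is still reachable; this sketch assumes the new peer is simply dropped in that case.

```go
package main

import "errors"

// errUnreachable is a placeholder error for callers that want to simulate
// a failed connection attempt.
var errUnreachable = errors.New("peer unreachable")

// boundedPeers is a hypothetical FIFO of peer IDs with a fixed capacity
// and an index for duplicate checks.
type boundedPeers struct {
	limit int
	ids   []string            // oldest first
	known map[string]struct{} // same IDs as ids, for O(1) lookup
}

func newBoundedPeers(limit int) *boundedPeers {
	return &boundedPeers{limit: limit, known: make(map[string]struct{})}
}

// add records id and reports whether it was stored. When the list is full,
// it probes the oldest entry with connect: if that fails, the oldest entry
// is evicted and id is appended; if it succeeds, id is dropped (assumption).
func (b *boundedPeers) add(id string, connect func(string) error) bool {
	if b.limit <= 0 {
		return false
	}
	if _, ok := b.known[id]; ok {
		return false // already in the list
	}
	if len(b.ids) >= b.limit {
		oldest := b.ids[0]
		if connect(oldest) == nil {
			return false // oldest peer is still reachable; keep the list as is
		}
		b.ids = b.ids[1:]
		delete(b.known, oldest)
	}
	b.ids = append(b.ids, id)
	b.known[id] = struct{}{}
	return true
}
```

With a limit of 2 and peers "a" and "b" stored, adding "c" evicts "a" only if the probe of "a" returns an error.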

				}
			}
		}
	}()

	logger.Debug("DHT discovery started!")
	return nil
}