
Critical path towards DHT efficiency and performance #345

Open · 5 of 7 tasks
raulk opened this issue May 30, 2019 · 15 comments

Comments

@raulk (Member) commented May 30, 2019

This issue outlines the critical work items on the road towards a solid, robust, and performant DHT implementation.

We could potentially use the brand new libp2p testlab to continuously measure the impact of the changes we make.


If you're willing to help in pushing the DHT to new heights, please comment below ;-)

@aarshkshah1992 (Contributor)

@raulk I've only just started contributing to the DHT and am currently working on correctly implementing DHT bootstrapping. That task, combined with reading the DHT paper and meandering around the codebase, should give me a fairly good idea of the DHT codebase. Please count me in as and when we start focusing on this epic.

@rgrover commented Jul 22, 2019

The routing table is currently a linear collection of kbuckets with increasing common-prefix-lengths, potentially resulting in significant contention for the first few k-buckets.

Assuming that bits is an array-view over bits in own PeerId, the first kbucket in the routing table corresponds to the prefix ~bits[0], the second to bits[0] ~bits[1], and so on. Bucket[0] will be contended by around half of all available peers.

The original Kademlia recommends an LRU eviction policy for buckets filled to capacity, but libp2p-kad-dht only ever evicts disconnected or lost peers. With a value of k set to 20, this means that after learning about 40 peers, nearly half of all new peers are unable to enter the routing-table.

Possible solutions would be to either a) maintain a replacement list alongside each kbucket for nodes which are waiting for entry to the kbucket, b) allow the first few k-buckets to have capacity larger than k, c) evict peers from kbuckets based on some policy (such as LRU or latency), or d) use a recursive tree data-structure (instead of an array) and allow splits (up to a certain depth b) for kbuckets not on the prefix-path of own-PeerId, as suggested in the Kademlia paper.

It would also help to periodically prune unreachable peers out of the routing table in a proactive manner.
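Option (a) above can be sketched in a few lines. The following is an illustrative Python model, not the go-libp2p-kad-dht code; `K`, `REPLACEMENT_CAP`, and all names here are assumptions:

```python
K = 20               # Kademlia's bucket capacity, as used by libp2p-kad-dht
REPLACEMENT_CAP = 5  # assumed size of each bucket's replacement list

def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bits shared by two IDs; selects the k-bucket index."""
    for i, (x, y) in enumerate(zip(a, b)):
        d = x ^ y
        if d:
            return i * 8 + (8 - d.bit_length())
    return min(len(a), len(b)) * 8

class Bucket:
    def __init__(self):
        self.peers = []         # most recently seen last
        self.replacements = []  # waiting room for when the bucket is full

class RoutingTable:
    def __init__(self, self_id: bytes, id_bits: int = 256):
        self.self_id = self_id
        self.buckets = [Bucket() for _ in range(id_bits)]

    def add(self, peer_id: bytes) -> str:
        b = self.buckets[common_prefix_len(self.self_id, peer_id)]
        if peer_id in b.peers:
            b.peers.remove(peer_id)
            b.peers.append(peer_id)   # refresh recency
            return "refreshed"
        if len(b.peers) < K:
            b.peers.append(peer_id)
            return "added"
        if peer_id not in b.replacements and len(b.replacements) < REPLACEMENT_CAP:
            b.replacements.append(peer_id)  # queue instead of dropping outright
            return "queued"
        return "dropped"

    def remove(self, peer_id: bytes) -> None:
        b = self.buckets[common_prefix_len(self.self_id, peer_id)]
        if peer_id in b.peers:
            b.peers.remove(peer_id)
            if b.replacements:        # promote a waiting candidate
                b.peers.append(b.replacements.pop(0))
```

With this shape, a peer lost from a full bucket is immediately replaced from the waiting list instead of leaving a hole, which addresses the "nearly half of new peers can never enter" problem for the low-index buckets.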

@raulk (Member, Author) commented Jul 22, 2019

@aarshkshah1992 – hey, thanks for reaching out! #315 (PR for persisting the routing table) is heading in a good direction and would benefit from somebody pushing it over the finish line. Is this something that catches your attention?

@aarshkshah1992 (Contributor)

@raulk Perfect! Can you please assign it to me? I'll get back with my comments as soon as I've gone through the existing comments and code.

@aarshkshah1992 (Contributor) commented Sep 28, 2019

@raulk What's next for us here? Is there anything I can help with?

@daviddias (Member)

Critical: termination criteria for DHT queries (avoid backtracking unless necessary):

@raulk what does backtracking mean exactly?

@daviddias (Member)

Critical: provider record saturation. Nodes close to popular content get flooded with provider records. Related: #316 libp2p/specs#163. The latter solution has the property that, the more popular the content gets, the more widely advertised it is on the DHT, thus making queries faster.

There are a few more interesting and proven ways to effectively do load balancing in structured P2P networks:

  • Power of Two Choices (Byers et al. 2003) - Uses multiple hash functions to compute different candidate locations for an object and stores it at the least loaded node, while the other candidate nodes store a pointer. This approach is very simple, but it adds overhead when inserting data; a proposed alternative drops the pointers, with the trade-off of increased message overhead at search time.
  • Virtual Servers (Rao et al. 2003) - Introduces the idea of virtualizing the node entity so it can be transferred easily among the machines in the P2P network. It uses two approaches: 'one-to-one', where nodes contact other nodes in the network hoping to shed some load by shifting a virtual server, and 'one-to-many/many-to-many', where a directory of per-node load is built so that a node can query it to find the best match to take over its load. The major issue with virtual servers is the extra work required to maintain the finger tables at each node.
  • Thermal-Dissipation-based Approach (Rieche et al. 2004) - Inspired by the heat expansion process, this algorithm shifts nodes' windows of load responsibility inside the hash ring so that load implicitly flows from a node to its close peers.
  • Simple Address-Space and Item Balancing (Karger & Ruhl 2004) - An iteration on virtual servers: several virtual nodes are assigned to each physical node, only one of which is active at a time, and the active one changes only if a different nodeId distribution in the network yields a more load-balanced hash ring.
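As an illustration of the first item, here is a toy Python sketch of Power of Two Choices placement. The hash salts, node model, and function names are all assumptions for the example, not any libp2p API:

```python
import hashlib

def h(data: bytes, salt: bytes) -> int:
    """One of the scheme's hash functions, derived from SHA-256 with a salt."""
    return int.from_bytes(hashlib.sha256(salt + data).digest(), "big")

def place(key: bytes, value, nodes: list) -> None:
    """Store the value on the less loaded of two candidate nodes; the other
    candidate keeps a pointer so lookups via either hash still succeed."""
    a = h(key, b"h1") % len(nodes)
    b = h(key, b"h2") % len(nodes)
    primary, other = (a, b) if len(nodes[a]) <= len(nodes[b]) else (b, a)
    nodes[primary][key] = ("value", value)
    if other != primary:
        nodes[other][key] = ("pointer", primary)

def lookup(key: bytes, nodes: list):
    """Check both candidate locations; follow a pointer if that's what we find."""
    for idx in (h(key, b"h1") % len(nodes), h(key, b"h2") % len(nodes)):
        entry = nodes[idx].get(key)
        if entry and entry[0] == "value":
            return entry[1]
        if entry and entry[0] == "pointer":
            _, val = nodes[entry[1]][key]
            return val
    return None
```

The insertion-time cost the paper notes is visible here: every `place` touches two nodes (value plus pointer), while the pointer-free variant would instead have to query both locations on every search.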


@daviddias (Member)

Critical: dissociating the routing table from the connection table. Currently when a peer disconnects, we drop it from the routing table. This is not proper Kademlia. #283

True. I believe this is an artifact of operating in both public and NAT'ed networks. The rule of thumb should be more like:

  • When a NAT'ed node loses a connection to a public node, it should preserve that node's info in the routing table, since connecting to it again will be trivial
  • When a public node loses a connection to a NAT'ed node, it should drop it from its routing table, since dialing that node again will potentially be a nightmare
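This rule of thumb reduces to a retention predicate on the peer's reachability. Below is a hypothetical Python sketch, not existing libp2p code; the multiaddr parsing is deliberately crude and the function names are made up:

```python
import ipaddress

def looks_dialable(multiaddrs: list[str]) -> bool:
    """Crude heuristic: a peer counts as directly dialable if it advertises
    at least one non-relay address outside private/reserved IP ranges."""
    for addr in multiaddrs:
        if "/p2p-circuit" in addr:   # relay address: not direct
            continue
        parts = addr.split("/")
        # assume addresses shaped like /ip4/1.2.3.4/tcp/4001
        if len(parts) > 2 and parts[1] in ("ip4", "ip6"):
            try:
                ip = ipaddress.ip_address(parts[2])
            except ValueError:
                continue
            if ip.is_global:
                return True
    return False

def keep_after_disconnect(peer_multiaddrs: list[str]) -> bool:
    """Keep a disconnected peer in the routing table only if re-dialing
    it later is expected to be trivial."""
    return looks_dialable(peer_multiaddrs)
```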

@daviddias (Member)

Critical: participate in the DHT in client-only mode until we receive our first inbound connection via a non-relay transport. This will be tricky, but will help emergent stability and responsiveness. #216

What about transports such as WebRTC? It might be wise to have a double rule: only peers that are "directly dialable" (i.e. that nodes in the network can dial directly without any hole punching, WebRTC STUN, and/or relay dance) should ascend to the mainnet DHT.

Grabbing an old illustration to make the point (not sure if it helps :))

[image]

Nodes should only "ascend" if they prove that they are willing and committed to provide a good service.
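The gating described in the quoted item can be sketched as a tiny state machine. Hypothetical names, not the #216 implementation:

```python
class DHTMode:
    """Start as a DHT client; ascend to server mode only after the first
    inbound connection that arrives over a non-relay transport, which is
    evidence that this node is directly dialable."""

    def __init__(self):
        self.server = False

    def on_inbound_connection(self, via_relay: bool) -> None:
        if not via_relay:
            self.server = True   # proven reachable: ascend
```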

@daviddias (Member)

Important: routing table membership based on peer quality and failure counting. Eject peers who misbehave, present high latency, or fail frequently.

Is there any work on this at the moment?

@yiannisbot

Nodes should only "ascend" if they prove that they are willing and committed to provide a good service.

This points to a score function for nodes, constantly updated based on node performance. There is plenty of related literature, but it needs proper filtering.

Important: routing table membership based on peer quality and failure counting. Eject peers who misbehave, present high latency, or fail frequently.

An idea that has been thrown around off-band is to base routing table membership not only on quality and performance (see above), but also on the latency to the node in the routing entry. This is a form of priority function, where the routing table sorts entries according to node priority. It is more flexible than Coral, as it generalises the priority function and allows including more parameters than RTT alone.
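A toy Python sketch of such a priority function; the metrics and weights below are illustrative assumptions, not tuned values:

```python
from dataclasses import dataclass

@dataclass
class Entry:
    peer_id: str
    rtt_ms: float = 100.0   # measured round-trip time
    failures: int = 0       # failed queries to this peer
    successes: int = 0      # successful queries to this peer

def priority(e: Entry) -> float:
    """Lower is better: latency penalized linearly, failures heavily,
    successes rewarded. This generalizes Coral's RTT-only criterion."""
    return e.rtt_ms + 200.0 * e.failures - 50.0 * e.successes

def sort_bucket(bucket: list[Entry]) -> list[Entry]:
    """Order a bucket's entries best-first by the priority function."""
    return sorted(bucket, key=priority)
```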

@yiannisbot

Important: routing table membership based on peer quality and failure counting. Eject peers who misbehave, present high latency, or fail frequently.

An idea that has been thrown around off-band is to base routing table membership not only on quality and performance (see above), but also on the latency to the node in the routing entry. This is a form of priority function, where the routing table sorts entries according to node priority. It is more flexible than Coral, as it generalises the priority function and allows including more parameters than RTT alone.

Then, thinking about it and digging a bit deeper, it turns out that this approach opens an attack vector against DHT k-bucket routing tables: if nodes sort the entries in their buckets according to performance and return the top entries upon request (because of their top performance), the authors of [*] show that it only takes 2 different IP addresses in different /24 subnets to attack the Ethereum network and eclipse peers. This is because an attacker only needs to occupy the first position in every bucket (which is not expensive to do) - see Section IV and IV.A in particular.

[*] Eclipsing Ethereum Peers with False Friends

@aschmahmann (Contributor) commented Feb 7, 2020

@yiannisbot AFAIK that paper's attack doesn't really apply to us at all. It's based on two assumptions:

  1. PeerIDs are easy to generate/forge
  2. There is a property of a peer that is difficult to forge and only a limited number of peers per bucket can have (i.e. each bucket can have at most one peer with an IP address in a /24 subnet and only 10 buckets may contain an address in that /24 subnet)

If property 1 is true (as is assumed in S/Kademlia) then Kademlia can be Sybil attacked pretty easily without any countermeasures (e.g. expensive peerID generation, reputation, stake, etc.)

However, if you choose to assume (as the ethereum folks did) that property 2 will save you from the problems of property 1, then the problems in the paper end up surfacing. Since we're not currently doing anything special with IP addresses the problems surfaced seem not particularly important (i.e. we have bigger Sybil attack problems to worry about).


Additionally, I'm not really sure I understand why that attack should have worked on the Ethereum network. They are just trying to find some semi random distribution of peers to talk with, so why didn't they limit their peer connections to only 1 per /24 subnet (taking their assumption that having different IP addresses is very expensive as true, which 🤷‍♂)? If they did that then it seems like when an adversarial peer returns k peers very close to the lookup target that they'll all need to be in different /24 subnets which they deemed infeasible.
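The per-/24 limit suggested here can be sketched as a simple admission filter. This is illustrative Python, not an existing libp2p mechanism, and the class name is made up:

```python
import ipaddress
from collections import Counter

class SubnetDiversityFilter:
    """Cap how many admitted peers may share a /24 IPv4 subnet, so an
    eclipse attacker needs addresses in many distinct subnets rather
    than many ports (or many PeerIDs) on one machine."""

    def __init__(self, max_per_subnet: int = 1):
        self.max_per_subnet = max_per_subnet
        self.counts = Counter()

    @staticmethod
    def subnet(ip: str) -> str:
        # collapse an IPv4 address to its /24 network, e.g. 1.2.3.4 -> 1.2.3.0/24
        return str(ipaddress.ip_network(f"{ip}/24", strict=False))

    def allow(self, ip: str) -> bool:
        s = self.subnet(ip)
        if self.counts[s] >= self.max_per_subnet:
            return False          # subnet quota exhausted: reject this peer
        self.counts[s] += 1
        return True
```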

@raulk (Member, Author) commented Jan 27, 2021

EXTRA, EXTRA, most items on this epic are completed! Only these two are left:

  • persisting the routing table
  • exploring alternative data structures

@Stebalien (Member)

@raulk I think this epic is probably safe to close. Is there anything that needs to be broken out into new issues?
