This repository has been archived by the owner on Feb 1, 2023. It is now read-only.

Improve Request Sharding #167

Closed
dirkmc opened this issue Aug 2, 2019 · 8 comments
@dirkmc
Contributor

dirkmc commented Aug 2, 2019

#165 outlines session request sharding in Bitswap. Each session maintains a list of peers that have responded to a wantlist request with a block, ordered by latency. Requests are split such that groups of CIDs (wants) are sent to groups of peers. As responses come in, the session adjusts the split factor up or down to maintain a balance between redundancy and too many duplicates.

Responses are processed on a block-by-block basis

The live want queue has 32 slots. Once the queue is full, each time a response block is processed:

  1. one slot in the live want queue opens up (the slot the response block CID occupied)
  2. the slot is filled with one CID (want) from the secondary queue
  3. the CID is sent out as a new want request
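A minimal sketch of that slot-recycling behaviour (type and field names here are illustrative, not the actual go-bitswap types):

```go
package main

import "fmt"

const liveWantLimit = 32 // slots in the live want queue

type session struct {
	liveWants []string // CIDs currently in flight (at most liveWantLimit)
	pending   []string // secondary queue of wants not yet sent
}

// onBlock frees the slot held by the received CID, promotes one pending
// want into the live queue, and returns it so the caller can send it
// out as a new want request.
func (s *session) onBlock(received string) (next string, ok bool) {
	for i, c := range s.liveWants {
		if c == received {
			s.liveWants = append(s.liveWants[:i], s.liveWants[i+1:]...)
			break
		}
	}
	if len(s.pending) == 0 || len(s.liveWants) >= liveWantLimit {
		return "", false
	}
	next, s.pending = s.pending[0], s.pending[1:]
	s.liveWants = append(s.liveWants, next)
	return next, true
}

func main() {
	s := &session{liveWants: []string{"cid1", "cid2"}, pending: []string{"cid3"}}
	next, ok := s.onBlock("cid1")
	fmt.Println(next, ok) // cid3 true
}
```

Note that exactly one want leaves the secondary queue per received block, which is why the splitter only ever sees single-CID groups.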

However, the splitter is designed to work over groups of blocks, not over a single block. So regardless of the split factor, blocks will always be sent to the same peers until the split factor changes.

For example if there are 8 peers (A - H) and the split factor is 3, the peers will be split into three groups. The CID will be sent to the peers in the first group and the other two groups will be disregarded:

cid        ->  A D G
<none>     ->  B E H
<none>     ->  C F

All blocks will be sent to peers A, D and G until the split factor changes. Regardless of the split factor, blocks will always be sent to peer A.
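The grouping above can be reproduced with a round-robin split; with a stream of single CIDs, only the first group ever receives a want (illustrative sketch, not the actual splitter code):

```go
package main

import "fmt"

// splitPeers divides peers into splitFactor groups round-robin, the way
// the splitter distributes groups of CIDs across groups of peers. When
// wants arrive one at a time, only groups[0] is ever used, which is the
// problem described above.
func splitPeers(peers []string, splitFactor int) [][]string {
	groups := make([][]string, splitFactor)
	for i, p := range peers {
		groups[i%splitFactor] = append(groups[i%splitFactor], p)
	}
	return groups
}

func main() {
	peers := []string{"A", "B", "C", "D", "E", "F", "G", "H"}
	groups := splitPeers(peers, 3)
	fmt.Println(groups[0]) // [A D G], the only group a single CID reaches
}
```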

Shard wants on a streaming basis

It would be better for a few reasons if all blocks in a response were processed as a group. However, responses often contain only a small number of blocks, so any solution should handle a single block as well as several blocks.

There is a WIP PR to adjust the peer list order probabilistically, which should mitigate the problem, as the order of peers will change somewhat with each request. However, it may be easier to reason about the problem if wants are instead sharded on a streaming basis.

Sharding should

  • Probabilistically select N peers to send the block to.
    N is analogous to the split factor: it should vary to find a balance between redundancy and too many duplicates.
  • Select peers with low latency and high probability of having the target block.
  • Balance load so peers don't get over-loaded.
  • Ask nearby peers for blocks even if they didn't have any so far.
    Asking nearby peers we are already connected to for a block is a relatively cheap operation. If several peers are downloading the same data, we want them to quickly start downloading from each other so as to
    • reduce load on the "seed" peer and the network as a whole
    • increase bandwidth (download from several peers at once)
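One way to satisfy the first requirement is weighted sampling without replacement; the weights and names below are assumptions for illustration, not part of the proposal:

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickN probabilistically selects up to n distinct peers, weighting
// each peer by a score (e.g. reflecting low latency and likelihood of
// having the target block). Plain weighted sampling without
// replacement: draw one peer at a time and remove it from the pool.
func pickN(peers []string, weights []float64, n int) []string {
	ps := append([]string(nil), peers...)
	ws := append([]float64(nil), weights...)
	var chosen []string
	for len(chosen) < n && len(ps) > 0 {
		total := 0.0
		for _, w := range ws {
			total += w
		}
		r := rand.Float64() * total
		idx := len(ps) - 1 // fallback guards against float rounding
		for i, w := range ws {
			if r -= w; r <= 0 {
				idx = i
				break
			}
		}
		chosen = append(chosen, ps[idx])
		ps = append(ps[:idx], ps[idx+1:]...)
		ws = append(ws[:idx], ws[idx+1:]...)
	}
	return chosen
}

func main() {
	peers := []string{"A", "B", "C", "D", "E", "F", "G", "H"}
	weights := []float64{4, 3, 2, 2, 1, 1, 1, 1}
	fmt.Println(len(pickN(peers, weights, 3))) // 3
}
```

Varying n plays the role of the split factor; the weights encode the latency / block-probability preferences.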
@dirkmc
Contributor Author

dirkmc commented Aug 2, 2019

Asking nearby peers we are already connected to for a block is a relatively cheap operation

@Stebalien do you know if this is true? It seems like it should be but I'm not sure

@Stebalien
Member

Stebalien commented Aug 2, 2019

The live want queue has 32 slots.

IMO, this is the primary issue. This limit should be per-peer. We may also want some kind of global wantlist rate limit, but that's a different issue.

When we run out of want-limit slots for a peer, that peer should be put on hold until a slot opens up.

Asking nearby peers we are already connected to for a block is a relatively cheap operation
@Stebalien do you know if this is true? It seems like it should be but I'm not sure

In general, it is.

@hannahhoward
Contributor

hannahhoward commented Aug 5, 2019

@Stebalien the live want queue should increase, but I think @dirkmc is right that the current approach tends to mean we aren't actually sharding because we only get one new want at a time. And that would be true even if the live wants increased. I like the streaming idea.

I also like the idea of removing the session live want list and tracking wants on a per-peer basis.

@dirkmc
Contributor Author

dirkmc commented Aug 8, 2019

The following proposal takes a step in the direction of a more sophisticated model, but assumes that for now there are no protocol extensions (in #166 we discuss extending the protocol).

The goals are:

  • Limit want requests per-peer
    (rather than per-session)
  • Treat wants as a stream
    So that they can be processed one-by-one (rather than in groups)
  • Move towards a throughput model
    (rather than tracking latency)

Session

Discovery

Follow the current model:

  • broadcast first CID to all connected peers
  • search for providers of first CID (using eg DHT)
  • periodically search for providers of random CID (using eg DHT)

Want Potential

The session orders wants by potential (greatest to lowest), then by FIFO.
The session want potential starts at 2 and varies as the local node receives blocks / duplicates.
The session changes an individual want's potential by

  • -1 when the want is sent to a peer
  • +1 when there's a timeout for the want

Per-peer rate limiting (for session)

Note: This section discusses rate-limiting peers on a per-session basis. Below we discuss rate-limiting requests from all sessions to a peer.

We want to trade-off

  • getting data from responsive nodes
  • spreading requests across many nodes (overall higher bandwidth)

The session increases the rate-limit for a peer exponentially as it receives blocks. When a peer is discovered:

  • the session sets the want limit to N wants for the peer
  • once the session receives a block from the peer, it increases the limit to 2N wants
  • once the session receives N blocks from the peer, it increases the limit to 4N wants, and so on
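The doubling schedule might be sketched as follows; the threshold sequence (1, N, N*N, ...) is one illustrative reading of the "and so on", not something the proposal pins down:

```go
package main

import "fmt"

// wantLimit returns the per-session want limit for a peer after it has
// delivered `received` blocks: N at discovery, 2N after the first
// block, 4N after N blocks, doubling at each successive threshold.
// Requires n > 1 to terminate.
func wantLimit(n, received int) int {
	limit := n
	threshold := 1
	for received >= threshold {
		limit *= 2
		threshold *= n
	}
	return limit
}

func main() {
	fmt.Println(wantLimit(16, 0), wantLimit(16, 1), wantLimit(16, 16)) // 16 32 64
}
```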

Want Stream

For each want, the session selects a peer probabilistically such that the probability

  • increases with blocks received from the peer (for this session)
  • decreases with timeouts from the peer (for this session)
  • decreases with the size of the peer's queue (across sessions)
  • zero if the want is already in the peer's queue (across sessions)
  • zero for peers the session sent the want to in the last T interval (then increases slowly)
  • increases for newly discovered peers (for this session)
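The factors above could be folded into a single selection weight. The constants and field names below are invented for illustration; the issue only fixes the direction each factor pushes the probability:

```go
package main

import "fmt"

// peerStats gathers the inputs for one (session, peer, want) decision.
type peerStats struct {
	blocksReceived int  // blocks this session got from the peer
	timeouts       int  // timeouts this session saw from the peer
	queueLen       int  // peer's want queue size across sessions
	hasWant        bool // want already in the peer's queue
	recentlySent   bool // want sent to this peer within the last T
	newlyFound     bool // peer was just discovered by this session
}

// peerScore returns a non-negative selection weight; zero means
// "never pick this peer for this want right now".
func peerScore(s peerStats) float64 {
	if s.hasWant || s.recentlySent {
		return 0 // never duplicate a live want; back off after a recent send
	}
	score := 1.0 + float64(s.blocksReceived) - 0.5*float64(s.timeouts)
	score -= 0.1 * float64(s.queueLen) // prefer less loaded peers
	if s.newlyFound {
		score += 1 // give newly discovered peers a head start
	}
	if score < 0 {
		return 0
	}
	return score
}

func main() {
	fmt.Println(peerScore(peerStats{hasWant: true}))                                          // 0
	fmt.Println(peerScore(peerStats{blocksReceived: 3}) > peerScore(peerStats{timeouts: 3})) // true
}
```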

Peer Broker

Peers have a limited-size want request queue (the limit applies to all requests from all sessions). The Peer Broker manages the want queue for each peer.

The Peer Broker performs matching when both these conditions are true:

  • there is a peer with free space in its queue
  • there is a session with wants to send

The Peer Broker repeatedly asks the sessions in round robin fashion if they are interested in the currently available peers.

A peer is allowed to send its queued message once either of these is true:

  • the peer's want queue is full
  • there are no sessions with wants to send

Note that this implies that sessions must decide quickly if they are interested in sending a want to one of the candidate peers, so as not to block peers that are waiting to send out a message.
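A sketch of the broker's matching loop; the interfaces here are hypothetical, not the actual go-bitswap API:

```go
package main

import "fmt"

// Session is polled round robin by the broker; it either claims a slot
// on one of the peers with free queue space or declines immediately,
// so peers waiting to send aren't blocked.
type Session interface {
	// Match returns an index into peersWithSpace plus the CID to queue,
	// or ok=false if none of the offered peers suit this session.
	Match(peersWithSpace []string) (peerIdx int, cid string, ok bool)
}

// broker matches wants to peer queue slots until no session is
// interested or no peer has space; it returns "cid->peer" assignments.
func broker(sessions []Session, peerSpace map[string]int) []string {
	var assigned []string
	for {
		var free []string
		for p, n := range peerSpace {
			if n > 0 {
				free = append(free, p)
			}
		}
		matched := false
		for _, s := range sessions {
			if len(free) == 0 {
				break
			}
			if i, cid, ok := s.Match(free); ok {
				p := free[i]
				peerSpace[p]--
				assigned = append(assigned, cid+"->"+p)
				if peerSpace[p] == 0 {
					free = append(free[:i], free[i+1:]...)
				}
				matched = true
			}
		}
		if !matched {
			return assigned // peers can now flush their queued messages
		}
	}
}

// onceSession is a toy session that hands out its CIDs in order.
type onceSession struct{ cids []string }

func (o *onceSession) Match(free []string) (int, string, bool) {
	if len(o.cids) == 0 {
		return 0, "", false
	}
	cid := o.cids[0]
	o.cids = o.cids[1:]
	return 0, cid, true
}

func main() {
	got := broker([]Session{&onceSession{cids: []string{"c1", "c2"}}},
		map[string]int{"A": 1, "B": 1})
	fmt.Println(len(got)) // 2
}
```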

@Stebalien
Member

Want Potential

"Expected wants"?

Per-peer rate limiting (for session)

We'd also like to avoid forcing the peer to maintain a massive wantlist for us. Although I guess that might be the job of the peer broker.

The session increases the rate-limit for a peer exponentially as it receives blocks. When a peer is discovered:

What's the rationale behind this?

decreases with the size of the peer's queue (across sessions)

Note: For the purposes of throughput, separate sessions might as well be separate peers. We can't know how much load other peers are putting on this node, so I wouldn't worry about this that much.

Instead, we should optimize for downloading the content as fast as we can. If we do that, we'll automatically pick the peers with the most available bandwidth.

(Ok, that kind of assumes that these peers are getting their bandwidth maxed out. Doing some local load balancing across peers is still useful, we just shouldn't focus on it too hard).

zero if the want is already in the peer's queue (across sessions)

This should be 1.

zero for peers the session sent the want to in the last T interval (then increases slowly)

?

increases for newly discovered peers (for this session)

You mean starts higher?

Peer Broker

I think this section mixes peer/session a few times.

The Peer Broker repeatedly asks the sessions in round robin fashion if they are interested in the currently available peers.

That sounds really wasteful.

@Stebalien
Member

There are a lot of knobs here that could behave in ways we're not expecting. When possible, let's try to map them directly onto desired effects. Ideally, each one would follow the following rule:

  1. Do X
  2. Throughput increases, do more of X.
  3. Throughput decreases, do less of X.
  4. Throughput stays the same, do Y (which may be X in some cases).

"Want Potential" does this really well: increase duplicate requests until we start getting duplicate responses, then decrease.

Let's reason through how all the other knobs correlate to throughput (or fairness).
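The four-step rule above reads like a hill-climbing controller; a minimal sketch, purely illustrative:

```go
package main

import "fmt"

// adjust pushes a knob in the direction that improved throughput and
// reverses direction when throughput drops, per the rule above. On no
// change it tries something different (here: a smaller step).
func adjust(knob, delta, prev, cur float64) (newKnob, newDelta float64) {
	switch {
	case cur > prev: // throughput increased: keep doing X
		return knob + delta, delta
	case cur < prev: // throughput decreased: do less of X
		return knob - delta, -delta
	default: // unchanged: do Y (here, a half-size probe step)
		return knob + delta/2, delta / 2
	}
}

func main() {
	knob, delta := 2.0, 1.0
	knob, delta = adjust(knob, delta, 10, 15) // improved: knob moves to 3
	knob, delta = adjust(knob, delta, 15, 12) // worse: step back and reverse
	fmt.Println(knob, delta) // 2 -1
}
```

"Want Potential" instantiates this with X = "send duplicate requests" and throughput measured by blocks vs. duplicates received.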

@dirkmc
Contributor Author

dirkmc commented Aug 11, 2019

The session increases the rate-limit for a peer exponentially as it receives blocks

What's the rationale behind this?

If we don't rate-limit per session, then the session may request all its blocks from the first peer that responds. For example in a scenario where the per-peer rate-limit is 128, and there haven't been any requests yet:

  1. The session broadcasts a want for the root block CID
  2. Peer A responds after 100ms
  3. The session sends a want for the 20 children of the root block to Peer A
  4. Peer B responds after 200ms

Perhaps in this scenario overall throughput would have been higher if the local node requested 10 blocks from Peer A and 10 blocks from Peer B. So I think we may need to have two rate-limits:

  • per peer for each session (increasing as blocks come in)
  • per peer across sessions (so we can max out the peers and send to the ones that free up space in their queues fastest)

I proposed an exponentially increasing session rate-limit per-peer as a way to try to spread the requests across peers in the discovery stage of a session's requests, then quickly move to optimizing for the fastest responding peers as we move out of discovery. I'm open to other ways of doing this.

For each want, the session selects a peer probabilistically such that the probability

  • zero if the want is already in the peer's queue (across sessions)
  • zero for peers the session sent the want to in the last T interval (then increases slowly)

Here I'm suggesting that

  • if the want is already in the queue for a peer, then the probability we choose to send the want to that peer (again) should be zero
  • if the want is not in the peer's queue, but we recently sent this want to the peer and got a timeout (eg in the last second) then the probability of choosing the peer for the want should be zero for some time interval T (eg 5 x latency). After that time interval, the probability should slowly increase.
increases for newly discovered peers (for this session)

You mean starts higher?

Yes. My thinking is that if nearby peers are downloading the same file from the seed, we should send wants to them so as to maximize throughput.
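The hold-off-then-ramp behaviour described a few lines up (probability zero for an interval T after a timeout, e.g. 5x latency, then a slow increase) could look like the following; the linear ramp and its duration are assumptions:

```go
package main

import (
	"fmt"
	"time"
)

// sendProbability returns the probability weight for re-sending a want
// to a peer after a timeout: zero during the hold-off interval T
// (5x the peer's latency, per the example in the comment), then ramping
// linearly back up to the base weight over another T.
func sendProbability(sinceTimeout, latency time.Duration, base float64) float64 {
	t := 5 * latency // the "T" interval from the comment
	if sinceTimeout < t {
		return 0
	}
	ramp := float64(sinceTimeout-t) / float64(t)
	if ramp > 1 {
		ramp = 1
	}
	return base * ramp
}

func main() {
	lat := 100 * time.Millisecond
	fmt.Println(sendProbability(200*time.Millisecond, lat, 1.0)) // 0 (still in hold-off)
	fmt.Println(sendProbability(1*time.Second, lat, 1.0))        // 1 (fully recovered)
}
```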

The Peer Broker repeatedly asks the sessions in round robin fashion if they are interested in the currently available peers.

That sounds really wasteful.

The idea is that we want to process blocks as a stream (one-by-one). The Peer Broker knows how much space each peer has, and repeatedly asks each Session if it would like to send to any of the peers with free space. As the want lists for the peers fill up, the list of candidate peers will change. The peer can send its message only once no more sessions want blocks, or once a peer's want list is full (so that we're not sending lots of messages with a single want).

Let's reason through how all the other knobs correlate to throughput (or fairness).

Sounds like a good approach 👍

@dirkmc
Contributor Author

dirkmc commented Mar 4, 2020

ipfs/kubo#6782

dirkmc closed this as completed Mar 4, 2020