
[ipfs/go-bitswap] Bitswap Strategy Implementation #120

Status: Open · opened by dgrisham on Nov 20, 2019

@dgrisham (Member)

Hi!

A while back, I was working on an implementation of Bitswap strategies for my master's thesis. I took a break from the thesis, but I'm starting back up to finish it in the spring. I left off on a now-outdated version of Bitswap (see https://github.com/dgrisham/go-bitswap, with my implementation specifically on the impl/strategy-prq branch). I've been passively watching the progress of this repo and noticed that a lot of improvements have been made, so I've been wondering: is it in a solid/stable-enough state for me to re-implement strategies in the current version (with periodic rebasing on the changes to come, of course)?

Additionally, I want this update to be useful for actually getting strategies into Bitswap, if that's still one of the goals here. I'll be pretty tight on time with getting the thesis done, but I'd like to have open discussion here as I do this implementation to get feedback from everyone as I go. I know this is mostly a matter of making PRs/issues and starting discussions, but I just wanted to make a note in anticipation of those things.

@dirkmc (Contributor) commented Nov 20, 2019

Hi David, thanks for the heads up, I'm interested to hear more about your work.

We're actively making improvements, and in fact there is a big change coming - we're adding some new message types to the protocol, so as to better disseminate knowledge of block distribution across the network. You can follow our progress in this PR and I recently made a presentation about the changes (slides).

I would suggest waiting until these changes land as they touch almost all of the code base, and are still under heavy development.

Can you provide any information about the research you've been doing?

Thanks,
Dirk

@Stebalien (Member)

There's a pretty important change you'll be interested in: we now keep track of block sizes in our work queue and prioritize peers with the least work "in flight" (the least amount of data being sent via a worker).
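
Roughly, the ordering looks something like this (a simplified illustration, not the actual go-peertaskqueue code):

package main

import (
    "fmt"
    "sort"
)

// peerState tracks how much data is queued up to be sent to a peer.
type peerState struct {
    id         string
    activeWork int // bytes currently being sent to this peer via a worker
}

// nextPeer picks the peer with the least work "in flight", so lightly
// loaded peers get served before heavily loaded ones.
func nextPeer(peers []peerState) peerState {
    sort.Slice(peers, func(i, j int) bool {
        return peers[i].activeWork < peers[j].activeWork
    })
    return peers[0]
}

func main() {
    peers := []peerState{{"A", 4096}, {"B", 512}, {"C", 2048}}
    fmt.Println(nextPeer(peers).id) // B has the least data in flight
}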

@dgrisham (Member, Author) commented Dec 9, 2019

Awesome, thanks for the updates! As for the research, it's generally a foray into implementing support for strategies in Bitswap, testing simple strategies under a few different topologies, etc. There's a bit of theoretical backing, but it'll mostly be simulations (with a numerical-esque analysis in Python) and semi-real-world experiments (using IPTB).

The gateway to all of that is here: https://github.com/dgrisham/masters (note that I have to clean some of this up/update it, since it's been a while).

@dgrisham (Member, Author)

Wanted to follow up on this now that I've gotten a bit more up to speed with the current version of the code.

It looks like a form of 'Bitswap strategies' as described in the original IPFS whitepaper has been implemented (particularly around the calculations in this loop, and the corresponding updating of 'peer tags' here).

My first thought is that it'd be interesting to play with options for the ewma() used to calculate the long/short scores. Do you (@dirkmc, @Stebalien, anyone) have any other thoughts on what might be interesting to vary along these lines?
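
For reference, by ewma() I mean the usual exponentially weighted moving average; a toy version is below, with made-up alpha values standing in for whatever the short/long scores actually use:

package main

import "fmt"

// ewma folds a new sample into the running average; alpha in (0, 1]
// controls how quickly older samples are forgotten.
func ewma(old, sample, alpha float64) float64 {
    return sample*alpha + old*(1-alpha)
}

func main() {
    shortScore, longScore := 0.0, 0.0
    samples := []float64{100, 0, 250, 50} // e.g. bytes exchanged per tick
    for _, s := range samples {
        shortScore = ewma(shortScore, s, 0.5) // reacts quickly to recent behavior
        longScore = ewma(longScore, s, 0.05)  // smooths over a much longer window
    }
    fmt.Printf("short=%.1f long=%.1f\n", shortScore, longScore)
}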

@Stebalien (Member)

FYI, we don't actually preferentially send blocks to better peers (although we could try that). We just avoid disconnecting from them. The goal was to try to avoid disconnecting from peers with which we're currently exchanging data just because some new peers are asking us DHT questions.

I'd love to play with those values but that's likely going to require some real-world usage information. We'd want to see how tuning those values on, e.g., a gateway and/or a popular cluster affects how frequently we need to re-connect to peers.

Really, we could probably just collect bitswap events (we sent/received a block from peer X at time T), limit the maximum number of connections allowed, and then learn, offline, the strategy that minimizes the number of reconnect events based on this information.
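
To make the tagging idea concrete, here's a rough sketch of how a usefulness score could feed a connection-manager tag (the connTagger interface below is purely illustrative; the real code goes through libp2p's connection manager):

package main

import "fmt"

// connTagger is a stand-in for the connection manager's tagging interface:
// higher tag values make a peer less likely to be pruned when we hit the
// connection limit. We never preferentially *send* to these peers; we just
// avoid disconnecting from them.
type connTagger interface {
    TagPeer(peer string, tag string, value int)
}

type logTagger struct{}

func (logTagger) TagPeer(peer, tag string, value int) {
    fmt.Printf("tag %s %s=%d\n", peer, tag, value)
}

// updateTag converts a usefulness score (e.g. the EWMA of bytes exchanged)
// into a tag value, so active transfer partners survive connection pruning.
func updateTag(cm connTagger, peer string, score float64) {
    cm.TagPeer(peer, "bitswap-useful", int(score))
}

func main() {
    updateTag(logTagger{}, "QmPeerA", 42.0)
}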

cc @daviddias this might be of interest to ResNetLab.

@dgrisham (Member, Author) commented Mar 6, 2020

@Stebalien gotcha, thanks for the clarification. I would like to play with allocating more resources (bandwidth/data) to 'better' peers. When I last implemented that (which was on a much older version of Bitswap; here's the relevant PR), I ended up having to diverge from the PRQ implementation quite a bit. This was because the PRQ didn't seem to have a straightforward way to weight how much data each peer should be sent within a given period of time, so it made sense to implement a separate queue for a more natural implementation of this idea.

The current implementation, while more built out, seems to have a similar limitation in this regard. So my question here is (probably for @dirkmc, or @Stebalien): do you see a good way to work this idea into the current implementation? Or would it make sense for me to bring the round robin queue/'strategy prq' ideas into the current version of go-bitswap and test that way?

@Stebalien (Member)

I'd add multiple lanes to https://github.com/ipfs/go-peertaskqueue. With a SetPriority(peer, priority) function, you'd be able to move peers between lanes. This should cost us pretty much nothing in the fast-case where we never call SetPriority.
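
Something along these lines (hypothetical types for illustration, not the actual go-peertaskqueue internals):

package main

import "fmt"

// lanedQueue keeps one FIFO of peers per priority lane; higher lanes are
// drained first. In the fast path nobody calls SetPriority and every peer
// stays in the default lane, so behaviour matches the current single queue.
type lanedQueue struct {
    lanes    [][]string     // lanes[i] holds peer IDs at priority i
    laneOf   map[string]int // current lane for each peer
    numLanes int
}

func newLanedQueue(numLanes int) *lanedQueue {
    return &lanedQueue{
        lanes:    make([][]string, numLanes),
        laneOf:   make(map[string]int),
        numLanes: numLanes,
    }
}

func (q *lanedQueue) Push(peer string) {
    lane := q.laneOf[peer] // zero value: default lane 0
    q.lanes[lane] = append(q.lanes[lane], peer)
}

// SetPriority moves a peer to a different lane; it takes effect the next
// time the peer is pushed (existing entries are left where they are).
func (q *lanedQueue) SetPriority(peer string, priority int) {
    if priority < 0 {
        priority = 0
    }
    if priority >= q.numLanes {
        priority = q.numLanes - 1
    }
    q.laneOf[peer] = priority
}

// Pop serves the highest non-empty lane first.
func (q *lanedQueue) Pop() (string, bool) {
    for i := q.numLanes - 1; i >= 0; i-- {
        if len(q.lanes[i]) > 0 {
            p := q.lanes[i][0]
            q.lanes[i] = q.lanes[i][1:]
            return p, true
        }
    }
    return "", false
}

func main() {
    q := newLanedQueue(3)
    q.SetPriority("good-peer", 2)
    q.Push("peer-a")
    q.Push("good-peer")
    for p, ok := q.Pop(); ok; p, ok = q.Pop() {
        fmt.Println(p) // good-peer is served before peer-a
    }
}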

@daviddias (Member)

> cc @daviddias this might be of interest to ResNetLab.

We are indeed. We documented the "block/graph exchange" as an Open Problem looking for optimizations (https://github.com/ipfs/notes/blob/master/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md), though it is unlikely that we will devote effort to it until the end of Q2. I added this thread to the Open Problem: ipfs/notes@59247c3

@dgrisham (Member, Author)

@Stebalien if I'm understanding that correctly, we'd have N lanes, we'd set the priority of each peer to a value in [1, N], and we'd serve peers in their relative proportions that way. Then the PeerCompare function would further sort peers within those lanes.

That seems good. The other option I see would be to do something similar to the round-robin queue I mentioned: maintain a priority value for each peer and, in a given round, serve each peer until either (1) they've been served their relative proportion for the round or (2) they've been served all of the blocks on their wantlist. I think the primary benefits would be (1) a finer-grained 'peer priority' space and (2) not having to tune N. However, this would be a bigger deviation from the current implementation and would lose the benefits that the current PeerCompare function provides. Your proposed solution seems to strike a nice balance that leverages the existing functionality while introducing strategies around it.

@dgrisham (Member, Author) commented Mar 13, 2020

Follow-up: here's a first pass at implementing that idea: dgrisham/go-peertaskqueue@07f7cf0

(edit: updated commit hash)

@Stebalien (Member) commented Mar 14, 2020 via email

@dgrisham (Member, Author)

Circling back on this now: I actually wasn't thinking too clearly when I wrote that, because it doesn't quite do what I wanted. Calling it Priority was part of the problem, I think; Weight is a better term. That implementation just determines the order in which we serve peers, but doesn't actually send out data over time based on the relative weights.

I'll post again when I have a first pass of an implementation that does that haha.

@dgrisham (Member, Author)

So, I ended up going a slightly different route with the implementation. Instead of using multiple queues/lanes, I just use a single queue and assign a weight to every peer. The idea is that we serve a certain amount of data per round, allocate a portion of that total data to each peer based on their weight, then serve every peer their allocated data. Peer trackers are removed from the queue once their allocated data has been exhausted, and re-added once we've finished a round and recalculated the relative weights. This lets us continue to leverage the heuristic ordering of the queue as it's currently implemented, while also implementing this idea of weighting peers based on their relative contributions.
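
A compressed sketch of that round lifecycle (illustrative types only; the real changes are in the links below):

package main

import "fmt"

// peerTracker is a stand-in for the queue's per-peer state.
type peerTracker struct {
    id        string
    weight    int // derived from the bitswap ledger for this peer
    remaining int // bytes this peer may still be served this round
}

// startRound splits the round's byte budget across peers in proportion to
// their weights and returns the trackers that go (back) into the queue.
func startRound(roundBytes int, trackers []*peerTracker) []*peerTracker {
    total := 0
    for _, t := range trackers {
        total += t.weight
    }
    active := make([]*peerTracker, 0, len(trackers))
    for _, t := range trackers {
        t.remaining = 0
        if total > 0 {
            t.remaining = roundBytes * t.weight / total
        }
        if t.remaining > 0 {
            active = append(active, t)
        }
    }
    return active
}

// serve sends up to n bytes to a peer and removes it from the active set
// once its allocation for the round is exhausted.
func serve(active []*peerTracker, t *peerTracker, n int) []*peerTracker {
    if n > t.remaining {
        n = t.remaining
    }
    t.remaining -= n
    fmt.Printf("served %d bytes to %s, %d left this round\n", n, t.id, t.remaining)
    if t.remaining > 0 {
        return active
    }
    for i, other := range active {
        if other == t {
            return append(active[:i], active[i+1:]...)
        }
    }
    return active
}

func main() {
    a := &peerTracker{id: "A", weight: 3}
    b := &peerTracker{id: "B", weight: 1}
    active := startRound(4096, []*peerTracker{a, b})
    active = serve(active, a, 2048)
    active = serve(active, a, 2048) // A exhausts its 3072-byte allocation
    fmt.Println("peers still in this round:", len(active))
    // once the queue empties, startRound runs again with fresh ledger weights
}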

Here is the updated implementation of peertaskqueue, and here are the relevant changes in go-bitswap to set the peer weights.

Also, to answer your question @Stebalien: the weights are pulled from the bitswap ledger values, so unless I'm missing something, those values are already maintained and are reset whenever the peer is added again (as per the changes in my go-bitswap repo).
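
To be explicit about what "pulled from the ledger values" means, the weight is just a function of the per-peer byte counts, roughly this shape (illustrative, not the exact code in my branch):

package main

import "fmt"

// ledgerReceipt is a stand-in for the per-peer ledger: bytes we've sent to
// and received from a peer.
type ledgerReceipt struct {
    Sent, Recv uint64
}

// peerWeight turns the ledger into a relative weight: peers that have sent
// us more than we've sent them get a larger share of each round.
func peerWeight(r ledgerReceipt) float64 {
    return float64(r.Recv+1) / float64(r.Sent+1) // +1 avoids division by zero
}

func main() {
    fmt.Println(peerWeight(ledgerReceipt{Sent: 1 << 20, Recv: 4 << 20})) // ≈ 4
}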

@Stebalien (Member)

https://github.com/dgrisham/go-peertaskqueue/blob/289e11006d4ff9ba3920203bd89f43fc9d8c5be0/peertaskqueue.go#L209-L212

I'd leave "min work" alone. This is to try to reduce the overhead of sending small messages/packets. If we end up doing a bit more work, that's not the end of the world.


https://github.com/dgrisham/go-peertaskqueue/blob/289e11006d4ff9ba3920203bd89f43fc9d8c5be0/peertaskqueue.go#L138-L149

Something in here will need to update the queue's position in the heap if the weight has changed, right? Or will we just wait for the next round?


Use "work", not "remaining data". In bitswap, work == data, but this isn't true for graphsync.


The code for updating the remaining data looks wrong. It should use the actual amount of work done.


Requesters that go from 0 -> 1 open requests will have to wait until the next round, from what I can tell. That may be fine, but we may also want to consider giving new peers entering a round some kind of prorated quota.


Instead of moving peers off the priority queue, can't we just take remaining data/work into account when sorting? That is, peers with no "remaining data" get pushed to the back of the queue, and when we pop a peer with no remaining work, we start the next round.
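
Concretely, something like this ordering (a sketch only; in practice it would wrap the existing PeerCompare logic):

package main

import (
    "fmt"
    "sort"
)

// tracker is a stand-in for the queue's per-peer entry.
type tracker struct {
    id            string
    remainingWork int // work left in this peer's allocation for the round
    pending       int // placeholder for whatever PeerCompare already uses
}

// less puts peers with remaining work ahead of exhausted ones and only then
// falls through to the existing heuristic. If the peer we pop has no
// remaining work, everyone is exhausted and a new round begins.
func less(a, b *tracker) bool {
    if (a.remainingWork > 0) != (b.remainingWork > 0) {
        return a.remainingWork > 0
    }
    return a.pending > b.pending // stand-in for the current PeerCompare logic
}

func main() {
    peers := []*tracker{
        {"A", 0, 100},
        {"B", 512, 10},
        {"C", 256, 50},
    }
    sort.Slice(peers, func(i, j int) bool { return less(peers[i], peers[j]) })
    for _, p := range peers {
        fmt.Println(p.id, p.remainingWork)
    }
    // C and B sort ahead of the exhausted A; popping A would end the round
}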

@dgrisham (Member, Author)

I'd leave "min work" alone. This is to try to reduce the overhead of sending small messages/packets. If we end up doing a bit more work, that's not the end of the world.

👌 got it, changed

Something in here will need to update the queue's position in the heap if the weight has changed, right? Or will we just wait for the next round.

Yeah, I figured we'd just have it wait until the next round. That's how BitTorrent/others work, and seems sufficient.

Use "work", not "remaining data". In bitswap, work == data, but this isn't true for graphsync.

Ah, for the name? Got it. Should we also rename the PeerTracker.served field I added to work, or something else?

The code for updating the remaining data looks wrong. It should use the actual amount of work done.

Ah, thanks, that was my bad. Is len(out) a reliable way to do that?

Requesters that go from 0 -> 1 open requests will have to wait until the next round, from what I can tell. That may be fine, but we may also want to consider giving new peers entering a round some kind of prorated quota.

Gotcha, yeah we can consider that. Part of the idea was to make rounds short enough to avoid the length starving other peers, but maybe we can just work around that altogether so we can tune the round length however we want.

Instead of moving peers off the priority queue, can't we just take remaining data/work into account when sorting? That is, peers with no "remaining data" get pushed to the back of the queue, when we pop a peer with no remaining work, we start the next round.

I like that, let me try implementing it.

@Stebalien (Member)

> Ah, for the name? Got it. Should we also rename the PeerTracker.served field I added to work, or something else?

Well, I'd call it workDone but isn't it actually workRemaining?

> Ah, thanks, that was my bad. Is len(out) a reliable way to do that?

Currently, PopTasks returns the pending work. We should probably return both the pending work and the amount of work we just popped.
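
Something shaped like this, i.e. a hypothetical sketch of the proposed return values, not the current go-peertaskqueue signature:

package main

import "fmt"

// task is a stand-in for a queued peertask entry.
type task struct {
    topic string
    work  int
}

// popTasks sketches the proposed shape: it returns the tasks it popped, the
// amount of work those tasks represent, and the work still pending for that
// peer, so callers can decrement workRemaining accurately.
func popTasks(queue []task, targetMinWork int) (popped []task, poppedWork, pendingWork int) {
    for len(queue) > 0 && poppedWork < targetMinWork {
        t := queue[0]
        queue = queue[1:]
        popped = append(popped, t)
        poppedWork += t.work
    }
    for _, t := range queue {
        pendingWork += t.work
    }
    return popped, poppedWork, pendingWork
}

func main() {
    q := []task{{"blk1", 256}, {"blk2", 1024}, {"blk3", 512}}
    popped, done, pending := popTasks(q, 1000)
    fmt.Println(len(popped), done, pending) // 2 1280 512
}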

@dgrisham (Member, Author)

Yeah, workRemaining makes sense. I updated PopTasks() as well; the version with all of the discussed changes is here: https://github.com/dgrisham/go-peertaskqueue/blob/feat/peer-priority-lanes/peertaskqueue.go


@dgrisham (Member, Author) commented Jul 6, 2020

I implemented tests for this a little while ago and thought I'd post here to get any thoughts on additional tests. The implementation itself is straightforward, so the tests seem sufficient to me, but I'm happy to hear any other thoughts! They're the TestPeerWeight* tests here.

@Stebalien (Member)

@dirkmc do you have bandwidth to review this? Otherwise, cc @aschmahmann.

@dirkmc (Contributor) commented Jul 6, 2020

@dgrisham just to clarify, what would you like review feedback on? Just the tests? Or do you have a set of changes in a PR that you'd like reviewed?

@dgrisham (Member, Author) commented Jul 7, 2020 via email

@dirkmc (Contributor) commented Jul 7, 2020

Overall the tests look good to me.

I'd suggest that, rather than using random numbers of peers and block sizes, you select a few that test important values (e.g. boundaries) and run the tests with those. That way the tests will be deterministic. Otherwise, tests might pass now but suddenly fail later, and it won't be clear why.

For example:

func testPeerWeightsSingleRoundConstantWeights(t *testing.T, numPeers int, blockSize int) { ... }

func TestPeerWeightsSingleRoundConstantWeights(t *testing.T) {
  numPeers := []int{1, 2, 4, 8, 16}
  blockSizes := []int{1, 2, 4, 8, 16}
  for _, np := range numPeers {
    for _, sz := range blockSizes {
      testPeerWeightsSingleRoundConstantWeights(t, np, sz)
    }
  }
}

@Jorropo changed the title from "Bitswap Strategy Implementation" to "[ipfs/go-bitswap] Bitswap Strategy Implementation" on Jan 27, 2023

@Jorropo transferred this issue from ipfs/go-bitswap on Jan 27, 2023