
[ipfs/go-bitswap] Bitswap Strategy Implementation #120

Status: Open · opened by dgrisham on Nov 20, 2019

@dgrisham (Member)

Hi!

A while back, I was working on an implementation of Bitswap strategies for my master's thesis. I took a break from the thesis, but I'm starting back up to finish it in the spring. I left off on a now-outdated version of Bitswap (see https://github.com/dgrisham/go-bitswap, with my implementation specifically on the impl/strategy-prq branch). I've been passively watching the progress of this repo and noticed that a lot of improvements have been made, so I've been wondering: is it in a solid/stable-enough state for me to re-implement strategies in the current version (with periodic rebasing on the changes to come, of course)?

Additionally, I want this update to be useful for actually getting strategies into Bitswap, if that's still one of the goals here. I'll be pretty tight on time with getting the thesis done, but I'd like to have open discussion here as I do this implementation to get feedback from everyone as I go. I know this is mostly a matter of making PRs/issues and starting discussions, but I just wanted to make a note in anticipation of those things.

@dirkmc (Contributor) commented Nov 20, 2019

Hi David, thanks for the heads up, I'm interested to hear more about your work.

We're actively making improvements, and in fact there is a big change coming - we're adding some new message types to the protocol, so as to better disseminate knowledge of block distribution across the network. You can follow our progress in this PR and I recently made a presentation about the changes (slides).

I would suggest waiting until these changes land as they touch almost all of the code base, and are still under heavy development.

Can you provide any information about the research you've been doing?

Thanks,
Dirk

@Stebalien (Member)

There's a pretty important change you'll be interested in: we now keep track of block sizes in our work queue and prioritize peers with the least work "in flight" (the least amount of data being sent via a worker).
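
Roughly, the ordering looks something like this (a simplified illustration, not the actual go-peertaskqueue code):

package main

import (
    "fmt"
    "sort"
)

// peerState tracks how much data is queued up to be sent to a peer.
type peerState struct {
    id         string
    activeWork int // bytes currently being sent to this peer via a worker
}

// nextPeer picks the peer with the least work "in flight", so lightly
// loaded peers get served before heavily loaded ones.
func nextPeer(peers []peerState) peerState {
    sort.Slice(peers, func(i, j int) bool {
        return peers[i].activeWork < peers[j].activeWork
    })
    return peers[0]
}

func main() {
    peers := []peerState{{"A", 4096}, {"B", 512}, {"C", 2048}}
    fmt.Println(nextPeer(peers).id) // B has the least data in flight
}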

@dgrisham (Member, Author) commented Dec 9, 2019

Awesome, thanks for the updates! As for the research, it's generally a foray into implementing support for strategies in Bitswap, testing simple strategies under a few different topologies, etc. There's a bit of theoretical backing, but it'll mostly be simulations (with a numerical-esque analysis in Python) and semi-real-world experiments (using IPTB).

The gateway to all of that is here: https://github.com/dgrisham/masters (note that I have to clean some of this up/update it, since it's been a while).

@dgrisham (Member, Author)

Wanted to follow up on this now that I've gotten a bit more up to speed with the current version of the code.

It looks like a form of 'Bitswap strategies' as described in the original IPFS whitepaper has been implemented (particularly around the calculations in this loop, and the corresponding updating of 'peer tags' here).

My first thought is that it'd be interesting to play with options for the ewma() used to calculate the long/short scores. Do you (@dirkmc, @Stebalien, anyone) have any other thoughts on what might be interesting to vary along these lines?
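
For reference, by ewma() I mean the usual exponentially weighted moving average; a toy version is below, with made-up alpha values standing in for whatever the short/long scores actually use:

package main

import "fmt"

// ewma folds a new sample into the running average; alpha in (0, 1]
// controls how quickly older samples are forgotten.
func ewma(old, sample, alpha float64) float64 {
    return sample*alpha + old*(1-alpha)
}

func main() {
    shortScore, longScore := 0.0, 0.0
    samples := []float64{100, 0, 250, 50} // e.g. bytes exchanged per tick
    for _, s := range samples {
        shortScore = ewma(shortScore, s, 0.5) // reacts quickly to recent behavior
        longScore = ewma(longScore, s, 0.05)  // smooths over a much longer window
    }
    fmt.Printf("short=%.1f long=%.1f\n", shortScore, longScore)
}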

@Stebalien (Member)

FYI, we don't actually preferentially send blocks to better peers (although we could try that). We just avoid disconnecting from them. The goal was to try to avoid disconnecting from peers with which we're currently exchanging data just because some new peers are asking us DHT questions.

I'd love to play with those values but that's likely going to require some real-world usage information. We'd want to see how tuning those values on, e.g., a gateway and/or a popular cluster affects how frequently we need to re-connect to peers.

Really, we could probably just collect bitswap events (we sent/received a block from peer X at time T), limit the maximum number of connections allowed, and then learn, offline, the strategy that minimizes the number of reconnect events based on this information.
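
To make the tagging idea concrete, here's a rough sketch of how a usefulness score could feed a connection-manager tag (the connTagger interface below is purely illustrative; the real code goes through libp2p's connection manager):

package main

import "fmt"

// connTagger is a stand-in for the connection manager's tagging interface:
// higher tag values make a peer less likely to be pruned when we hit the
// connection limit. We never preferentially *send* to these peers; we just
// avoid disconnecting from them.
type connTagger interface {
    TagPeer(peer string, tag string, value int)
}

type logTagger struct{}

func (logTagger) TagPeer(peer, tag string, value int) {
    fmt.Printf("tag %s %s=%d\n", peer, tag, value)
}

// updateTag converts a usefulness score (e.g. the EWMA of bytes exchanged)
// into a tag value, so active transfer partners survive connection pruning.
func updateTag(cm connTagger, peer string, score float64) {
    cm.TagPeer(peer, "bitswap-useful", int(score))
}

func main() {
    updateTag(logTagger{}, "QmPeerA", 42.0)
}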

cc @daviddias this might be of interest to ResNetLab.

@dgrisham (Member, Author) commented Mar 6, 2020

@Stebalien gotcha, thanks for the clarification. I would like to play with allocating more resources (bandwidth/data) to 'better' peers. When I last implemented that (which was on a much older version of Bitswap; here's the relevant PR), I ended up having to diverge from the PRQ implementation quite a bit. This was because the PRQ didn't seem to have a straightforward way to weight how much data each peer should be sent within a given period of time, so it made sense to implement a separate queue for a more natural implementation of this idea.

The current implementation, while more built out, seems to have a similar limitation in this regard. So my question here is (probably for @dirkmc, or @Stebalien): do you see a good way to work this idea into the current implementation? Or would it make sense for me to bring the round robin queue/'strategy prq' ideas into the current version of go-bitswap and test that way?

@Stebalien (Member)

I'd add multiple lanes to https://github.com/ipfs/go-peertaskqueue. With a SetPriority(peer, priority) function, you'd be able to move peers between lanes. This should cost us pretty much nothing in the fast-case where we never call SetPriority.
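
Something along these lines (hypothetical types for illustration, not the actual go-peertaskqueue internals):

package main

import "fmt"

// lanedQueue keeps one FIFO of peers per priority lane; higher lanes are
// drained first. In the fast path nobody calls SetPriority and every peer
// stays in the default lane, so behaviour matches the current single queue.
type lanedQueue struct {
    lanes    [][]string     // lanes[i] holds peer IDs at priority i
    laneOf   map[string]int // current lane for each peer
    numLanes int
}

func newLanedQueue(numLanes int) *lanedQueue {
    return &lanedQueue{
        lanes:    make([][]string, numLanes),
        laneOf:   make(map[string]int),
        numLanes: numLanes,
    }
}

func (q *lanedQueue) Push(peer string) {
    lane := q.laneOf[peer] // zero value: default lane 0
    q.lanes[lane] = append(q.lanes[lane], peer)
}

// SetPriority moves a peer to a different lane; it takes effect the next
// time the peer is pushed (existing entries are left where they are).
func (q *lanedQueue) SetPriority(peer string, priority int) {
    if priority < 0 {
        priority = 0
    }
    if priority >= q.numLanes {
        priority = q.numLanes - 1
    }
    q.laneOf[peer] = priority
}

// Pop serves the highest non-empty lane first.
func (q *lanedQueue) Pop() (string, bool) {
    for i := q.numLanes - 1; i >= 0; i-- {
        if len(q.lanes[i]) > 0 {
            p := q.lanes[i][0]
            q.lanes[i] = q.lanes[i][1:]
            return p, true
        }
    }
    return "", false
}

func main() {
    q := newLanedQueue(3)
    q.SetPriority("good-peer", 2)
    q.Push("peer-a")
    q.Push("good-peer")
    for p, ok := q.Pop(); ok; p, ok = q.Pop() {
        fmt.Println(p) // good-peer is served before peer-a
    }
}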

@daviddias (Member)

> cc @daviddias this might be of interest to ResNetLab.

We are indeed. We documented the "block/graph exchange" as an Open Problem looking for optimizations (https://github.com/ipfs/notes/blob/master/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md), though it is unlikely that we will devote effort to it until the end of Q2. I added this thread to the Open Problem: ipfs/notes@59247c3

@dgrisham (Member, Author)

@Stebalien if I'm understanding that correctly, we'd have N lanes, we'd set the priority of each peer to a value in [1, N], and we'd serve peers in their relative proportions that way. Then the PeerCompare function would further sort peers within those lanes.

That seems good. The other option I see would be to do something similar to the round-robin queue I mentioned: maintain a priority value for each peer and, in a given round, serve each peer until either (1) they've been served their relative proportion for the round or (2) they've been served all of the blocks on their wantlist. I think the primary benefits would be (1) a finer-grained 'peer priority' space and (2) not having to tune N. However, this would be a bigger deviation from the current implementation and would lose the benefits that the current PeerCompare function provides. Your proposed solution seems to strike a nice balance that leverages the existing functionality while introducing strategies around it.

@dgrisham (Member, Author) commented Mar 13, 2020

Follow-up: here's a first pass at implementing that idea: dgrisham/go-peertaskqueue@07f7cf0

(edit: updated commit hash)

@Stebalien (Member) commented Mar 14, 2020 via email

@dgrisham (Member, Author)

Circling back on this now: I actually wasn't thinking too clearly when I wrote that, because it doesn't quite do what I wanted. Calling it Priority was part of the problem, I think; Weight is a better term. That implementation just determines the order in which we serve peers, but doesn't actually send out data over time based on the relative weights.

I'll post again when I have a first pass of an implementation that does that haha.

@dgrisham (Member, Author)

So, I ended up going a slightly different route with the implementation. Instead of using multiple queues/lanes, I just use a single queue and assign a weight to every peer. The idea is that we serve a certain amount of data per round, allocate a portion of that total data to each peer based on their weight, then serve every peer their allocated data. Peer trackers are removed from the queue once their allocated data has been exhausted, and re-added once we've finished a round and recalculated the relative weights. This lets us continue to leverage the heuristic ordering of the queue as it's currently implemented, while also implementing this idea of weighting peers based on their relative contributions.
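
A compressed sketch of that round lifecycle (illustrative types only; the real changes are in the links below):

package main

import "fmt"

// peerTracker is a stand-in for the queue's per-peer state.
type peerTracker struct {
    id        string
    weight    int // derived from the bitswap ledger for this peer
    remaining int // bytes this peer may still be served this round
}

// startRound splits the round's byte budget across peers in proportion to
// their weights and returns the trackers that go (back) into the queue.
func startRound(roundBytes int, trackers []*peerTracker) []*peerTracker {
    total := 0
    for _, t := range trackers {
        total += t.weight
    }
    active := make([]*peerTracker, 0, len(trackers))
    for _, t := range trackers {
        t.remaining = 0
        if total > 0 {
            t.remaining = roundBytes * t.weight / total
        }
        if t.remaining > 0 {
            active = append(active, t)
        }
    }
    return active
}

// serve sends up to n bytes to a peer and removes it from the active set
// once its allocation for the round is exhausted.
func serve(active []*peerTracker, t *peerTracker, n int) []*peerTracker {
    if n > t.remaining {
        n = t.remaining
    }
    t.remaining -= n
    fmt.Printf("served %d bytes to %s, %d left this round\n", n, t.id, t.remaining)
    if t.remaining > 0 {
        return active
    }
    for i, other := range active {
        if other == t {
            return append(active[:i], active[i+1:]...)
        }
    }
    return active
}

func main() {
    a := &peerTracker{id: "A", weight: 3}
    b := &peerTracker{id: "B", weight: 1}
    active := startRound(4096, []*peerTracker{a, b})
    active = serve(active, a, 2048)
    active = serve(active, a, 2048) // A exhausts its 3072-byte allocation
    fmt.Println("peers still in this round:", len(active))
    // once the queue empties, startRound runs again with fresh ledger weights
}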

Here is the updated implementation of peertaskqueue, and here are the relevant changes in go-bitswap to set the peer weights.

Also, to answer your question @Stebalien: the weights are pulled from the bitswap ledger values, so unless I'm missing something, those values are already maintained and are reset whenever the peer is added again (as per the changes in my go-bitswap repo).
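
To be explicit about what "pulled from the ledger values" means, the weight is just a function of the per-peer byte counts, roughly this shape (illustrative, not the exact code in my branch):

package main

import "fmt"

// ledgerReceipt is a stand-in for the per-peer ledger: bytes we've sent to
// and received from a peer.
type ledgerReceipt struct {
    Sent, Recv uint64
}

// peerWeight turns the ledger into a relative weight: peers that have sent
// us more than we've sent them get a larger share of each round.
func peerWeight(r ledgerReceipt) float64 {
    return float64(r.Recv+1) / float64(r.Sent+1) // +1 avoids division by zero
}

func main() {
    fmt.Println(peerWeight(ledgerReceipt{Sent: 1 << 20, Recv: 4 << 20})) // ≈ 4
}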

@Stebalien (Member)

https://github.com/dgrisham/go-peertaskqueue/blob/289e11006d4ff9ba3920203bd89f43fc9d8c5be0/peertaskqueue.go#L209-L212

I'd leave "min work" alone. This is to try to reduce the overhead of sending small messages/packets. If we end up doing a bit more work, that's not the end of the world.


https://github.com/dgrisham/go-peertaskqueue/blob/289e11006d4ff9ba3920203bd89f43fc9d8c5be0/peertaskqueue.go#L138-L149

Something in here will need to update the queue's position in the heap if the weight has changed, right? Or will we just wait for the next round?


Use "work", not "remaining data". In bitswap, work == data, but this isn't true for graphsync.


The code for updating the remaining data looks wrong. It should use the actual amount of work done.


Requesters that go from 0 -> 1 open requests will have to wait until the next round, from what I can tell. That may be fine, but we may also want to consider giving new peers entering a round some kind of prorated quota.


Instead of moving peers off the priority queue, can't we just take remaining data/work into account when sorting? That is, peers with no "remaining data" get pushed to the back of the queue, and when we pop a peer with no remaining work, we start the next round.
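
Concretely, something like this ordering (a sketch only; in practice it would wrap the existing PeerCompare logic):

package main

import (
    "fmt"
    "sort"
)

// tracker is a stand-in for the queue's per-peer entry.
type tracker struct {
    id            string
    remainingWork int // work left in this peer's allocation for the round
    pending       int // placeholder for whatever PeerCompare already uses
}

// less puts peers with remaining work ahead of exhausted ones and only then
// falls through to the existing heuristic. If the peer we pop has no
// remaining work, everyone is exhausted and a new round begins.
func less(a, b *tracker) bool {
    if (a.remainingWork > 0) != (b.remainingWork > 0) {
        return a.remainingWork > 0
    }
    return a.pending > b.pending // stand-in for the current PeerCompare logic
}

func main() {
    peers := []*tracker{
        {"A", 0, 100},
        {"B", 512, 10},
        {"C", 256, 50},
    }
    sort.Slice(peers, func(i, j int) bool { return less(peers[i], peers[j]) })
    for _, p := range peers {
        fmt.Println(p.id, p.remainingWork)
    }
    // C and B sort ahead of the exhausted A; popping A would end the round
}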

@dgrisham (Member, Author)

I'd leave "min work" alone. This is to try to reduce the overhead of sending small messages/packets. If we end up doing a bit more work, that's not the end of the world.

👌 got it, changed

Something in here will need to update the queue's position in the heap if the weight has changed, right? Or will we just wait for the next round.

Yeah, I figured we'd just have it wait until the next round. That's how BitTorrent/others work, and seems sufficient.

Use "work", not "remaining data". In bitswap, work == data, but this isn't true for graphsync.

Ah, for the name? Got it. Should we also rename the PeerTracker.served field I added to work, or something else?

The code for updating the remaining data looks wrong. It should use the actual amount of work done.

Ah, thanks, that was my bad. Is len(out) a reliable way to do that?

Requesters that go from 0 -> 1 open requests will have to wait until the next round, from what I can tell. That may be fine, but we may also want to consider giving new peers entering a round some kind of prorated quota.

Gotcha, yeah we can consider that. Part of the idea was to make rounds short enough to avoid the length starving other peers, but maybe we can just work around that altogether so we can tune the round length however we want.

Instead of moving peers off the priority queue, can't we just take remaining data/work into account when sorting? That is, peers with no "remaining data" get pushed to the back of the queue, when we pop a peer with no remaining work, we start the next round.

I like that, let me try implementing it.

@Stebalien (Member)

> Ah, for the name? Got it. Should we also rename the PeerTracker.served field I added to work, or something else?

Well, I'd call it workDone but isn't it actually workRemaining?

> Ah, thanks, that was my bad. Is len(out) a reliable way to do that?

Currently, PopTasks returns the pending work. We should probably return both the pending work and the amount of work we just popped.
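
Something shaped like this, i.e. a hypothetical sketch of the proposed return values, not the current go-peertaskqueue signature:

package main

import "fmt"

// task is a stand-in for a queued peertask entry.
type task struct {
    topic string
    work  int
}

// popTasks sketches the proposed shape: it returns the tasks it popped, the
// amount of work those tasks represent, and the work still pending for that
// peer, so callers can decrement workRemaining accurately.
func popTasks(queue []task, targetMinWork int) (popped []task, poppedWork, pendingWork int) {
    for len(queue) > 0 && poppedWork < targetMinWork {
        t := queue[0]
        queue = queue[1:]
        popped = append(popped, t)
        poppedWork += t.work
    }
    for _, t := range queue {
        pendingWork += t.work
    }
    return popped, poppedWork, pendingWork
}

func main() {
    q := []task{{"blk1", 256}, {"blk2", 1024}, {"blk3", 512}}
    popped, done, pending := popTasks(q, 1000)
    fmt.Println(len(popped), done, pending) // 2 1280 512
}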

@dgrisham (Member, Author)

Yeah, workRemaining makes sense. I updated PopTasks() as well; the version with all of the discussed changes is here: https://github.com/dgrisham/go-peertaskqueue/blob/feat/peer-priority-lanes/peertaskqueue.go


@dgrisham (Member, Author) commented Jul 6, 2020

I implemented tests for this a little while ago and thought I'd post here to get any thoughts on additional tests. The implementation itself is straightforward, so the tests seem sufficient to me, but I'm happy to hear any other thoughts! They're the TestPeerWeight* tests here.

@Stebalien (Member)

@dirkmc do you have bandwidth to review this? Otherwise, cc @aschmahmann.

@dirkmc (Contributor) commented Jul 6, 2020

@dgrisham just to clarify, what would you like review feedback on? Just the tests? Or do you have a set of changes in a PR that you'd like reviewed?

@dgrisham (Member, Author) commented Jul 7, 2020 via email

@dirkmc (Contributor) commented Jul 7, 2020

Overall the tests look good to me.

I'd suggest that, rather than using random numbers of peers and block sizes, you select a few that test important values (e.g. boundaries) and run the tests with those. That way the tests will be deterministic. Otherwise, tests might pass now but suddenly fail later, and it won't be clear why.

For example:

func testPeerWeightsSingleRoundConstantWeights(t *testing.T, numPeers int, blockSize int) { ... }

func TestPeerWeightsSingleRoundConstantWeights(t *testing.T) {
  numPeers := []int{1, 2, 4, 8, 16}
  blockSizes := []int{1, 2, 4, 8, 16}
  for _, np := range numPeers {
    for _, sz := range blockSizes {
      testPeerWeightsSingleRoundConstantWeights(t, np, sz)
    }
  }
}

@Jorropo changed the title from "Bitswap Strategy Implementation" to "[ipfs/go-bitswap] Bitswap Strategy Implementation" on Jan 27, 2023

@Jorropo transferred this issue from ipfs/go-bitswap on Jan 27, 2023