Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of Blockstore.AllKeysChan() #3376

Closed
kevina opened this issue Nov 13, 2016 · 7 comments
Closed

Improve performance of Blockstore.AllKeysChan() #3376

kevina opened this issue Nov 13, 2016 · 7 comments
Labels
topic/perf Performance

Comments

@kevina
Copy link
Contributor

kevina commented Nov 13, 2016

@whyrusleeping asked me to look into improving the performance of Blockstore.AllKeysChan. I was able to improve performance by around an order by implementing a series of fairly simple fixes.

Initial results for around 100,000 small (1000 byte) blocks: 3095.821125 ± 98.089091 ms

After fixing channel buffer size in namespace datastore (ipfs/go-datastore#52): 1428.792103 ± 37.367283 ms

After removing expensive debug reporting in the blockstore (#3378): 529.585905 ± 18.348316 ms

After using the optimized Walk in flatfs (ipfs/go-ds-flatfs#3): 355.931681 ± 7.984393 ms

Avoid doing unnecessary work in conversion to Cid (#3379): 357.190799 ± 9.451660 ms

When measuring real time and that is about all the improvements I can get on a dual core (with hyper-threading making 4 virtual cores) system. (I am also getting 265% CPU time on the last two runs so it could also be the hyper-threading in this case is slowing things down.) So I reran the last two tests while pinning the process to a single CPU (using taskset on linux).

After using the optimized Walk in flatfs (ipfs/go-ds-flatfs#3): 520.779686 ± 2.332627 ms

Avoid doing unnecessary work in conversion to Cid (#3379): 494.916185 ± 9.780639 ms

So the last optimization has a small but measurable improvement on a single core system.

Possible future improvements

The changes for the performance boosts so far where self contained and hopefully be uncontroversial. Although not part of this issue further improvement is likely possible. Here some additional results that measured the theoretical performance gains we can still get, but will likely involve some API changes.

Single CPU Results:

Results from configuring the blockstore to query the flatfs datastore directly: 295.838959 ± 4.161743 ms

Results from directly querying the Datastore directly and converting the key to Cid: 255.059754 ± 5.210151 ms

Results from directly querying the Datastore directly with no conversion to Cid: 151.604514 ± 1.888088 ms

This seams to imply the conversion to Cid takes up a lot of CPU time, but see below.

Multi CPU Results:

Results from configuring the blockstore to query the flatfs datastore directly: 208.996977 ± 10.234978 ms

Results from directly querying the Datastore directly and converting the key to Cid: 190.639903 ± 10.407054 ms

Results from directly querying the Datastore directly with no conversion to Cid: 185.266868 ± 8.534578 ms

The Conversion to Cid on Cid on a multi-core system may take a lot of CPU time, but does not effect the overall runtime thanks to the use of goroutines.

@kevina kevina changed the title WIP: Improving performance of Blockstore.AllKeysChan() report Improving performance of Blockstore.AllKeysChan() report Nov 14, 2016
@kevina kevina changed the title Improving performance of Blockstore.AllKeysChan() report Improve performance of Blockstore.AllKeysChan() report Nov 14, 2016
@kevina kevina changed the title Improve performance of Blockstore.AllKeysChan() report Improve performance of Blockstore.AllKeysChan() Nov 14, 2016
@kevina kevina added this to the Improve Performance milestone Nov 14, 2016
@kevina kevina added the topic/perf Performance label Nov 14, 2016
@kevina
Copy link
Contributor Author

kevina commented Nov 16, 2016

All of my proposed changes have been merged. With some additional work I was able to improve the time even more:

Improvement Time Multi Core Time Single CPU
Original 340 490
Make sure the channel buffer size is 128 for all channels in the pipeline ipfs/go-datastore#55 245 465
Avoid unnecessary strings in datastore key manipulation ipfs/go-datastore#56 205 320
(if the blockstore was configured to query the flatfs datastore directly) 200 290

(Note that due to a different load on my system the numbers are slightly different from the previous run.)

@whyrusleeping
Copy link
Member

@kevina so the final result will be from roughly 3000ms to 205ms ?

@kevina
Copy link
Contributor Author

kevina commented Nov 16, 2016

@whyrusleeping Yes, More or Less.

To get a really accurate number I would need to rebuild the initial binary and rerun the benchmark code for the original and final several times to normalize the effect of the load on my system.

So the speed up is well over an order of magnate and somewhere between 14 and 15 times.

@whyrusleeping
Copy link
Member

@kevina thats awesome :)

@kevina
Copy link
Contributor Author

kevina commented Nov 17, 2016

@whyrusleeping Thanks, once ipfs/go-datastore#56 is merged and the dependencies are updated than I will consider this issue done and we can close this.

Not using a channel for datastores queries and other possible improvements can be dealt with separately.

@Kubuxu
Copy link
Member

Kubuxu commented Nov 19, 2016

Do you think that batching keys in the call (sending slices instead of keys) would help?

@whyrusleeping whyrusleeping added the status/ready Ready to be worked label Nov 28, 2016
@kevina
Copy link
Contributor Author

kevina commented Dec 5, 2016

All my changes are now in. I consider this done. Closing.

@kevina kevina closed this as completed Dec 5, 2016
@kevina kevina removed the status/ready Ready to be worked label Dec 5, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic/perf Performance
Projects
None yet
Development

No branches or pull requests

4 participants
@whyrusleeping @kevina @Kubuxu and others