
Performance Testing #776

Closed
benjchristensen opened this issue Jan 22, 2014 · 15 comments

@benjchristensen
Member

I would like to integrate performance testing as a first-class aspect of rxjava-core in https://github.com/Netflix/RxJava/tree/master/rxjava-core/src/perf

One option is Google Caliper: https://code.google.com/p/caliper/
Another is JMH: http://openjdk.java.net/projects/code-tools/jmh/

Of potential interest, Netty uses JMH: http://netty.io/wiki/microbench-module.html

I have placed some very simple, manual performance tests in the /src/perf folders for now but I'd like to establish the tooling and a few solid examples so we have a pattern to follow.
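To give a flavor of what such a manual test looks like before proper tooling is in place, here is a hedged, self-contained sketch of a hand-rolled micro-benchmark; the class and method names are made up for illustration and are not the actual /src/perf code:

```java
// A minimal hand-rolled micro-benchmark sketch (illustrative only).
public class ManualBenchmark {
    // Work under test: sum 0..n-1. Returning the result keeps the JIT
    // from dead-code-eliminating the loop.
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) sum += i;
        return sum;
    }

    // Times 'iterations' calls and returns average nanoseconds per call.
    // The accumulated checksum is printed so the result stays "live".
    static double timeAverageNanos(int n, int iterations) {
        long checksum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) checksum += work(n);
        long elapsed = System.nanoTime() - start;
        System.out.println("checksum=" + checksum);
        return elapsed / (double) iterations;
    }

    public static void main(String[] args) {
        // Warm up first so the JIT compiles the hot loop before measuring.
        timeAverageNanos(1024, 10_000);
        double avg = timeAverageNanos(1024, 10_000);
        System.out.printf("avg %.1f ns/op%n", avg);
    }
}
```

Frameworks like JMH and Caliper exist precisely because hand-rolled harnesses like this get warm-up, dead-code elimination, and measurement overhead subtly wrong.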

@benjchristensen
Member Author

/cc @abersnaze as you've been involved in these discussions and you're researching Google Caliper.

@gvsmirnov
Contributor

I would very much recommend using JMH, not Caliper. The latter has many issues, which are addressed in the former. Here's a great presentation about it.

@benjchristensen
Member Author

Thank you for weighing in and sharing that presentation; I just read through it and found it very interesting. Can you point to anything specific about the issues with Caliper?

@headinthebox
Contributor

@gvsmirnov JMH looks technically pretty impressive, but it seems not to integrate as nicely as Caliper into an IDE workflow. I could only find some very brief comments about IntelliJ integration on the web; do you know more? Also, as @benjchristensen says, the presentation is super interesting but does not answer the question of why Caliper is not a good choice.

A side question about all this benchmarking stuff is how much it relates to performance in production. That is, when running the benchmarks you measure things in a very specific way, but in production the code runs in a completely different environment. It sometimes feels to me like measuring calories using a http://en.wikipedia.org/wiki/Calorimeter, which does not really correspond to the actual digestion of food. To state it more formally: is benchmarking monotonic? In other words, does Benchmark(A) < Benchmark(B) imply that InProduction(A) < InProduction(B)?

@gvsmirnov
Contributor

@benjchristensen Unfortunately, there is no article/presentation/whatever that I know of which explicitly points out all the pitfalls of Caliper. But for most of the common problems (outlined in the presentation), Caliper has no built-in means to work around them (the last time I checked, at least). The most broken thing about Caliper is that it falls victim to loop unrolling. See here.

JMH is all about taking the trouble off our shoulders, especially the trouble we do not even suspect exists. Many things that are hard to implement in Caliper (like this and that and that) are easy to do in JMH.

@headinthebox Now, regarding IDE support, there is indeed next to none. But I personally hardly ever use an IDE for things like running tests or working with VCSs; command-line utilities work fine for me. And JMH's are much better than your average CLI tool.
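The dead-code-elimination pitfall mentioned above can be made concrete with a plain-Java sketch (no JMH required; names are illustrative, not from any real benchmark): a harness that discards its result leaves the JIT free to remove the measured work entirely, which is exactly what JMH's Blackhole.consume guards against.

```java
// Illustrates why naive harnesses mislead: discarded results invite DCE.
public class DcePitfall {
    static double compute(double x) { return Math.log(x + 1.0); }

    // Broken harness: the result is discarded, so the JIT may eliminate
    // the call and the loop ends up timing an empty body.
    static long timeDiscarding(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) compute(i);
        return System.nanoTime() - start;
    }

    // Safer harness: the result escapes the loop via a field, keeping the
    // work "live" -- the role JMH's Blackhole.consume plays.
    static double sink;
    static long timeConsuming(int iterations) {
        double acc = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) acc += compute(i);
        long elapsed = System.nanoTime() - start;
        sink = acc; // keep the accumulated value live
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("discarding: " + timeDiscarding(n) + " ns");
        System.out.println("consuming:  " + timeConsuming(n) + " ns");
    }
}
```

After JIT warm-up the discarding variant can report numbers far smaller than the real cost of compute, which is the kind of silent failure JMH is designed to prevent.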

@gvsmirnov
Contributor

I have just started a mechanical-sympathy thread that discusses this subject. There will probably be a lot of info there in a couple of days.

@benjchristensen
Member Author

Thank you @gvsmirnov for the information. This is something I hope we'll make a first-class aspect of RxJava in the near future and your information will really help.

Are you interested in helping us bootstrap RxJava with JMH? The rxjava-core/src/perf/ code is wide open right now, so we can set it up correctly.

@gvsmirnov
Contributor

@benjchristensen I most definitely am. There are some spare time issues at the moment, though, so I don't think I will be able to contribute for a couple of weeks. Afterwards, I would be happy to.

@benjchristensen
Member Author

I understand that problem! Once you have some time I'd appreciate your help to get us started down the right path.

@abersnaze mentioned this issue Feb 7, 2014
@abersnaze
Contributor

Some observations on the differences now that I've actually used both of them:

Caliper
PROS

  • It measures object count and memory usage as well as time.
  • It makes clear that it is monitoring JIT and GC events during the timing.
  • Parameter annotations make it easier to test different configurations without generating a method for each combination manually.

CONS

  • Warm-up is a bit of a black box. I've seen the warning that it detected JIT activity during measurement often enough to make me think it isn't doing enough to warm up the code.
  • It uploads the results!

P.S. I'm not an expert in either benchmarking tool.

@gvsmirnov
Contributor

@benjchristensen Sorry it took me so long, but I'm finally back. I've thrown together a sample gradle project with JMH support here. Hoping to integrate it with RxJava real soon.

@gvsmirnov
Contributor

Oh, finally! I have sent a pull request (#963) with the updated JMH benchmarking. It features changes both to the gradle setup and to the benchmark itself.

The gradle setup is explained in this blog post.
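For context, wiring JMH into a Gradle build of that era typically meant declaring a separate source set plus the JMH dependencies. The snippet below is a hypothetical sketch only; the actual configuration from the blog post and PR #963 may differ, and the version number is a placeholder from that period:

```groovy
// Hypothetical sketch: a 'perf' source set compiled against main,
// with JMH core and its annotation processor on the classpath.
sourceSets {
    perf {
        java.srcDir 'src/perf/java'
        compileClasspath += sourceSets.main.output
        runtimeClasspath += sourceSets.main.output
    }
}

dependencies {
    perfCompile 'org.openjdk.jmh:jmh-core:0.5.6'
    perfCompile 'org.openjdk.jmh:jmh-generator-annprocess:0.5.6'
}
```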

The benchmark is changed in such a way that prevents most of the caveats (like dead-code elimination) from happening, while also ensuring that more accurate results are attained. Please consult these samples to gain deeper insight into how benchmarking should be done with JMH.

Here are the results that I got on my Haswell 2.6 GHz 16 GB RAM laptop with Java 8:

Benchmark                                  (size)   Mode   Samples         Mean   Mean error    Units
r.o.ObservableBenchmark.measureBaseline         1   avgt        10        0.003        0.000    us/op
r.o.ObservableBenchmark.measureBaseline      1024   avgt        10        2.764        0.051    us/op
r.o.ObservableBenchmark.measureBaseline   1048576   avgt        10     3104.088       49.586    us/op
r.o.ObservableBenchmark.measureMap              1   avgt        10        0.100        0.003    us/op
r.o.ObservableBenchmark.measureMap           1024   avgt        10        5.036        0.059    us/op
r.o.ObservableBenchmark.measureMap        1048576   avgt        10     6693.271      277.604    us/op

What we see here is that doing nothing via RxJava introduces about a 2x overhead in latency compared to simply doing nothing. Pretty acceptable if you ask me.

@benjchristensen
Member Author

This is great, @gvsmirnov. Thank you!

Is there a way to maintain historical snapshots over time for getting performance diffs?

@gvsmirnov
Contributor

@benjchristensen you're very welcome.

Hm, I'm not exactly sure whether there is an established practice for that. You can easily get JMH to output its results in csv, scsv or json, though; it should not be a long way from there.

What I'm doing is: before merging anything to master, run the benchmarks on master and on the branch. Works fine for me.
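On the historical-diff idea: since JMH can emit machine-readable results, a small tool can compare the means from two runs. The sketch below is a hypothetical illustration in Java; the two-column "name,mean" format is a deliberate simplification, as real JMH CSV output carries more columns:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BenchDiff {
    // Parses lines of "benchmark,mean" (a simplified stand-in for the
    // CSV that JMH can emit; real files have more columns).
    static Map<String, Double> parse(List<String> lines) {
        Map<String, Double> means = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            means.put(parts[0], Double.parseDouble(parts[1]));
        }
        return means;
    }

    // Relative change per benchmark: negative means the branch got faster.
    static Map<String, Double> diff(Map<String, Double> master,
                                    Map<String, Double> branch) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : master.entrySet()) {
            Double after = branch.get(e.getKey());
            if (after != null) {
                out.put(e.getKey(), (after - e.getValue()) / e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> master = parse(List.of("measureMap:1024,5.036"));
        Map<String, Double> branch = parse(List.of("measureMap:1024,4.532"));
        System.out.println(diff(master, branch));
    }
}
```

Archiving each run's output file per commit and feeding pairs through a tool like this would give the historical performance diffs asked about above.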

@benjchristensen
Member Author

We have JMH integrated and being used so closing this. Thank you @gvsmirnov for your help on this!

This issue was closed.