Consider RegressionTests.jl and Chairmarks.jl for benchmarking #753

dmbates · 2024-03-10T16:36:22Z

Initially this branch just provides a bench/runbenchmarks.jl that uses Chairmarks.jl.
The method is to construct a table of dataset names, formulas, and the number of seconds to run the benchmark for each combination. This design is preliminary.
Eventually we may consider using RegressionTests.jl in CI, but that package seems best suited to micro-benchmarks.
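
The table-driven approach described above could look roughly like the following. This is a minimal sketch, not the contents of bench/runbenchmarks.jl: `runbmrk_sketch` is a hypothetical stand-in for the `runbmrk` function, and a plain vector of named tuples stands in for whatever table type the script uses. It assumes Chairmarks.jl and MixedModels.jl are installed, and that the `seconds` keyword of `@b` is the intended per-row time budget.

```julia
using Chairmarks    # provides the @b macro
using MixedModels   # provides fit, MixedModel, @formula and MixedModels.dataset

# Each row: dataset name, time budget in seconds, model formula.
# The column names dsnm, secs and frm follow the tables shown in this thread.
tbl = [
    (dsnm = :dyestuff, secs = 0.1, frm = @formula(yield ~ 1 + (1 | batch))),
    (dsnm = :sleepstudy, secs = 0.1, frm = @formula(reaction ~ 1 + days + (1 + days | subj))),
]

# Hypothetical helper (an assumption, not the PR's runbmrk):
# benchmark one model fit per row, spending about row.secs seconds on each.
function runbmrk_sketch(tbl)
    map(tbl) do row
        dat = MixedModels.dataset(row.dsnm)
        # @b returns a single Chairmarks.Sample summarizing the run
        bmk = @b fit(MixedModel, row.frm, dat; progress=false) seconds=row.secs
        (; row.dsnm, bmk, row.frm)
    end
end

res = runbmrk_sketch(tbl)
```

Keeping the time budget in the table lets cheap models run for a fraction of a second while expensive ones get enough time for at least one complete fit.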

dmbates · 2024-03-10T16:40:37Z

On an M1 MacBook Pro, the input table and the results were:

Table with 3 columns and 19 rows:
      dsnm        secs  frm
    ┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 1  │ dyestuff2   0.1   yield ~ 1 + :(1 | batch)
 2  │ dyestuff    0.1   yield ~ 1 + :(1 | batch)
 3  │ machines    0.1   score ~ 1 + :(1 | Worker) + :(1 | Machine)
 4  │ pastes      0.1   strength ~ 1 + :(1 | batch & cask)
 5  │ pastes      0.1   strength ~ 1 + :(1 | batch / cask)
 6  │ penicillin  0.1   diameter ~ 1 + :(1 | plate) + :(1 | sample)
 7  │ sleepstudy  0.1   reaction ~ 1 + days + :(1 | subj)
 8  │ sleepstudy  0.1   reaction ~ 1 + days + :(zerocorr((1 + days) | subj))
 9  │ sleepstudy  0.1   reaction ~ 1 + days + :(1 | subj) + :((0 + days) | subj)
 10 │ sleepstudy  0.1   reaction ~ 1 + days + :((1 + days) | subj)
 11 │ kb07        0.1   :(log(rt_trunc)) ~ 1 + spkr + prec + load + :(1 | subj) + :(1 | item)
 12 │ kb07        0.1   :(log(rt_trunc)) ~ 1 + spkr + prec + load + spkr & prec + spkr & load + prec & load + spkr & prec & load + :(1 | …
 13 │ mrk17_exp1  1.0   :(1000 / rt) ~ 1 + F + P + Q + lQ + lT + F & P + F & Q + P & Q + F & lQ + P & lQ + Q & lQ + F & lT + P & lT + Q &…
 14 │ insteval    5.0   y ~ 1 + service + dept + service & dept + :(1 | s) + :(1 | d)
 15 │ insteval    5.0   y ~ 1 + service + :(1 | s) + :(1 | d) + :(1 | dept)
 16 │ kb07        5.0   :(log(rt_trunc)) ~ 1 + spkr + prec + load + spkr & prec + spkr & load + prec & load + spkr & prec & load + :((1 +…
 17 │ mrk17_exp1  25.0  :(1000 / rt) ~ 1 + F + P + Q + lQ + lT + F & P + F & Q + P & Q + F & lQ + P & lQ + Q & lQ + F & lT + P & lT + Q &…
 18 │ d3          25.0  y ~ 1 + u + :((1 + u) | g) + :((1 + u) | h) + :((1 + u) | i)
 19 │ ml1m        25.0  y ~ 1 + :(1 | g) + :(1 | h)

julia> res = runbmrk(tbl)
Table with 3 columns and 19 rows:
      dsnm        bmk                                                          frm
    ┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 1  │ dyestuff2   Sample(time=8.2583e-5, allocs=1039, bytes=53040)             yield ~ 1 + :(1 | batch)
 2  │ dyestuff    Sample(time=8.3209e-5, allocs=1044, bytes=53136)             yield ~ 1 + :(1 | batch)
 3  │ machines    Sample(time=0.000216375, allocs=2029, bytes=102688)          score ~ 1 + :(1 | Worker) + :(1 | Machine)
 4  │ pastes      Sample(time=0.000117083, allocs=1477, bytes=91336)           strength ~ 1 + :(1 | batch & cask)
 5  │ pastes      Sample(time=0.000245833, allocs=2391, bytes=130592)          strength ~ 1 + :(1 | batch / cask)
 6  │ penicillin  Sample(time=0.000349334, allocs=2875, bytes=155280)          diameter ~ 1 + :(1 | plate) + :(1 | sample)
 7  │ sleepstudy  Sample(time=0.000109375, allocs=1187, bytes=95368)           reaction ~ 1 + days + :(1 | subj)
 8  │ sleepstudy  Sample(time=0.000215125, allocs=1753, bytes=128544)          reaction ~ 1 + days + :(zerocorr((1 + days) | subj))
 9  │ sleepstudy  Sample(time=0.000246875, allocs=2044, bytes=170896)          reaction ~ 1 + days + :(1 | subj) + :((0 + days) | subj)
 10 │ sleepstudy  Sample(time=0.000618667, allocs=2490, bytes=142272)          reaction ~ 1 + days + :((1 + days) | subj)
 11 │ kb07        Sample(time=0.00159417, allocs=12691, bytes=1293768)         :(log(rt_trunc)) ~ 1 + spkr + prec + load + :(1 | subj) + …
 12 │ kb07        Sample(time=0.00517542, allocs=15926, bytes=2628872)         :(log(rt_trunc)) ~ 1 + spkr + prec + load + spkr & prec + …
 13 │ mrk17_exp1  Sample(time=0.0700094, allocs=161656, bytes=118467680, gc_…  :(1000 / rt) ~ 1 + F + P + Q + lQ + lT + F & P + F & Q + P…
 14 │ insteval    Sample(time=0.297278, allocs=299482, bytes=141469552, gc_f…  y ~ 1 + service + dept + service & dept + :(1 | s) + :(1 |…
 15 │ insteval    Sample(time=0.579619, allocs=303283, bytes=53943424)         y ~ 1 + service + :(1 | s) + :(1 | d) + :(1 | dept)
 16 │ kb07        Sample(time=0.146926, allocs=57548, bytes=5561244)           :(log(rt_trunc)) ~ 1 + spkr + prec + load + spkr & prec + …
 17 │ mrk17_exp1  Sample(time=4.40508, allocs=263701, bytes=153506900, gc_fr…  :(1000 / rt) ~ 1 + F + P + Q + lQ + lT + F & P + F & Q + P…
 18 │ d3          Sample(time=4.36566, allocs=732859, bytes=165737656, gc_fr…  y ~ 1 + u + :((1 + u) | g) + :((1 + u) | h) + :((1 + u) | …
 19 │ ml1m        Sample(time=11.3488, allocs=2004602, bytes=430842888, gc_f…  y ~ 1 + :(1 | g) + :(1 | h)

If you print the second column with MIME("text/plain") you get the compact version.

julia> res.bmk
19-element Vector{Chairmarks.Sample}:
 82.583 μs (1039 allocs: 51.797 KiB)
 83.209 μs (1044 allocs: 51.891 KiB)
 216.375 μs (2029 allocs: 100.281 KiB)
 117.083 μs (1477 allocs: 89.195 KiB)
 245.833 μs (2391 allocs: 127.531 KiB)
 349.334 μs (2875 allocs: 151.641 KiB)
 109.375 μs (1187 allocs: 93.133 KiB)
 215.125 μs (1753 allocs: 125.531 KiB)
 246.875 μs (2044 allocs: 166.891 KiB)
 618.667 μs (2490 allocs: 138.938 KiB)
 1.594 ms (12691 allocs: 1.234 MiB)
 5.175 ms (15926 allocs: 2.507 MiB)
 70.009 ms (161656 allocs: 112.980 MiB, 1.94% gc time)
 297.278 ms (299482 allocs: 134.916 MiB, 0.54% gc time)
 579.619 ms (303283 allocs: 51.444 MiB)
 146.926 ms (57548 allocs: 5.304 MiB)
 4.405 s (263701 allocs: 146.396 MiB, 0.08% gc time)
 4.366 s (732859 allocs: 158.060 MiB, 0.04% gc time)
 11.349 s (2004602 allocs: 410.884 MiB, 0.11% gc time)

I'm not sure how to get that version in the display of the table.
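
One possible way to get the compact form into the table is to convert each element to its text/plain string before the table is displayed. The sketch below uses only Base Julia; `DemoSample` is a hypothetical stand-in for `Chairmarks.Sample`, defined here so the example runs without extra packages, and its two `show` methods are assumptions that only mimic the verbose and compact forms seen above.

```julia
# DemoSample is a hypothetical stand-in for Chairmarks.Sample.
struct DemoSample
    time::Float64   # seconds
    allocs::Int
end

# Two-argument show: the verbose form used by print and string.
Base.show(io::IO, s::DemoSample) =
    print(io, "DemoSample(time=", s.time, ", allocs=", s.allocs, ")")

# Three-argument show for text/plain: a compact, human-readable form.
Base.show(io::IO, ::MIME"text/plain", s::DemoSample) =
    print(io, round(s.time * 1e6; digits=3), " μs (", s.allocs, " allocs)")

samples = [DemoSample(8.2583e-5, 1039), DemoSample(0.000216375, 2029)]

# repr with an explicit MIME type returns the text/plain rendering as a String,
# which can then be stored in a table column in place of the raw object.
compact = [repr(MIME("text/plain"), s) for s in samples]
foreach(println, compact)
```

Applied to the results above, the same idea would be `map(s -> repr(MIME("text/plain"), s), res.bmk)`, at the cost of replacing the `Sample` objects in that column with strings.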

codecov · 2024-03-10T16:43:34Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.93%. Comparing base (34899cf) to head (c90f3ab).

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #753   +/-   ##
=======================================
  Coverage   96.93%   96.93%           
=======================================
  Files          34       34           
  Lines        3358     3358           
=======================================
  Hits         3255     3255           
  Misses        103      103

Flag	Coverage Δ
current	`96.87% <ø> (ø)`
minimum	`96.83% <ø> (ø)`
nightly	`96.43% <ø> (ø)`


dmbates · 2024-03-10T17:46:40Z

Another data point, this time on Linux with an 11th Gen Intel Core i7-1165G7:

julia> res = runbmrk(tbl)
Table with 3 columns and 19 rows:
      bmk                                                                             dsnm        frm
    ┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 1  │ Sample(time=9.4062e-5, allocs=1029, bytes=52176)                                dyestuff2   yield ~ 1 + :(1 | batch)
 2  │ Sample(time=9.5404e-5, allocs=1034, bytes=52272)                                dyestuff    yield ~ 1 + :(1 | batch)
 3  │ Sample(time=0.000273862, allocs=2165, bytes=104800)                             machines    score ~ 1 + :(1 | Worker) + :(1 | Machine)
 4  │ Sample(time=0.000146979, allocs=1466, bytes=90456)                              pastes      strength ~ 1 + :(1 | batch & cask)
 5  │ Sample(time=0.000284636, allocs=2371, bytes=128864)                             pastes      strength ~ 1 + :(1 | batch / cask)
 6  │ Sample(time=0.000392762, allocs=2855, bytes=153424)                             penicillin  diameter ~ 1 + :(1 | plate) + :(1 | sample)
 7  │ Sample(time=0.000123226, allocs=1151, bytes=93560)                              sleepstudy  reaction ~ 1 + days + :(1 | subj)
 8  │ Sample(time=0.000373249, allocs=1743, bytes=127104)                             sleepstudy  reaction ~ 1 + days + :(zerocorr((1 + days) | subj))
 9  │ Sample(time=0.000417766, allocs=2033, bytes=169472)                             sleepstudy  reaction ~ 1 + days + :(1 | subj) + :((0 + days) | subj)
 10 │ Sample(time=0.00109645, allocs=2480, bytes=140832)                              sleepstudy  reaction ~ 1 + days + :((1 + days) | subj)
 11 │ Sample(time=0.00163458, allocs=12461, bytes=1281736)                            kb07        :(log(rt_trunc)) ~ 1 + spkr + prec + load + :(1 | subj) + :(1 | item)
 12 │ Sample(time=0.00758549, allocs=16064, bytes=2632632)                            kb07        :(log(rt_trunc)) ~ 1 + spkr + prec + load + spkr & prec + spkr & load + prec & load + spkr & prec & load + :(1 | subj) + :((1 + prec) | item)
 13 │ Sample(time=0.0767897, allocs=159445, bytes=118266464, gc_fraction=0.00927004)  mrk17_exp1  :(1000 / rt) ~ 1 + F + P + Q + lQ + lT + F & P + F & Q + P & Q + F & lQ + P & lQ + Q & lQ + F & lT + P & lT + Q & lT + lQ & lT + F & P & Q + F & P & lQ + F & Q & lQ + P & Q &…
 14 │ Sample(time=0.389135, allocs=299479, bytes=141466336, gc_fraction=0.00219119)   insteval    y ~ 1 + service + dept + service & dept + :(1 | s) + :(1 | d)
 15 │ Sample(time=0.797134, allocs=303602, bytes=53942976)                            insteval    y ~ 1 + service + :(1 | s) + :(1 | d) + :(1 | dept)
 16 │ Sample(time=0.188412, allocs=52410, bytes=5451148)                              kb07        :(log(rt_trunc)) ~ 1 + spkr + prec + load + spkr & prec + spkr & load + prec & load + spkr & prec & load + :((1 + spkr + prec + load) | subj) + :((1 + spkr + prec + load) | i…
 17 │ Sample(time=5.92026, allocs=281198, bytes=153694324, gc_fraction=0.000424185)   mrk17_exp1  :(1000 / rt) ~ 1 + F + P + Q + lQ + lT + F & P + F & Q + P & Q + F & lQ + P & lQ + Q & lQ + F & lT + P & lT + Q & lT + lQ & lT + F & P & Q + F & P & lQ + F & Q & lQ + P & Q &…
 18 │ Sample(time=13.6477, allocs=733319, bytes=165914536, gc_fraction=0.000517121)   d3          y ~ 1 + u + :((1 + u) | g) + :((1 + u) | h) + :((1 + u) | i)
 19 │ Sample(time=19.5966, allocs=2004582, bytes=430835656, gc_fraction=0.00134264)   ml1m        y ~ 1 + :(1 | g) + :(1 | h)

julia> versioninfo()
Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, tigerlake)
Threads: 8 default, 0 interactive, 4 GC (on 8 virtual cores)
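
To compare the two machines, the times can be divided row by row. The sketch below copies a few `time` values (in seconds) from the two result tables in this thread; the choice of rows is arbitrary, and each value comes from a single benchmark run, so the ratios are only a rough indication.

```julia
# Times in seconds, copied from rows 1, 15, 18 and 19 of the two tables.
m1    = (dyestuff2 = 8.2583e-5, insteval = 0.579619, d3 = 4.36566, ml1m = 11.3488)
intel = (dyestuff2 = 9.4062e-5, insteval = 0.797134, d3 = 13.6477, ml1m = 19.5966)

# A ratio above 1 means the Intel run took longer than the M1 run.
ratios = map(/, intel, m1)

for (name, r) in pairs(ratios)
    println(rpad(string(name), 10), round(r; digits=2))
end
```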

dmbates · 2024-06-25T19:20:14Z

It turns out that using RegressionTests.jl and @track to check for changes in benchmark runs takes a very long time, and I don't think it is worth the cost. It is not terribly interesting to determine whether there is a small decrease in the time to fit a simple model, and the methodology of RegressionTests.jl is not well suited to comparing fitting speed on complex models.

Add bench/runbenchmarks.jl

58fa323

dmbates marked this pull request as draft March 10, 2024 16:36

dmbates and others added 5 commits March 11, 2024 11:27

Add benchmarks for GLMM fits

a864f30

Merge branch 'main' into db/RegressionTests

da3647c

Merge branch 'main' into db/RegressionTests

45918c0

Add calls to @track

63d15a1

Merge branch 'main' into db/RegressionTests

c90f3ab
