[Haskell-cafe] How to correctly benchmark code with Criterion?

Fri Oct 19 09:24:26 CEST 2012

Thank you very much Thomas. This is the kind of explanation I needed!

Janek

On Thursday, 18 October 2012, Thomas Schilling wrote:
> On 18 October 2012 13:15, Janek S. <fremenzone at poczta.onet.pl> wrote:
> >> Something like this might work, not sure what the canonical way is.
> >> (...)
> >
> > This is basically the same as the answer I was given on SO. My concerns
> > about this solution are:
> > - rnf requires its parameter to be an instance of the NFData type class.
> >   This is not the case for some data structures, such as Repa arrays.
>
> For unboxed arrays of primitive types WHNF = NF.  That is, once the
> array is constructed all its elements will be in WHNF.
>
> > - evaluate only evaluates its argument to WHNF. Is this enough? If I
> >   have a tuple containing two lists, won't this only evaluate the tuple
> >   constructor and leave the lists as thunks? This is actually the case in
> >   my code.
>
> That is why you use "rnf" from the NFData type class. You use
> "evaluate" to kick-start rnf which then goes ahead and evaluates
> everything (assuming the NFData instance has been defined correctly.)
>
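A minimal sketch of the evaluate/rnf point above, assuming the deepseq package is available. The pair value is made up for illustration: its first list contains an undefined element, so any evaluation that reaches it raises an exception.

```haskell
import Control.DeepSeq (rnf)
import Control.Exception (SomeException, evaluate, try)

main :: IO ()
main = do
  -- Hypothetical data: a tuple of two lists, one with a poisoned element.
  let pair = ([1, undefined], [2, 3]) :: ([Int], [Int])

  -- evaluate alone stops at WHNF, i.e. at the tuple constructor,
  -- so the undefined element is never touched.
  _ <- evaluate pair
  putStrLn "WHNF: ok, only the tuple constructor was evaluated"

  -- rnf pair has type (); forcing that unit value walks both lists,
  -- so the undefined element is reached and an exception is thrown.
  r <- try (evaluate (rnf pair)) :: IO (Either SomeException ())
  putStrLn (either (const "NF: reached the undefined element")
                   (const "NF: ok")
                   r)
```

Here evaluate only sequences the forcing inside IO; it is rnf that does the deep traversal.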
> > As I said previously, it seems that Criterion somehow evaluates the data
> > so that the time needed for its creation is not included in the benchmark.
> > I modified my dataBuild function to look like this:
> >
> > dataBuild gen = unsafePerformIO $ do
> >     let x = (take 6 $ randoms gen, take 2048 $ randoms gen)
> >     threadDelay 1000000
> >     return x
> >
> > When I ran the benchmark, Criterion estimated the time needed to complete
> > it at over 100 seconds (which means that threadDelay worked and was used
> > as a basis for the estimate), but the benchmark finished much faster,
> > and there was no difference in the final result compared to the normal
> > dataBuild function. This suggests that once the data was created and used
> > for estimation, the dataBuild function was not used again. The main
> > question is: is this observation correct? In this question on SO:
> > http://stackoverflow.com/questions/6637968/how-to-use-criterion-to-measure-performance-of-haskell-programs
> > one of the answers says that there is no automatic memoization, while it
> > looks as if the values of dataBuild are in fact memoized. I have a feeling
> > that I am misunderstanding something.
>
> If you bind an expression to a variable and then reuse that variable,
> the expression is only evaluated once. That is, in "let x = expr in
> ..." the expression is only evaluated once. However, if you have "f y
> = let x = expr in ..." then the expression is evaluated once per
> function call.
>
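The sharing rule described above can be observed with Debug.Trace. This is only a sketch: expensive, shared and perCall are hypothetical names, and the bound expression in perCall depends on the argument so that it cannot be shared between calls. With optimisation enabled GHC may float out or share more than this suggests, so the behaviour is best checked without -O.

```haskell
import Debug.Trace (trace)

-- Hypothetical expensive computation; the trace message is written to
-- stderr each time the sum is actually evaluated.
expensive :: Int -> Int
expensive n = trace "evaluating expensive" (sum [1 .. n])

-- Top-level binding: x is bound once and reused, so the expression is
-- evaluated once no matter how often x or shared is used.
shared :: Int
shared = let x = expensive 1000 in x + x

-- Binding inside a function body: a fresh x is created per call,
-- so the expression is evaluated again on every call.
perCall :: Int -> Int
perCall y = let x = expensive y in x + x

main :: IO ()
main = do
  print shared
  print shared        -- already evaluated, no further trace expected
  print (perCall 1000)
  print (perCall 1000)
```

This is consistent with the dataBuild observation: a value bound once outside the benchmarked function is built a single time and then reused.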
> >> I don't know if you have already read them,
> >> but Tibell's slides on High Performance Haskell are pretty good:
> >>
> >> http://www.slideshare.net/tibbe/highperformance-haskell
> >>
> >> There is a section at the end where he runs several tests using
> >> Criterion.
> >
> > I skimmed the slides and slide 59 seems to show that my concerns
> > regarding WHNF might be true.
>
> It's usually safe if you benchmark a function. However, you most
> likely want the result to be in normal form.  The "nf" function does
> this for you. So, if your benchmark function has type
> "f :: X -> ([Double], Double)", your benchmark will be:
>
>   bench "f" (nf f input)
>
> The first run will evaluate the input (and discard the runtime) and
> all subsequent runs will evaluate the result to normal form. For repa
> you can use deepSeqArray [1] if your array is not unboxed:
>
>   bench "f'" (whnf (deepSeqArray . f) input)
>
> [1]:
> http://hackage.haskell.org/packages/archive/repa/3.2.2.2/doc/html/Data-Array-Repa.html#v:deepSeqArray
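
For reference, a complete benchmark along these lines might look as follows. It is a sketch under assumptions: the function f and its input are invented for illustration (the thread does not show the real ones), and the criterion, random and deepseq packages are assumed to be installed. The input is forced to normal form before defaultMain runs, so the cost of building it stays out of the measurements.

```haskell
import Control.DeepSeq (force)
import Control.Exception (evaluate)
import Criterion.Main (bench, defaultMain, nf, whnf)
import System.Random (mkStdGen, randoms)

-- Hypothetical function under test, returning a tuple like the one
-- discussed in the thread.
f :: ([Double], [Double]) -> ([Double], Double)
f (xs, ys) = (map (* 2) xs, sum ys)

main :: IO ()
main = do
  let gen = mkStdGen 42
      rawInput = (take 6 (randoms gen), take 2048 (randoms gen))
  -- Build the input once and force it fully before any timing starts.
  input <- evaluate (force rawInput)
  defaultMain
    [ -- whnf stops at the tuple constructor, so the work inside the
      -- tuple components is not measured.
      bench "f (whnf)" (whnf f input)
      -- nf forces the whole result, so map and sum are both measured.
    , bench "f (nf)" (nf f input)
    ]
```

Comparing the two reported times should show how much work whnf leaves unevaluated for a result of this shape.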