[Haskell-cafe] Benchmarking and Garbage Collection

Neil Brown nccb2 at kent.ac.uk
Thu Mar 4 14:35:32 EST 2010


Jesper Louis Andersen wrote:
> On Thu, Mar 4, 2010 at 7:16 PM, Neil Brown <nccb2 at kent.ac.uk> wrote:
>
>   
>> However, one thing I've found is that the libraries have noticeably
>> different behaviour in terms of the amount of garbage created.
>>     
>
> In fact, CML relies on the garbage collector for some implementation
> constructions. John H. Reppy's "Concurrent Programming in ML" is worth
> a read if you haven't. My guess is that the Haskell implementation of
> CML is bloody expensive. It is based on the paper
> http://www.cs.umd.edu/~avik/projects/cmllch/ where Chaudhuri first
> constructs an abstract machine for CML and then binds this to the
> Haskell MVar and forkIO constructions.
>   
CML is indeed the library that has the most markedly different 
behaviour.  In Haskell, the CML package manages to produce timings like 
this for fairly simple benchmarks:

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    2.47s  (  2.49s elapsed)
  GC    time   59.43s  ( 60.56s elapsed)
  EXIT  time    0.00s  (  0.01s elapsed)
  Total time   61.68s  ( 63.07s elapsed)

  %GC time      96.3%  (96.0% elapsed)

  Alloc rate    784,401,525 bytes per MUT second

  Productivity   3.7% of total user, 3.6% of total elapsed

I knew from reading the code that CML's implementation would do 
something like this, although I do wonder if it triggers some 
pathological case in the GC.  The problem is that when I benchmark the 
program, it seems to finish in decent time, then spends 60 seconds doing 
GC before finally terminating!  So I need some way of timing that will 
reflect this; I wonder if just timing the entire run-time (and making 
the benchmarks long enough not to be swallowed by program start-up 
times, etc.) is the best thing to do.  A secondary issue is whether I 
should even include CML at all, considering the timings!
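
As a rough illustration of "timing the entire run-time", here is a minimal sketch of a harness that measures one complete run of a benchmark executable from the outside, so that any GC work done after the benchmark's own code has finished is still counted. The name timeWholeRun is made up for this example, and it assumes a GHC recent enough to provide GHC.Clock in base, plus the process package. It measures wall-clock time only and necessarily includes program start-up, which is why the benchmark itself needs to run long enough to dominate.

```haskell
import GHC.Clock (getMonotonicTime)
import System.Process (callProcess)

-- | Wall-clock seconds for one complete run of an external program,
-- measured from just before the process is started until it has exited.
-- (timeWholeRun is a hypothetical helper, not part of any library.)
timeWholeRun :: FilePath -> [String] -> IO Double
timeWholeRun exe args = do
  start <- getMonotonicTime
  -- callProcess blocks until the child process has terminated and
  -- throws an exception if it exits with a non-zero status.
  callProcess exe args
  end <- getMonotonicTime
  return (end - start)
```

For instance, timeWholeRun "./cml-bench" [] would time a benchmark binary called cml-bench (a placeholder name). Because the clock stops only once the child process has gone, a run that "finishes" quickly but then collects for a minute is reported as taking the full time.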

Thanks,

Neil


