[Haskell-cafe] Re: Mining Twitter data in Haskell and Clojure

Don Stewart dons at galois.com
Thu Jun 24 12:18:08 EDT 2010


marlowsd:
>> I'll work with Simon to investigate the runtime, but would welcome any
>> ideas on further speeding up cafe4.
>
> An update on this: with the help of Alex I tracked down the problem (an  
> integer overflow bug in GHC's memory allocator), and his program now  
> runs to completion.
>
> This is the largest program (in terms of memory requirements) I've ever
> seen anyone run using GHC.  In fact there was no machine in our building
> capable of running it; I had to fire up the largest Amazon EC2 instance
> available (68GB) to debug it, so this bug cost me $26.  Here are the
> stats from the working program:
>
>  392,908,177,040 bytes allocated in the heap
>  174,455,211,920 bytes copied during GC
>   24,151,940,568 bytes maximum residency (6 sample(s))
>   36,857,590,520 bytes maximum slop
>            64029 MB total memory in use (1000 MB lost due to fragmentation)
>
>   Generation 0:    62 collections,     0 parallel, 352.35s, 357.13s elapsed
>   Generation 1:     6 collections,     0 parallel, 180.63s, 209.19s elapsed
>
>   INIT  time    0.00s  (  0.11s elapsed)
>   MUT   time  1201.47s  (1294.29s elapsed)
>   GC    time  532.98s  (566.33s elapsed)
>   EXIT  time    0.00s  (  5.34s elapsed)
>   Total time  1734.46s  (1860.74s elapsed)
>
>   %GC time      30.7%  (30.4% elapsed)
>
>   Alloc rate    327,020,156 bytes per MUT second
>
>   Productivity  69.3% of total user, 64.6% of total elapsed

Well done!!
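
For anyone reading the summary lines at the bottom of those stats: they
appear to follow directly from the raw timings above them. The sketch below
is only a reconstruction from the quoted numbers, not the RTS's actual code;
the names (percent, gcCpu, mutCpu and so on) are made up for illustration,
and the choice of MUT user time over total elapsed time for the second
productivity figure is an assumption that happens to match the output.

```haskell
module Main where

import Text.Printf (printf)

-- Raw figures copied from the quoted +RTS -s output.
-- All names here are illustrative, not RTS identifiers.
bytesAllocated :: Double
bytesAllocated = 392908177040

mutCpu, gcCpu, totalCpu, gcElapsed, totalElapsed :: Double
mutCpu       = 1201.47   -- MUT time (user)
gcCpu        = 532.98    -- GC time (user)
totalCpu     = 1734.46   -- Total time (user)
gcElapsed    = 566.33    -- GC time (elapsed)
totalElapsed = 1860.74   -- Total time (elapsed)

-- Share of 'whole' taken by 'part', as a percentage.
percent :: Double -> Double -> Double
percent part whole = 100 * part / whole

-- Bytes allocated per second of mutator (user) time.
allocRate :: Double
allocRate = bytesAllocated / mutCpu

main :: IO ()
main = do
  printf "%%GC time     %.1f%% (%.1f%% elapsed)\n"
    (percent gcCpu totalCpu) (percent gcElapsed totalElapsed)
  -- Assumption: the "elapsed" productivity divides MUT user time
  -- by total elapsed time, which is what reproduces 64.6%.
  printf "Productivity %.1f%% of total user, %.1f%% of total elapsed\n"
    (percent mutCpu totalCpu) (percent mutCpu totalElapsed)
  printf "Alloc rate   %.0f bytes per MUT second\n" allocRate
```

The four percentages round to the 30.7, 30.4, 69.3 and 64.6 reported above.
The allocation rate lands at roughly 327 million bytes per MUT second, a
hair off the reported figure, presumably because the printed times are
rounded to two decimals.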

