[Haskell-cafe] GHC predictability

Spencer Janssen sjanssen at cse.unl.edu
Mon May 12 23:40:21 EDT 2008


On Mon, May 12, 2008 at 08:01:53PM +0100, Andrew Coppin wrote:
> I offer up the following example:
>
>  mean xs = sum xs / length xs
>
> Now try, say, "mean [1.. 1e9]", and watch GHC eat several GB of RAM. (!!)

I don't see why the performance implications of this program are surprising.
Just ask any programmer used to a strict language how much memory "[1 .. 1e9]"
will require.

> If we now rearrange this to
>
>  mean = (\(s,n) -> s / n) . foldr (\x (s,n) -> let s' = s+x; n' = n+1 in 
> s' `seq` n' `seq` (s', n')) (0,0)
>
> and run the same example, and watch it run in constant space.

This will use linear stack space.  You probably meant to use foldl'?

Better:

    mean = uncurry (/) . foldl' f (0, 0)
     where f (!s, !n) x = (s + x, n + 1)

           -- or, if you require Haskell '98:
           f (s, n) x = s `seq` n `seq` (s + x, n + 1)

This version is very legible in my opinion.  In fact, the algorithm is
identical to what I'd write in C.  Also, "mean [1 .. 1e9]" will actually work
in Haskell, while in C you'll just run out of memory.


Cheers,
Spencer Janssen


More information about the Haskell-Cafe mailing list