[Haskell-begin] Maximum of a List?

Braden Shepherdson Braden.Shepherdson at gmail.com
Sat Jul 26 22:45:05 EDT 2008


Steve Klabnik wrote:
> Hello everyone-
> 
> I'm working on some project Euler problems today, and I'm stuck on one. 
> It's not the problem itself that's the problem, it's that finding the 
> maximum of a list makes me run out of heap space!
> 
> http://projecteuler.net/index.php?section=problems&id=14 
> <http://projecteuler.net/index.php?section=problems&id=14>
> 
> my code:
> 
> import Data.List
> 
> f :: Int -> Int -> Int
> f acc x
>     | x == 1 = acc
>         | even x = f (acc + 1) (x `div` 2)
>         | otherwise = f (acc + 1) (3 * x + 1)
> 
> answer = (foldl' max 0) $ map (f 1) [1 .. 999999]
> 
> I tried using 'foldl' max ' instead of 'maximum' because I thought that 
> foldl' was supposed to work better than foldl or something...I could be 
> confused on that point. Anyway, here's what I get...
> 
>   / _ \ /\  /\/ __(_)
>  / /_\// /_/ / /  | |      GHC Interactive, version 6.4, for Haskell 98.
> / /_\\/ __  / /___| |      http://www.haskell.org/ghc/
> \____/\/ /_/\____/|_|      Type :? for help.
> 
> Loading package base-1.0 ... linking ... done.
> Prelude> :l ..\test.hs
> Compiling Main             ( ..\test.hs, interpreted )
> Ok, modules loaded: Main.
> *Main> answer
> GHC's heap exhausted: current limit is 268435456 bytes;
> Use the `-M<size>' option to increase the total heap size.
> 
> I also tried writing a simple 'max' function that I thought was tail 
> recursive, and that that would help. Same error. I'm finding it hard to 
> believe that finding a maximum of a list of a million small integers 
> would cause this kind of overflow...is it really that big of a problem?

Hello Steve,

First, is there a reason to use such an old GHC? Newer versions produce 
much better code.

That said, certainly 6.4 is sufficient for this problem. foldl' is the 
right choice here, it is strict in the accumulating parameter and that 
is what you want. The problem is the two parameters in f. When you make 
the recursive call 'f (acc + 1) ...', a thunk is being created to hold 
that calculation (acc + 1). It isn't being evaluated right away, so the 
thunk remains. On the next iteration, acc is incremented again. You are 
effectively building up the expression 'acc + 1 + 1 + 1 + 1 ...', 
perhaps quite deeply. This takes up heap space; apparently too much 
space. Similarly, the (x `div` 2) and (3 * x + 1) calculations are being 
suspended as thunks, though x gets forced on the next call, since it is 
tested by the guards.

Adding the pragma
{-# LANGUAGE BangPatterns #-} at the top of the file, you could then 
annotate acc:

f !acc x

this seems to run in constant space for me.

Note that there is a much more concise way to write f, though I doubt it 
would be much faster. Consider 'iterate' and 'takeWhile'.

Finally, compiling to a binary with ghc -O2 will yield much better speed 
than ghci. And GHC 6.8.3 would provide a further 20%+ speed boost, if 
installing it is an option on your environment.


Hope this helps,

Braden Shepherdson
shepheb



More information about the Beginners mailing list