monad binding and garbage collection

Christian Schuhegger [email protected]
Thu, 17 Jan 2002 10:26:40 +0100

hi all,

i'm new to haskell and i have a question on a general problem concerning
monad binding and garbage collection.

my question will be on how to approach this type of problem?

consider the following simple program:

buildIOList :: [a] -> IO [a]
buildIOList li = return li

main = do {li <- buildIOList [1..];
           putStr (show (take 100000000 li));

if i compile this program with ghc and run it it has constant memory
usage because of the lazy nature of haskell. on the other hand if i use
runhugs then i get:
ERROR - Garbage collection fails to reclaim sufficient space

i think this is because the compiler can examine the code and sees that
the binding li is not used after putStr any more and therefore the
garbage collector can safely collect all the list elements that were
already put onto the screen, but hugs as a interpreter does not have
this knowledge and therefore has to keep all the memory that the binding
li refers to.

if i modify this program a bit:
main = do {li <- buildIOList [1..];
           putStr (show (take 100000000 li));
           putStr (show (take 100000000 li));           

i get the same memory problem with a ghc compiled file, because now all
the cells in the list bound by li that are evaluated stay in the system,
because the gc cannot collect them as the third line in the do
expression uses li again.

with a little modification:

main = do {li <- buildIOList [1..];
           putStr (show (take 100000000 li));
           li <- buildIOList [1..];
           putStr (show (take 100000000 li));           

the program has again constant memory usage, because the name li is
bound again later in the program so the gc can collect the elements of
the first li.

again a modification would be:

buildIOList :: [a] -> [IO a]
buildIOList [] = []
buildIOList (hd : tl) = (return hd) : buildIOList tl

main = do {printString (buildIOList [1..]);
           printString (buildIOList [1..]);

printString :: Show a => [IO a] -> IO ()
printString [] = return ()
printString (hd : tl) = do {c <- hd;
                            putStr (show c);
                            printString tl}

this program has also constant memory usage, but here i have to modify a
lot to propagate the "little bits of monads" in the program.

my question is now how an experienced haskell programmer would attack
the problem? i would also like to have a solution that works for hugs so
that my code is independent of the environment it is run in.

many thanks for any suggestions!

Christian Schuhegger

 The most likely way for the world to be destroyed, most experts agree,
 is by accident. That's where we come in; we're computer professionals.
 We cause accidents.                           -- Nathaniel Borenstein