[Haskell-cafe] weak pointers and memoization (was Re: memoization)

Wed Sep 16 16:48:39 EDT 2009

How does garbage collection work in an example like the one below?  You
memoize a function with some sort of lookup table, which stores function
arguments as keys and function results as values.  As long as the
function remains in scope, the keys in the lookup table remain in
memory, which means that the keys themselves always remain reachable
and they cannot be garbage collected.  Right?

So what do you do in the case where you know that, after some period of
time, some entries in the lookup table will never be accessed?  That is,
there are no references to the keys for some entries remaining, except
for the references in the lookup table itself.  You'd like to allow the
memory occupied by the keys to be garbage collected.  Otherwise, if the
function stays around for a long time, the size of the lookup table
always grows.  How do you avoid the space leak?

I notice that there is a function in Data.IORef,

mkWeakIORef :: IORef a -> IO () -> IO (Weak (IORef a))

which looks promising.  In the code below, however, there's only one
IORef, so either the entire table gets garbage collected or none of it
does.

I've been reading the paper "Stretching the storage manager: weak
pointers and stable names in Haskell," which seems to answer my
question.  When I attempt to run the memoization code in the paper on
the simple fib example, I find that -- apparently due to lazy
evaluation -- no new entries are entered into the lookup table, and
therefore no lookups are ever successful!

So apparently there is some interaction between lazy evaluation and
garbage collection that I don't understand.  My head hurts.  Is it
necessary to make the table lookup operation strict?  Or is it
something entirely different that I am missing?

-Rod

On Thu, 10 Sep 2009 18:33:47 -0700
Ryan Ingram <ryani.spam at gmail.com> wrote:

> 
> memoIO :: Ord a => (a -> b) -> IO (a -> IO b)
> memoIO f = do
>    cache <- newIORef M.empty
>    return $ \x -> do
>        m <- readIORef cache
>        case M.lookup x m of
>            Just y -> return y
>            Nothing -> do let res = f x
>                          writeIORef cache $ M.insert x res m
>                          return res
> 
> memo :: Ord a => (a -> b) -> (a -> b)
> memo f = unsafePerformIO $ do
>     fmemo <- memoIO f
>     return (unsafePerformIO . fmemo)
> 
> I don't think there is any valid transformation that breaks this,
> since the compiler can't lift anything through unsafePerformIO.  Am I
> mistaken?
> 
>   -- ryan