[Haskell-cafe] weak pointers and memoization (was Re: memoization)

Fri Sep 18 07:56:12 EDT 2009

I would also like to see a solution for problems like these.

Haskell provides a lot of nice memoizing / caching data structures -
like a trie - but the ones I know indeed keep growing, so no garbage
collection takes place?

It would be nice to have a data structure that performs caching but
does not grow unlimited.

I had a similar problem with stable names; it is not possible to check
if a stable name is still "alive".

On Fri, Sep 18, 2009 at 1:39 AM, Rodney Price <rodprice at raytheon.com> wrote:
> In my case, the results of each computation are used to generate a node
> in a graph structure (dag).  The key, oddly, is a hash of a two-tuple
> that gets stored in the data structure after the computation of the
> node finishes.  If I don't memoize the function to build a node, the
> cost of generating the tree is exponential; if I do, it's somewhere
> between linear and quadratic.
>
> Another process prunes parts of this graph structure as time goes on.
> The entire data structure is intended to be persistent, lasting for
> days at a time in a server-like application.  If the parts pruned
> aren't garbage collected, the space leak will eventually be
> catastrophic.  Either the memo table or the graph structure itself will
> outgrow available memory.
>
> -Rod
>
>
> On Thu, 17 Sep 2009 13:32:13 -0400
> Job Vranish <jvranish at gmail.com> wrote:
>
>> What are you trying to use this for? It seems to me that for memo
>> tables you almost never have references to they keys outside the
>> lookup table since the keys are usually computed right at the last
>> minute, and then discarded (otherwise it might be easier to just
>> cache stuff outside the function).
>>
>> For example with a naive fibs, the values you are passing in are
>> computed, and probably don't exist before you do the recursive call,
>> and then are discarded shortly afterward.
>>
>> It seems like putting a cap on the cache size, and then just
>> overwriting old entries would be better.
>> Am I missing something?
>>
>> - Job
>>
>>
>>
>> On Wed, Sep 16, 2009 at 4:48 PM, Rodney Price <rodprice at raytheon.com>
>> wrote:
>>
>> > How does garbage collection work in an example like the one below?
>> > You memoize a function with some sort of lookup table, which stores
>> > function arguments as keys and function results as values.  As long
>> > as the function remains in scope, the keys in the lookup table
>> > remain in memory, which means that the keys themselves always
>> > remain reachable and they cannot be garbage collected.  Right?
>> >
>> > So what do you do in the case where you know that, after some
>> > period of time, some entries in the lookup table will never be
>> > accessed?  That is, there are no references to the keys for some
>> > entries remaining, except for the references in the lookup table
>> > itself.  You'd like to allow the memory occupied by the keys to be
>> > garbage collected.  Otherwise, if the function stays around for a
>> > long time, the size of the lookup table always grows.  How do you
>> > avoid the space leak?
>> >
>> > I notice that there is a function in Data.IORef,
>> >
>> > mkWeakIORef :: IORef a -> IO () -> IO (Weak (IORef a))
>> >
>> > which looks promising.  In the code below, however, there's only one
>> > IORef, so either the entire table gets garbage collected or none of
>> > it does.
>> >
>> > I've been reading the paper "Stretching the storage manager: weak
>> > pointers and stable names in Haskell," which seems to answer my
>> > question.  When I attempt to run the memoization code in the paper
>> > on the simple fib example, I find that -- apparently due to lazy
>> > evaluation -- no new entries are entered into the lookup table, and
>> > therefore no lookups are ever successful!
>> >
>> > So apparently there is some interaction between lazy evaluation and
>> > garbage collection that I don't understand.  My head hurts.  Is it
>> > necessary to make the table lookup operation strict?  Or is it
>> > something entirely different that I am missing?
>> >
>> > -Rod
>> >
>> >
>> > On Thu, 10 Sep 2009 18:33:47 -0700
>> > Ryan Ingram <ryani.spam at gmail.com> wrote:
>> >
>> > >
>> > > memoIO :: Ord a => (a -> b) -> IO (a -> IO b)
>> > > memoIO f = do
>> > >    cache <- newIORef M.empty
>> > >    return $ \x -> do
>> > >        m <- readIORef cache
>> > >        case M.lookup x m of
>> > >            Just y -> return y
>> > >            Nothing -> do let res = f x
>> > >                          writeIORef cache $ M.insert x res m
>> > >                          return res
>> > >
>> > > memo :: Ord a => (a -> b) -> (a -> b)
>> > > memo f = unsafePerformIO $ do
>> > >     fmemo <- memoIO f
>> > >     return (unsafePerformIO . fmemo)
>> > >
>> > > I don't think there is any valid transformation that breaks this,
>> > > since the compiler can't lift anything through unsafePerformIO.
>> > > Am I mistaken?
>> > >
>> > >   -- ryan
>> >
>> > _______________________________________________
>> > Haskell-Cafe mailing list
>> > Haskell-Cafe at haskell.org
>> > http://www.haskell.org/mailman/listinfo/haskell-cafe
>> >
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>