[Haskell-cafe] weak pointers and memoization (was Re: memoization)

Job Vranish jvranish at gmail.com
Fri Sep 18 11:24:37 EDT 2009


Hey it works :D
Here is a proof of concept:
http://gist.github.com/189104

Maybe later today I'll try to make a version that can be safely used outside
IO.

- Job


On Fri, Sep 18, 2009 at 10:19 AM, Job Vranish <jvranish at gmail.com> wrote:

> Yeah it seems like the general solution to the problem would be some sort
> of map-like datastructure that you add items via a key/value pair, and if
> the key gets GC'd, that entry gets removed from the structure.
>
> I've been wanting something like this as well, but didn't know about weak
> references so I didn't know if it was possible, but I think I could make
> something like this now. I'll give it a shot and let you guys know how it
> goes.
>
> Rodney could you post your memo code that uses the weak references?
>
> - Job
>
>
> On Fri, Sep 18, 2009 at 7:56 AM, Peter Verswyvelen <bugfact at gmail.com>wrote:
>
>> I would also like to see a solution for problems like these.
>>
>> Haskell provides a lot of nice memoizing / caching data structures -
>> like a trie - but the ones I know indeed keep growing, so no garbage
>> collection takes place?
>>
>> It would be nice to have a data structure that performs caching but
>> does not grow unlimited.
>>
>> I had a similar problem with stable names; it is not possible to check
>> if a stable name is still "alive".
>>
>> On Fri, Sep 18, 2009 at 1:39 AM, Rodney Price <rodprice at raytheon.com>
>> wrote:
>> > In my case, the results of each computation are used to generate a node
>> > in a graph structure (dag).  The key, oddly, is a hash of a two-tuple
>> > that gets stored in the data structure after the computation of the
>> > node finishes.  If I don't memoize the function to build a node, the
>> > cost of generating the tree is exponential; if I do, it's somewhere
>> > between linear and quadratic.
>> >
>> > Another process prunes parts of this graph structure as time goes on.
>> > The entire data structure is intended to be persistent, lasting for
>> > days at a time in a server-like application.  If the parts pruned
>> > aren't garbage collected, the space leak will eventually be
>> > catastrophic.  Either the memo table or the graph structure itself will
>> > outgrow available memory.
>> >
>> > -Rod
>> >
>> >
>> > On Thu, 17 Sep 2009 13:32:13 -0400
>> > Job Vranish <jvranish at gmail.com> wrote:
>> >
>> >> What are you trying to use this for? It seems to me that for memo
>> >> tables you almost never have references to they keys outside the
>> >> lookup table since the keys are usually computed right at the last
>> >> minute, and then discarded (otherwise it might be easier to just
>> >> cache stuff outside the function).
>> >>
>> >> For example with a naive fibs, the values you are passing in are
>> >> computed, and probably don't exist before you do the recursive call,
>> >> and then are discarded shortly afterward.
>> >>
>> >> It seems like putting a cap on the cache size, and then just
>> >> overwriting old entries would be better.
>> >> Am I missing something?
>> >>
>> >> - Job
>> >>
>> >>
>> >>
>> >> On Wed, Sep 16, 2009 at 4:48 PM, Rodney Price <rodprice at raytheon.com>
>> >> wrote:
>> >>
>> >> > How does garbage collection work in an example like the one below?
>> >> > You memoize a function with some sort of lookup table, which stores
>> >> > function arguments as keys and function results as values.  As long
>> >> > as the function remains in scope, the keys in the lookup table
>> >> > remain in memory, which means that the keys themselves always
>> >> > remain reachable and they cannot be garbage collected.  Right?
>> >> >
>> >> > So what do you do in the case where you know that, after some
>> >> > period of time, some entries in the lookup table will never be
>> >> > accessed?  That is, there are no references to the keys for some
>> >> > entries remaining, except for the references in the lookup table
>> >> > itself.  You'd like to allow the memory occupied by the keys to be
>> >> > garbage collected.  Otherwise, if the function stays around for a
>> >> > long time, the size of the lookup table always grows.  How do you
>> >> > avoid the space leak?
>> >> >
>> >> > I notice that there is a function in Data.IORef,
>> >> >
>> >> > mkWeakIORef :: IORef a -> IO () -> IO (Weak (IORef a))
>> >> >
>> >> > which looks promising.  In the code below, however, there's only one
>> >> > IORef, so either the entire table gets garbage collected or none of
>> >> > it does.
>> >> >
>> >> > I've been reading the paper "Stretching the storage manager: weak
>> >> > pointers and stable names in Haskell," which seems to answer my
>> >> > question.  When I attempt to run the memoization code in the paper
>> >> > on the simple fib example, I find that -- apparently due to lazy
>> >> > evaluation -- no new entries are entered into the lookup table, and
>> >> > therefore no lookups are ever successful!
>> >> >
>> >> > So apparently there is some interaction between lazy evaluation and
>> >> > garbage collection that I don't understand.  My head hurts.  Is it
>> >> > necessary to make the table lookup operation strict?  Or is it
>> >> > something entirely different that I am missing?
>> >> >
>> >> > -Rod
>> >> >
>> >> >
>> >> > On Thu, 10 Sep 2009 18:33:47 -0700
>> >> > Ryan Ingram <ryani.spam at gmail.com> wrote:
>> >> >
>> >> > >
>> >> > > memoIO :: Ord a => (a -> b) -> IO (a -> IO b)
>> >> > > memoIO f = do
>> >> > >    cache <- newIORef M.empty
>> >> > >    return $ \x -> do
>> >> > >        m <- readIORef cache
>> >> > >        case M.lookup x m of
>> >> > >            Just y -> return y
>> >> > >            Nothing -> do let res = f x
>> >> > >                          writeIORef cache $ M.insert x res m
>> >> > >                          return res
>> >> > >
>> >> > > memo :: Ord a => (a -> b) -> (a -> b)
>> >> > > memo f = unsafePerformIO $ do
>> >> > >     fmemo <- memoIO f
>> >> > >     return (unsafePerformIO . fmemo)
>> >> > >
>> >> > > I don't think there is any valid transformation that breaks this,
>> >> > > since the compiler can't lift anything through unsafePerformIO.
>> >> > > Am I mistaken?
>> >> > >
>> >> > >   -- ryan
>> >> >
>> >> > _______________________________________________
>> >> > Haskell-Cafe mailing list
>> >> > Haskell-Cafe at haskell.org
>> >> > http://www.haskell.org/mailman/listinfo/haskell-cafe
>> >> >
>> >
>> > _______________________________________________
>> > Haskell-Cafe mailing list
>> > Haskell-Cafe at haskell.org
>> > http://www.haskell.org/mailman/listinfo/haskell-cafe
>> >
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20090918/315d71ac/attachment-0001.html


More information about the Haskell-Cafe mailing list