Memoization

From HaskellWiki

Revision as of 12:52, 23 July 2013


Memoization is a technique for storing values of a function instead of recomputing them each time the function is called.


1 Memoization without recursion

You can just write a memoization function using a data structure that is suitable for your application. We don't go into the details of this case. If you want a general solution for several types, you need a type class, say Memoizable.

memoize :: Memoizable a => (a->b) -> (a->b)

Now, how to implement something like this? Of course, one needs a finite map that stores values b for keys of type a. It turns out that such a map can be constructed recursively based on the structure of a:
  Map ()            b  := b
  Map (Either a a') b  := (Map a b, Map a' b)
  Map (a,a')        b  := Map a (Map a' b)
Here, Map a b is the type of a finite map from keys a to values b. Its construction is based on the following laws for functions:

        () -> b  =~=  b
  (a + a') -> b  =~=  (a -> b) x (a' -> b) -- = case analysis
  (a x a') -> b  =~=  a -> (a' -> b)       -- = currying
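Following these laws, a minimal sketch of the Memoizable class is possible. The instances below are an illustration derived from the three laws, not code from this article:

```haskell
-- A sketch of the Memoizable class. Each instance follows one law:
-- () -> b is just b, a sum splits into a pair of memoized functions,
-- and a product curries into nested memoization.
class Memoizable a where
  memoize :: (a -> b) -> (a -> b)

-- () -> b  =~=  b : compute f () once and share the result.
instance Memoizable () where
  memoize f = let r = f () in \() -> r

-- (a + a') -> b  =~=  (a -> b) x (a' -> b) : case analysis.
instance (Memoizable a, Memoizable a') => Memoizable (Either a a') where
  memoize f = either (memoize (f . Left)) (memoize (f . Right))

-- (a x a') -> b  =~=  a -> (a' -> b) : currying.
instance (Memoizable a, Memoizable a') => Memoizable (a, a') where
  memoize f = let g = memoize (\a -> memoize (\a' -> f (a, a')))
              in \(a, a') -> g a a'
```

Note that each instance builds its table lazily: the shared thunks (r, the two memoized branches, the nested tables) are only forced for arguments that are actually used.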

For further and detailed explanations, see

* Ralf Hinze: [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.43.3272 Memo functions, polytypically!]
* Ralf Hinze: [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.8.4069 Generalizing generalized tries]
* Conal Elliott: [http://conal.net/blog/posts/elegant-memoization-with-functional-memo-tries/ Elegant memoization with functional memo tries] and other [http://conal.net/blog/tag/memoization/ posts on memoization].
* Conal Elliott: [http://conal.net/papers/type-class-morphisms/ Denotational design with type class morphisms], section 9 (Memo tries).

2 Memoization with recursion

Things become more complicated if the function is recursively defined and it should use memoized calls to itself. A classic example is the recursive computation of Fibonacci numbers.

The naive implementation of Fibonacci numbers without memoization is horribly slow. Try slow_fib 30; not too much higher than that and it hangs.
slow_fib :: Int -> Integer
slow_fib 0 = 0
slow_fib 1 = 1
slow_fib n = slow_fib (n-2) + slow_fib (n-1)

The memoized version is much faster. Try memoized_fib 10000.
memoized_fib :: Int -> Integer
memoized_fib = (map fib [0 ..] !!)
   where fib 0 = 0
         fib 1 = 1
         fib n = memoized_fib (n-2) + memoized_fib (n-1)


2.1 Memoizing fix point operator

You can factor out the memoizing trick into a function, the memoizing fix point operator, which we will call memoFix here.
fib :: (Int -> Integer) -> Int -> Integer
fib f 0 = 0
fib f 1 = 1
fib f n = f (n-1) + f (n-2)
 
fibonacci :: Int -> Integer
fibonacci = memoFix fib

I suppose if you want to "put it in a library", you should just put fib in, and allow the user to call memoFix fib to make a new version when necessary.

This allows the user e.g. to define the data structure used for memoization.

The memoising fixpoint operator works by putting the result of the first call of the function for each natural number into a data structure and using that value for subsequent calls ;-)

In general it is

memoFix :: ((a -> b) -> (a -> b)) -> a -> b
memoFix f =
   let mf = memoize (f mf) in mf
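As a concrete sketch (the list-based memoizer here is an assumption for illustration, standing in for the abstract memoize; the names memoFixInt and fibonacci are hypothetical), memoFix can be instantiated for Int arguments using the lazy-list trick from memoized_fib:

```haskell
-- memoFix specialised to Int keys, backed by a lazy list:
-- the result for each argument is computed at most once and shared.
memoFixInt :: ((Int -> a) -> (Int -> a)) -> (Int -> a)
memoFixInt f = mf
  where memo = map (f mf) [0 ..]   -- lazily filled memo table
        mf n = memo !! n

-- An open-recursive fib, base cases chosen to match slow_fib.
fib :: (Int -> Integer) -> Int -> Integer
fib _ 0 = 0
fib _ 1 = 1
fib f n = f (n-1) + f (n-2)

fibonacci :: Int -> Integer
fibonacci = memoFixInt fib
```

Because fibonacci is a point-free top-level binding, the memo list inside it is a shared structure, so repeated and recursive calls hit the table instead of recomputing.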

3 Efficient tree data structure for maps from Int to somewhere

Here we present a special tree data type (data-inttrie) which is useful as memoizing data structure e.g. for the Fibonacci function.

memoizeInt :: (Int -> a) -> (Int -> a)
memoizeInt f = (fmap f (naturals 1 0) !!!)

A data structure with a node corresponding to each natural number to use as a memo.

data NaturalTree a = Node a (NaturalTree a) (NaturalTree a)

Map the nodes to the naturals in this order:

     0
   1   2
  3 5 4 6
 7  ...

Look up the node for a particular number:

Node a tl tr !!! 0 = a 
Node a tl tr !!! n =
   if odd n
     then tl !!! top
     else tr !!! (top-1)
        where top = n `div` 2

We surely want to be able to map on these things...

instance Functor NaturalTree where
   fmap f (Node a tl tr) = Node (f a) (fmap f tl) (fmap f tr)

If only so that we can write cute but inefficient things like the below, which is just a NaturalTree such that naturals !!! n == n:
naturals = Node 0  (fmap ((+1).(*2)) naturals) (fmap ((*2).(+1)) naturals)

The following is probably more efficient (and, having arguments, won't hang around at top level, I think):

-- have I put more $!s than necessary?
naturals r n =
   Node n
     ((naturals $! r2) $! (n+r))
     ((naturals $! r2) $! (n+r2))
        where r2 = 2*r
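Assembled into one self-contained sketch (the strictness annotations are dropped for brevity, and fibTree is a hypothetical name, not from the article), the pieces of this section memoize Fibonacci like this:

```haskell
-- An infinite binary tree with one node per natural number.
data NaturalTree a = Node a (NaturalTree a) (NaturalTree a)

instance Functor NaturalTree where
  fmap f (Node a tl tr) = Node (f a) (fmap f tl) (fmap f tr)

-- Index into the tree: odd indices go left, even indices go right.
(!!!) :: NaturalTree a -> Int -> a
Node a _  _  !!! 0 = a
Node _ tl tr !!! n
  | odd n     = tl !!! top
  | otherwise = tr !!! (top - 1)
  where top = n `div` 2

-- The tree laid out so that naturals !!! n == n.
naturals :: NaturalTree Int
naturals = go 1 0
  where go r n = Node n (go r2 (n+r)) (go r2 (n+r2))
          where r2 = 2 * r

memoizeInt :: (Int -> a) -> (Int -> a)
memoizeInt f = (fmap f naturals !!!)

-- Memoized Fibonacci via the tree; the knot is tied through fibTree.
fibTree :: Int -> Integer
fibTree = memoizeInt fib
  where fib 0 = 0
        fib 1 = 1
        fib n = fibTree (n-2) + fibTree (n-1)
```

Since fibTree is a point-free top-level binding, the tree fmap fib naturals is shared across calls, giving O(log n) lookups instead of exponential recomputation.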

4 Memoising CAFs

Note: This is migrated from the old wiki.

Memoising constructor functions gives you HashConsing, and you can sometimes use MemoisingCafs (constant applicative forms) to implement that.

The MemoisingCafs idiom also supports recursion.

Consider, for example:

wonderous :: Integer -> Integer
wonderous 1 = 0
wonderous x
  | even x    = 1 + wonderous (x `div` 2)
  | otherwise = 1 + wonderous (3*x+1)

This function is not at all understood by mathematicians and has a surprisingly complex recursion pattern, so if you need to call it many times with different values, optimising it would not be easy.

However, we can memoise some of the domain using an array CAF:

wonderous2 :: Integer -> Integer
wonderous2 x
  | x <= maxMemo = memoArray ! x
  | otherwise    = wonderous2' x
  where
        maxMemo = 100
        memoArray = array (1,maxMemo)
                        [ (x, wonderous2' x) | x <- [1..maxMemo] ]
 
        wonderous2' 1 = 0
        wonderous2' x
          | even x    = 1 + wonderous2 (x `div` 2)
          | otherwise = 1 + wonderous2' (3*x+1)

When using this pattern in your own code, note carefully when to call the memoised version (wonderous2 in the above example) and when not to. In general, the partially memoised version (wonderous2' in the above example) should call the memoised version if it needs to perform a recursive call. However, in this instance, we only memoise for small values of x, so the branch of the recursion that passes a larger argument need not bother checking the memo table. (This does slow the array initialization, however.)

Thanks to lazy evaluation, we can even memoise an infinite domain, though we lose constant-time lookup. This data structure is O(log N):

type MemoTable a = [(Integer, BinTree a)]
data BinTree a = Leaf a | Node Integer (BinTree a) (BinTree a)
 
wonderous3 :: Integer -> Integer
wonderous3 x
  = searchMemoTable x memoTable
  where
        memoTable :: MemoTable Integer
        memoTable = buildMemoTable 1 5
 
        buildMemoTable n i
            = (nextn, buildMemoTable' n i) : buildMemoTable nextn (i+1)
            where
                nextn = n + 2^i
 
                buildMemoTable' base 0
                    = Leaf (wonderous3' base)
                buildMemoTable' base i
                    = Node (base + midSize)
                           (buildMemoTable' base (i-1))
                           (buildMemoTable' (base + midSize) (i-1))
                    where
                        midSize = 2 ^ (i-1)
 
        searchMemoTable x ((x',tree):ms)
            | x < x'    = searchMemoTree x tree
            | otherwise = searchMemoTable x ms
 
        searchMemoTree x (Leaf y) = y
        searchMemoTree x (Node mid l r)
            | x < mid   = searchMemoTree x l
            | otherwise = searchMemoTree x r
 
        wonderous3' 1 = 0
        wonderous3' x
          | even x    = 1 + wonderous3 (x `div` 2)
          | otherwise = 1 + wonderous3 (3*x+1)

Naturally, these techniques can be combined, say, by using a fast CAF data structure for the most common part of the domain and an infinite CAF data structure for the rest.

-- AndrewBromage

5 Memoizing polymorphic functions

What about memoizing polymorphic functions defined with polymorphic recursion? How can such functions be memoized? The caching data structures used in memoization typically handle only one type of argument at a time. For instance, one can have finite maps of differing types, but each concrete finite map holds just one type of key and one type of value.

See the discussion on memoizing polymorphic functions, [http://conal.net/blog/posts/memoizing-polymorphic-functions-part-one/ part one] and [http://conal.net/blog/posts/memoizing-polymorphic-functions-part-two/ part two].

6 See also

* [http://www.haskell.org/pipermail/haskell-cafe/2007-February/021288.html Haskell-Cafe "speeding up fibonacci with memoizing"]
* [http://www.haskell.org/pipermail/haskell-cafe/2007-May/024689.html Haskell-Cafe about memoization utility function]
* [http://www.haskell.org/pipermail/haskell-cafe/2007-February/021563.html Haskell-Cafe "memoisation"]
* [http://www.haskell.org/pipermail/haskell-cafe/2005-October/010287.html Haskell-Cafe about Memoization and Data.Map]
* http://programming.reddit.com/info/16ofr/comments
* [http://www.cs.utexas.edu/~wcook/Drafts/2006/MemoMixins.pdf Monadic Memoization Mixins] by Daniel Brown and William R. Cook
* [http://hackage.haskell.org/cgi-bin/hackage-scripts/package/data-memocombinators data-memocombinators: Combinators for building memo tables.]
* [http://hackage.haskell.org/cgi-bin/hackage-scripts/package/MemoTrie MemoTrie: Trie-based memo functions]
* [http://hackage.haskell.org/package/monad-memo monad-memo: memoization monad transformer]
* [http://hackage.haskell.org/package/memoize memoize: uses Template Haskell to derive memoization code]