Difference between revisions of "Performance/Laziness"

From HaskellWiki
Jump to navigation Jump to search
(I believe this is the canonical version of that phrase...)
m (Fix a typo in the discussion about "two undefined's")
 
(4 intermediate revisions by 4 users not shown)
Line 3: Line 3:
 
== Laziness: Procrastinating for Fun & Profit ==
 
== Laziness: Procrastinating for Fun & Profit ==
   
To look at how laziness works in Haskell, and how to make it do efficient work, we'll implement a merge sort function. It will have the type:
+
To look at how laziness works in Haskell, and how to make it do efficient work, we'll implement a ''merge sort'' function. It will have the type:
   
 
merge_sort :: (Ord a) => [a] -> [a]
 
merge_sort :: (Ord a) => [a] -> [a]
   
We'll also need a function to split the list in two, I'll call this cleaving, and it will look like this:
+
We'll also need a function to split the list in two, I'll call this ''cleaving'', and it will look like this:
   
 
cleave :: [a] -> ([a],[a])
 
cleave :: [a] -> ([a],[a])
   
Let's start by implementing the cleaving function. The conventional way to split a list in merge sort is to take the first N/2 elements off the front, and the remaining elements after this number. The problem is that finding the length of a list in haskell is expensive. So instead, we'll take pairs of elements off the front. Define two functions:
+
Let's start by implementing the cleaving function. The conventional way to split a list in merge sort is to take the first half elements off the front, and the remaining half elements after that. The problem is that finding the length of a list means additional traversing of the whole list. So instead, we'll take ''pairs'' of elements off the front. Define two functions:
   
evens [] = []
+
evens [] = []
 
evens [x] = [x]
 
evens [x] = [x]
 
evens (x:_:xs) = x : evens xs
 
evens (x:_:xs) = x : evens xs
   
odds [] = []
+
odds [] = []
 
odds [x] = []
 
odds [x] = []
 
odds (_:x:xs) = x : odds xs
 
odds (_:x:xs) = x : odds xs
   
and use them to implement cleave:
+
and use them to implement <code>cleave</code>:
   
 
cleave xs = (evens xs, odds xs)
 
cleave xs = (evens xs, odds xs)
   
Experience in a strictly evaluation language like SML or Objective CAML may lead you to write alternate versions using an [[Performance/Accumulating_Parameters | accumulating parameter]]. Assuming that reversing the order of the elements doesn't matter, you could use this function to split the list into even and odd elements and implement the cleave function as follows:
+
Experience in a strictly evaluation language like SML or Objective CAML may lead you to write alternate versions using an [[Performance/Accumulating_Parameters | accumulating parameter]]. Assuming that reversing the order of the elements doesn't matter, you could use this function to split the list into even and odd elements and implement the <code>cleave</code> function as follows:
   
 
cleave = cleave' ([],[]) where
 
cleave = cleave' ([],[]) where
Line 32: Line 32:
 
cleave' (eacc,oacc) (x:x':xs) = cleave' (x:eacc,x':oacc) xs
 
cleave' (eacc,oacc) (x:x':xs) = cleave' (x:eacc,x':oacc) xs
   
This appears to be a better implementation. It's tail recursive, and by either strictness analysis or explicitly making the accumulating parameters strict, it won't blow the stack up. Believe it or not, our first implementation was better.
+
This initially appears to be a better implementation. It's [[tail recursion|tail recursive]], and by either strictness analysis or explicitly making the accumulating parameters strict, it won't blow the stack up... Believe it or not, our first implementation was better.
   
In order to produce the first element of either list, you need to process the entire list. In a non-strict language, we could encounter an infinite list, and we'd like our function to work nicely on them. Consider the effect of:
+
Why? In order to produce the first element of either list, the second version needs to process the entire list. In a non-strict language, we could encounter an infinite list, and we'd like our function to work nicely on them. Consider the effect of:
   
head $ fst $ cleave [0..10000000]
+
head . fst $ cleave [0..10000000]
   
With our first definition, we'll get 0 in constant time. With our second, we'll get it in O(N) time, and our calculation will diverge on an infinite list like [0..].
+
With our first definition, we'll get <math>0</math> in constant time. But with our second, we'll get it in <math>O(n)</math> time, and our calculation will diverge on an infinite list like <code>[0..]</code>.
   
Why is our first version better? Let's look at how evens works and how lists are represented in Haskell. Lists are represented as either an empty list, or a "cons" cell that consists of an element and the remaining list. In pseudo-Haskell, we might write:
+
Let's look at how <code>evens</code> works and how lists are represented in Haskell. Lists are represented as either an empty list, or a ''"cons"'' cell that consists of an element and the remaining list. In pseudo-Haskell, we might write:
   
 
data [a] = [] | a : [a]
 
data [a] = [] | a : [a]
   
In a lazy language, an expression is only evaluated when needed. The machinery used to implement this is called a thunk. It's essentially a value with two possible states: either a computed value, or the process to compute that value. When we assign a value in Haskell, we create a thunk with the instructions to compute the value of the expression we've assigned. When this thunk is forced, these instructions are used to compute a value which is stored in the thunk. The next time the value is required, this computed value is retrieved. Lazyness can be implemented in languages like SML using this method together with mutable references.
+
In a lazy language, an expression is only evaluated when needed, and <code>(:)</code>, a.k.a. ''"cons"'', is a lazy data constructor. The machinery used to implement this is called a thunk. It's essentially a value with two possible states: either a computed value, or the process to compute that value. When we assign a value in Haskell, we create a thunk with the instructions to compute the value of the expression we've assigned. When this thunk is forced, these instructions are used to compute a value which is stored in the thunk. The next time the value is required, this computed value is retrieved. Lazyness can be implemented in languages like SML using this method together with mutable references.
   
So in the recursive case of evens, we produce a thunk that contains a list cons cell. This cons cell contains two thunks, one of the element value, and one of the rest of the list. The thunk for the element is taken from the list the function is operating on, and the thunk for the rest of the list consists of instructions to compute the rest of the list using evens. We'd say that evens and odds are lazy in their input: they consume only enough value to produce the value. As an example of how lazy functions work, consider:
+
So in the recursive case of <code>evens</code>, we produce a thunk that contains a list cons cell. This cons cell contains two thunks, one of the element value, and one of the rest of the list. The thunk for the element is taken from the list the function is operating on, and the thunk for the rest of the list consists of instructions to compute the rest of the list using <code>evens</code>. We'd say that <code>evens</code> and <code>odds</code> are lazy in their input: they consume only enough value to produce the value. As an example of how lazy functions work, consider:
   
 
head ( 5 : undefined )
 
head ( 5 : undefined )
Line 53: Line 53:
 
take 3 $ evens (5 : undefined : 3 : undefined : 1 : undefined : undefined)
 
take 3 $ evens (5 : undefined : 3 : undefined : 1 : undefined : undefined)
   
Despite how all the inputs contain partially undefined values, all the values of the function applications are valid values. A lazy function will only diverge when a required value for its computation diverges. If you're wondering why we have two undefineds at the end of the list, recall how evens was implemented. We need two undefined cells to make sure the third case is selected: the one with two elements followed by a remainder list. Having only one undefined means that after the 1 element, the remainder of the list is undefined. Then we can't decide between the 2nd and 3rd cases. Now let's look at what happens with lazy evaluation and diverging values. Consider:
+
Despite how all the inputs contain partially undefined values, all the values of the function applications are valid values. A lazy function will only diverge when a required value for its computation diverges. If you're wondering why we have two <code>undefined</code>s at the end of the list, recall how <code>evens</code> was implemented. We need two <code>undefined</code>s to make sure the third case is selected: the one with two elements followed by a remainder list (though undefined, but needed because of the <code>take 3</code> call). Having only one <code>undefined</code> means that pattern matching <code>(1:undefined)</code> against either <code>(x:[])</code> or <code>(x:(_:xs))</code> (to choose between the 2nd and 3rd cases) will fail.
  +
  +
Now let's look at what happens with lazy evaluation and diverging values. Consider:
   
 
tail (5 : undefined)
 
tail (5 : undefined)
Line 60: Line 62:
 
take 4 $ evens ( 5 : undefined : 3 : undefined : 1 : undefined : undefined)
 
take 4 $ evens ( 5 : undefined : 3 : undefined : 1 : undefined : undefined)
   
So the application of evens to a non-trivial list results in a thunk being returned immediately. And when we ask for the first element of the list evens produces, we only evaluate the value thunk. This is why we can apply evens or odds, (and cleave for that matter,) to an infinite list. We'll implement merge_sort using cleave:
+
So the application of <code>evens</code> to a non-trivial list results in a thunk being returned immediately. And when we ask for the first element of the list <code>evens</code> produces, we only evaluate the value thunk as much as needed for it to produce its first cons cell, of which the tail is itself a thunk too. This is why we can apply <code>evens</code> or <code>odds</code> (and <code>cleave</code> for that matter) to an infinite list. We'll implement <code>merge_sort</code> using <code>cleave</code>:
   
merge_sort [] = []
+
merge_sort [] = []
 
merge_sort [x] = [x]
 
merge_sort [x] = [x]
merge_sort lst = let (e,o) = cleave lst in merge (merge_sort e) (merge_sort o) where
+
merge_sort lst = let (e,o) = cleave lst
  +
in merge (merge_sort e) (merge_sort o) where
 
merge :: (Ord a) => [a] -> [a] -> [a]
 
merge :: (Ord a) => [a] -> [a] -> [a]
 
merge xs [] = xs
 
merge xs [] = xs
 
merge [] ys = ys
 
merge [] ys = ys
merge xxs@(x:xs) yys@(y:ys) =
+
merge xs@(x:t) ys@(y:u)
| x <= y = x : merge xs yys
+
| x <= y = x : merge t ys
| otherwise = y : merge xxs ys
+
| otherwise = y : merge xs u
  +
 
You can see that this function isn't lazy. It begins by cleaving the list recursively until it is left with trivial lists, i.e. the lists with zero or one elements. These are obviously already sorted. It then uses the nested function <code>merge</code>, which combines two ordered lists while preserves the order. <small>(this function doesn't have to be nested as it does not refer to anything in the outside scope)</small> The act of partitioning the list into trivial lists before the reassembly can begin means that the entire list needs to be accessed before we can begin merging and assembling the output lists on each recursion level. In this case, we can't make a lazier solution, one that would work on an infinite list. This shouldn't be surprising, as in sorting a list, the first element out should be the least (or greatest) of ''all''. We say that <code>merge_sort</code> is strict in the array to be sorted: if an infinite list is supplied, the computation will diverge - no output will ever be produced. There are some operations that cannot be done lazily, for instance, sorting a list.
   
  +
But lazyness still comes into play with the definitions like <code>min xs = head . merge_sort $ xs</code>. In finding the minimal element this way only the necessary amount of comparisons between elements will be performed (<math>O(n)</math> a.o.t. <math>O(n \log n)</math> comparisons needed to fully sort the whole list).
You can see that this function isn't lazy. It begins by cleaving the list recursively until it is left with trivial lists, ones with zero or one elements. These are obviously already sorted. It then uses the nested function merge, which combines two ordered lists and preserves their order. The act of partitioning the list into trivial lists before reassembly can begin means the entire list needs to be processed before we can begin merging them and assembling the list. In this case, we can't make a lazier solution, one that would work on an infinite list. If this is surprising, think of this: in sorting a list, the first element out should be the least (or greatest) how are we to find this element without examining the entire list? We'd say that merge_sort is strict in the array to be sorted: if an infinite list is supplied, the computation will diverge in the sense that output will never be provided. There are some operations that cannot be done lazily, for instance, sorting a list.
 
   
We've seen the difference between a lazy function and a strict function. Lazy computing has two major appeals. The first is that only enough work is done to compute a value. The second is that we can operate in the presence of infinite and undefined data structures, as long as we don't examine the undefined parts or try to process the infinity of values.
+
We've seen the difference between a lazy function and a strict function. Lazy computing has two major appeals. The first is that only the necessary work is done to compute a value. The second is that we can operate in the presence of infinite and undefined data structures, as long as we don't examine the undefined parts or try to process the infinity of values.

Latest revision as of 06:52, 20 April 2012

Haskell Performance Resource

Constructs:
Data Types - Functions
Overloading - FFI - Arrays
Strings - Integers - I/O
Floating point - Concurrency
Modules - Monads

Techniques:
Strictness - Laziness
Avoiding space leaks
Accumulating parameter

Implementation-Specific:
GHC - nhc98 - Hugs
Yhc - JHC

Laziness: Procrastinating for Fun & Profit

To look at how laziness works in Haskell, and how to make it do efficient work, we'll implement a merge sort function. It will have the type:

 merge_sort :: (Ord a) => [a] -> [a]

We'll also need a function to split the list in two, I'll call this cleaving, and it will look like this:

 cleave :: [a] -> ([a],[a])

Let's start by implementing the cleaving function. The conventional way to split a list in merge sort is to take the first half elements off the front, and the remaining half elements after that. The problem is that finding the length of a list means additional traversing of the whole list. So instead, we'll take pairs of elements off the front. Define two functions:

 evens []  = []
 evens [x] = [x]
 evens (x:_:xs) = x : evens xs
 odds []  = []
 odds [x] = []
 odds (_:x:xs) = x : odds xs

and use them to implement cleave:

 cleave xs = (evens xs, odds xs)

Experience in a strictly evaluation language like SML or Objective CAML may lead you to write alternate versions using an accumulating parameter. Assuming that reversing the order of the elements doesn't matter, you could use this function to split the list into even and odd elements and implement the cleave function as follows:

 cleave = cleave' ([],[]) where
     cleave' (eacc,oacc) [] = (eacc,oacc)
     cleave' (eacc,oacc) [x] = (x:eacc,oacc)
     cleave' (eacc,oacc) (x:x':xs) = cleave' (x:eacc,x':oacc) xs

This initially appears to be a better implementation. It's tail recursive, and by either strictness analysis or explicitly making the accumulating parameters strict, it won't blow the stack up... Believe it or not, our first implementation was better.

Why? In order to produce the first element of either list, the second version needs to process the entire list. In a non-strict language, we could encounter an infinite list, and we'd like our function to work nicely on them. Consider the effect of:

 head . fst $ cleave [0..10000000]

With our first definition, we'll get in constant time. But with our second, we'll get it in time, and our calculation will diverge on an infinite list like [0..].

Let's look at how evens works and how lists are represented in Haskell. Lists are represented as either an empty list, or a "cons" cell that consists of an element and the remaining list. In pseudo-Haskell, we might write:

 data [a] = [] | a : [a]

In a lazy language, an expression is only evaluated when needed, and (:), a.k.a. "cons", is a lazy data constructor. The machinery used to implement this is called a thunk. It's essentially a value with two possible states: either a computed value, or the process to compute that value. When we assign a value in Haskell, we create a thunk with the instructions to compute the value of the expression we've assigned. When this thunk is forced, these instructions are used to compute a value which is stored in the thunk. The next time the value is required, this computed value is retrieved. Lazyness can be implemented in languages like SML using this method together with mutable references.

So in the recursive case of evens, we produce a thunk that contains a list cons cell. This cons cell contains two thunks, one of the element value, and one of the rest of the list. The thunk for the element is taken from the list the function is operating on, and the thunk for the rest of the list consists of instructions to compute the rest of the list using evens. We'd say that evens and odds are lazy in their input: they consume only enough value to produce the value. As an example of how lazy functions work, consider:

 head ( 5 : undefined )
 tail [undefined, 5]
 evens (5 : undefined : 3 : undefined : 1 : undefined : [])
 take 3 $ evens (5 : undefined : 3 : undefined : 1 : undefined : undefined)

Despite how all the inputs contain partially undefined values, all the values of the function applications are valid values. A lazy function will only diverge when a required value for its computation diverges. If you're wondering why we have two undefineds at the end of the list, recall how evens was implemented. We need two undefineds to make sure the third case is selected: the one with two elements followed by a remainder list (though undefined, but needed because of the take 3 call). Having only one undefined means that pattern matching (1:undefined) against either (x:[]) or (x:(_:xs)) (to choose between the 2nd and 3rd cases) will fail.

Now let's look at what happens with lazy evaluation and diverging values. Consider:

 tail (5 : undefined)
 head [undefined, 5]
 odds (5 : undefined : 3 : undefined : 1 : undefined : [])
 take 4 $ evens ( 5 : undefined : 3 : undefined : 1 : undefined : undefined)

So the application of evens to a non-trivial list results in a thunk being returned immediately. And when we ask for the first element of the list evens produces, we only evaluate the value thunk as much as needed for it to produce its first cons cell, of which the tail is itself a thunk too. This is why we can apply evens or odds (and cleave for that matter) to an infinite list. We'll implement merge_sort using cleave:

 merge_sort []  = []
 merge_sort [x] = [x]
 merge_sort lst = let (e,o) = cleave lst 
                  in merge (merge_sort e) (merge_sort o) where
     merge :: (Ord a) => [a] -> [a] -> [a]
     merge xs [] = xs
     merge [] ys = ys
     merge xs@(x:t) ys@(y:u)
         | x <= y    = x : merge t ys
         | otherwise = y : merge xs u

You can see that this function isn't lazy. It begins by cleaving the list recursively until it is left with trivial lists, i.e. the lists with zero or one elements. These are obviously already sorted. It then uses the nested function merge, which combines two ordered lists while preserves the order. (this function doesn't have to be nested as it does not refer to anything in the outside scope) The act of partitioning the list into trivial lists before the reassembly can begin means that the entire list needs to be accessed before we can begin merging and assembling the output lists on each recursion level. In this case, we can't make a lazier solution, one that would work on an infinite list. This shouldn't be surprising, as in sorting a list, the first element out should be the least (or greatest) of all. We say that merge_sort is strict in the array to be sorted: if an infinite list is supplied, the computation will diverge - no output will ever be produced. There are some operations that cannot be done lazily, for instance, sorting a list.

But lazyness still comes into play with the definitions like min xs = head . merge_sort $ xs. In finding the minimal element this way only the necessary amount of comparisons between elements will be performed ( a.o.t. comparisons needed to fully sort the whole list).

We've seen the difference between a lazy function and a strict function. Lazy computing has two major appeals. The first is that only the necessary work is done to compute a value. The second is that we can operate in the presence of infinite and undefined data structures, as long as we don't examine the undefined parts or try to process the infinity of values.