Personal tools

Performance/Accumulating parameter

From HaskellWiki

< Performance(Difference between revisions)
Jump to: navigation, search
(Made clear what was actually going on after the discussion in "Debunking Tail Recursion.")
(Haskell code markup)
Line 5: Line 5:
 
Often times you will write a recursive function which is almost tail recursive. Consider this na&iuml;ve implementation of a function to compute the length of a list.
 
Often times you will write a recursive function which is almost tail recursive. Consider this na&iuml;ve implementation of a function to compute the length of a list.
   
  +
<haskell>
 
len :: [a] -> Int
 
len :: [a] -> Int
 
len [] = 0
 
len [] = 0
 
len (x:xs) = len xs + 1
 
len (x:xs) = len xs + 1
  +
</haskell>
   
 
Compile and run a simple program like the following:
 
Compile and run a simple program like the following:
   
  +
<haskell>
 
main = putStrLn $ show $ len [1..200000]
 
main = putStrLn $ show $ len [1..200000]
  +
</haskell>
   
 
Profiling the above, I found that it took 0.20 seconds to run. Increasing that number to 1000000 results in a stack overflow.
 
Profiling the above, I found that it took 0.20 seconds to run. Increasing that number to 1000000 results in a stack overflow.
Line 19: Line 23:
 
We can move this increment step into an accumulating parameter. This is an extra parameter that allows us to carry information along in the computation.
 
We can move this increment step into an accumulating parameter. This is an extra parameter that allows us to carry information along in the computation.
   
  +
<haskell>
 
len' :: [a] -> Int -> Int
 
len' :: [a] -> Int -> Int
 
len' [] acc = acc
 
len' [] acc = acc
 
len' (x:xs) acc = len' xs (1 + acc)
 
len' (x:xs) acc = len' xs (1 + acc)
  +
</haskell>
   
 
Now the function is tail recursive. The accumulating parameter is returned in some form by the base case, and in the inductive case, we perform the step that prevented us from being truly tail recursive. We should provide a wrapper function that hides the detail of that accumulating parameter from users of the function.
 
Now the function is tail recursive. The accumulating parameter is returned in some form by the base case, and in the inductive case, we perform the step that prevented us from being truly tail recursive. We should provide a wrapper function that hides the detail of that accumulating parameter from users of the function.
   
  +
<haskell>
 
len xs = len' xs 0
 
len xs = len' xs 0
  +
</haskell>
   
 
Running the above program with the new definition gives me a runtime of 0.02 seconds.
 
Running the above program with the new definition gives me a runtime of 0.02 seconds.
Line 35: Line 43:
 
We need to fix a small problem with regards to the stack overflow. The function would accumulate (1 + acc) thunks as we pass down the list. In general, it makes sense to make your accumulator parameters strict, since its value will be needed at the very end. A better implementation would be:
 
We need to fix a small problem with regards to the stack overflow. The function would accumulate (1 + acc) thunks as we pass down the list. In general, it makes sense to make your accumulator parameters strict, since its value will be needed at the very end. A better implementation would be:
   
  +
<haskell>
 
len' [] acc = acc
 
len' [] acc = acc
 
len' (x:xs) acc = len' xs $! (1 + acc)
 
len' (x:xs) acc = len' xs $! (1 + acc)
  +
</haskell>
   
 
This explicitly evaluates 1 + acc before performing the tail recursion. This prevents the stack overflow.
 
This explicitly evaluates 1 + acc before performing the tail recursion. This prevents the stack overflow.

Revision as of 08:12, 15 June 2007

Haskell Performance Resource

Constructs:
Data Types - Functions
Overloading - FFI - Arrays
Strings - Integers - I/O
Floating point - Concurrency
Modules - Monads

Techniques:
Strictness - Laziness
Avoiding space leaks
Accumulating parameter

Implementation-Specific:
GHC - nhc98 - Hugs
Yhc - JHC

Accumulating Paramaters: Getting rid of the 'almost' in "almost tail recursive"

Often times you will write a recursive function which is almost tail recursive. Consider this naïve implementation of a function to compute the length of a list.

  len :: [a] -> Int
  len [] = 0
  len (x:xs) = len xs + 1

Compile and run a simple program like the following:

  main = putStrLn $ show $ len [1..200000]

Profiling the above, I found that it took 0.20 seconds to run. Increasing that number to 1000000 results in a stack overflow.

The above function seems to delegate most of the actual work to the recursive case. The only thing keeping it from being tail recursive is the requirement to increment the length of the remainder of the list.

We can move this increment step into an accumulating parameter. This is an extra parameter that allows us to carry information along in the computation.

  len' :: [a] -> Int -> Int
  len' [] acc = acc
  len' (x:xs) acc = len' xs (1 + acc)

Now the function is tail recursive. The accumulating parameter is returned in some form by the base case, and in the inductive case, we perform the step that prevented us from being truly tail recursive. We should provide a wrapper function that hides the detail of that accumulating parameter from users of the function.

  len xs = len' xs 0

Running the above program with the new definition gives me a runtime of 0.02 seconds.

The performance gain came from the tail recursive implementation. The problem with the stack overflow was the result of keeping a long list of computations to do after "this next step." The tail recursive version eliminated the need to store all these computational intermediaries.

Accumulating parameters is merely a means to turn an almost tail recursive implementation into a tail recursive implementation. The pattern to apply this technique to are ones which involve a tail recursion and a cons step. This latter step is performed on the accumulator and then passed into the tail recursive step.

We need to fix a small problem with regards to the stack overflow. The function would accumulate (1 + acc) thunks as we pass down the list. In general, it makes sense to make your accumulator parameters strict, since its value will be needed at the very end. A better implementation would be:

  len' [] acc = acc
  len' (x:xs) acc = len' xs $! (1 + acc)

This explicitly evaluates 1 + acc before performing the tail recursion. This prevents the stack overflow.