[Haskell-beginners] Re: question about styles of recursion

Fri Mar 27 08:55:59 EDT 2009

On Thu, Mar 26, 2009 at 03:23:01PM -0700, Michael Mossey wrote:
>
> But back to the gist of my question last night: I am aware that most 
> examples of recursion presented in the books so far do their processing "as 
> the recursion unwinds." In other words:
>
> length :: [a] -> Int
> length [] = 0
> length (x:xs) = 1 + length xs
>
> This function goes deeper and deeper into the recursion before doing any 
> calculation, then does all the sums "on the way back out."

Right, this is equivalent to 

  length = foldr (+) 0

and results in expressions like (1 + (1 + (1 + (1 + ...)))), which
isn't good in this particular case, since none of the additions can be
performed until the very end of the list is reached, and all the sums
are indeed done "on the way back out".  There are some cases, however
(such as when the result is some data structure that can be computed
lazily, like another list) when this is exactly what you want.

> Being an imperative programmer, I keep trying to write loops that 
> accumulate "on the way down", like this:
>
> length' :: Int -> [a] -> Int
> length' s [] = s
> length' s (x:xs) = length' (s+1) xs
>
> length :: [a] -> Int
> length xs = length' 0 xs

And this is equivalent to

  length = foldl (+) 0

and results in expressions like (((((0 + 1) + 1) + 1) + 1) + ... ).
This looks better at first glance, since the sums can start
accumulating as you go.  However, since Haskell is lazy, this
particular version is no better, because the sums won't be evaluated
until the result is needed anyway: instead of accumulating a number,
you end up accumulating a giant thunk (unevaluated expression) which
is only evaluated when its result is finally needed, after the call to
length has already finished!  So as long as you don't call length on
really long lists, you might as well use your first version---if
you're going to blow the stack on long lists anyway, you might as well
do it in a more natural style. =)   But read on...

What we'd like is some way to force the accumulator to be evaluated as
we recurse down the list---and this is exactly what the foldl'
function (from Data.List) does, by introducing a bit of strictness.
So the best way to write length is actually

  length = foldl' (+) 0.

Whenever you see this 'accumulator' pattern over lists---some
recursive function which recurses over a list while accumulating some
small summary value---think foldl'.

> I suppose both forms are valid, but the first form seems to be most natural 
> in most situations I've encountered in the exercises. 

The first is indeed more natural.  Generally speaking, if you find
yourself using accumulating parameters, there's probably a simpler way
to do it, or some library function that already does exactly what you
want to do.  But it takes experience to learn and recognize such
situations.

-Brent