[Haskell-cafe] Problems with strictness analysis?

wren ng thornton wren at freegeek.org
Mon Nov 3 19:37:19 EST 2008


Patai Gergely wrote:
> Hi everyone,
> 
> I was experimenting with simple accumulator functions, and found that an
> apparently tail recursive function can easily fill the stack. Playing
> around with ghc and jhc gave consistently unpleasant results. Look at
> this program:
> 
> %%%%%%%%%%%
> 
> -- ghc: no, ghc -O3: yes, jhc: no
> isum 0 s = s
> isum n s = isum (n-1) (s+n)

This is tail recursive, and will be optimized to an iterative loop; 
however, the result of this function is a thunk ((...(s+n)+n')...+1) 
which can be potentially quite large. Pulling on this thunk is what 
causes the overflow, since (+) is strict in both arguments and can't 
return partial answers[1]. This function ---or variants like summing a 
list--- seems to be the canonical pitfall for people trying to use tail 
recursion to reason about laziness.

In terms of having a compiler 'smart enough', it's not clear that 
functions of this sort ought to be inferred strict simply because the 
accumulator is ultimately returned to the caller. Consider for example:

  > f 0 xs = xs
  > f n xs = f (n-1) (replicate n n ++ xs)

Since (++) can indeed return partial answers, it's fine for the 
accumulator to be lazy. Indeed, making it strict harms performance 
significantly. Another example is when the accumulator is a function, as 
is common in using tail recursion and CPS together.


The only way, really, to infer that the accumulator should be strict is 
if we know 'enough' about the function used to construct the accumulator 
in order to infer that amortizing the reduction at every iteration is 
equivalent to or better than deferring it. I'm not sure that this can be 
done in general without (an expanded type system and) user annotations. 
Personally I think strictness annotations are a lot clearer than having 
the type system express all strictness relations or distinguishing data 
and codata. Laziness is not the enemy, it merely illuminates the 
question of amortizing costs which eager languages obscure.


[1] Assuming you're not using Peano integers or some other lazy encoding 
in lieu of the built-in Num types.

-- 
Live well,
~wren


More information about the Haskell-Cafe mailing list