Unexpected boxing in generated code

Tue Aug 7 15:52:06 EDT 2007

Simon Peyton-Jones wrote:
> | I've got an inner loop that I think I can see is strict in the Int
> | argument being passed around, but that GHC 6.6.1 isn't unboxing. In the
> | following example both functions take a GHC.Base.Int, which I think
> | should be an Int#.
> 
> OK this is an interesting one. Here's the smallest program that demonstrates the problem.
> 
> foreign import ccall unsafe "stdio.h getchar" getchar :: IO CInt
> 
> f56 :: State# RealWorld -> Int -> Int
> f56 s v2 = case (unIO getchar s) of
>            (# s' , v6  #) ->
>               case v2 of I# _ -> f56 s' v2
> 
> GHC says this is lazy in v2, which it obviously isn't.  Why?  Because there's a special hack (introduced after an earlier bug report) in the strictness analyser to account for the fact that a ccall might exit the program.  Suppose instead of calling 'getchar' we called 'exit'!  Then f56 is not strict in v2 any more.
> 
> Here was a larger program that demonstrated the problem:
> 
>         do { let len = <expensive> ;
>            ; when (...) (exitWith ExitSuccess)
>            ; print len }
> 
> Suppose exitWith doesn't exit; it loops or returns. Then 'len' is sure to be evaluated, and GHC will evaluate it before the 'when'.

exitWith in fact doesn't exit: it raises the exit exception, which is 
caught by the top-level exception handler, which finally arranges to exit. 
   So I imagine the strictness analyser inferred that exitWith returns 
bottom, and hence it was justified in evaluating len first.

This doesn't seem specific to exit, to me.  Throwing any exception would 
trigger this behaviour.  Indeed, since we're in the IO monad, I might 
reasonably expect to have greater control over the evaluation order, and 
perhaps GHC is right - the strictness analyser should not cause something 
to be evaluated earlier than normal if that means moving it past a possible 
effect.  In fact this behaviour seems to be essential if we are to be able 
to use lazy I/O in a sensible way, because otherwise lazy I/O can be 
evaluated earlier than we expect:

   do
     s <- getContents
     putStr "prompt:"; hFlush stdout
     case s of ...

We are sure to evaluate s, but we better not do it before the putStr (I'm 
sure the strictness analyser won't do this right now, because it won't 
infer that putStr returns, but imagine some simpler IO instead).

I'm not quite sure what to make of this.  On the one hand it's ugly, 
because we're forced into an evaluation order.  But even if it weren't for 
lazy I/O, I am tempted to think that the IO monad ought to restrict 
evaluation order, if only so that we can have more control when we want it. 
  So perhaps GHC is doing the right thing.

Cheers,
	Simon