Why doesn't GHC do the following optimisation?

Wed Nov 12 14:39:36 EST 2003

Hi,

This question is purely out of curiousity - I'm no expert on this.  I
wrote the following code:

sum1 :: [Int] -> Int
sum1 = foldl (+) 0

foldl' :: (a -> b -> a) -> a -> [b] -> a
foldl' f v l = foldr (\x -> \fld -> \v' -> fld $ f v' x) id l v

sum2 :: [Int] -> Int
sum2 = foldl' (+) 0

main :: IO ()
main = do n <- readLn;
          print $ sum1 [1..n];
          print $ sum2 [1..n]

and ran it through ghc with: ghc Folds.hs -ddump-simpl -O

I was expecting the two calls to be optimized to something equivalent, or
even for the result of a single calculation to be printed twice.  However,
peering at the Core output, it seems that the two sum functions exist
separately and have different structure (sum1 appears to have a simple
recursive definition, while sum2 involves two lamba expressions).

I did all this because I read a paper introducing folds that showed how
you can express foldl in terms of foldr and giving the derivation. 
Unfortunately I don't have the paper/derivation, but I believe my foldl'
is correct (that may be the problem, of course - that sum1 and sum2 are
not equivalent either because of a coding error, or because of some
subtlety in the types, although I carefully gave them explicit types to
try and avoid that).

The derivation wasn't that complicated (althugh I find it simpler to
simply write the fold down).  If it's not an error on my part, why doesn't
ghc do the same kind of transformation itself?  Or am  mistaken in
thinking that if derivation is easy then the revrese transformation as an
optimisation should also be easy?

Thanks,
Andrew

-- 
http://www.acooke.org/andrew