[Haskell-cafe] not enough fusion?

Lorenzo Bolla lbolla at gmail.com
Mon Jun 25 13:04:06 CEST 2012


You are right: I probably didn't, since I cannot reproduce it now.
Sorry for the noise.
(Anyway, I am still surprised that list-comprehension gives a different
result from do-notation in the list monad.)
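
For what it's worth, a list comprehension and the corresponding do-block in the list monad desugar to the same thing (both become nested concatMaps), so they should always agree. A minimal check (the names sumComprehension/sumDo are just illustrative):

```haskell
-- A list comprehension and the equivalent do-block in the list monad
-- should produce identical results; both desugar to concatMap.
sumComprehension :: Int -> Int
sumComprehension n = sum [gcd x y | x <- [0 .. n-1], y <- [0 .. n-1]]

sumDo :: Int -> Int
sumDo n = sum $ do
  x <- [0 .. n-1]
  y <- [0 .. n-1]
  return (gcd x y)

main :: IO ()
main = print (sumComprehension 100 == sumDo 100)
```

Whether the two compile to equally fast code after optimization is a separate question from whether they compute the same list.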

L.



On Mon, Jun 25, 2012 at 11:55 AM, Dmitry Olshansky <olshanskydr at gmail.com> wrote:

> In my test it works ~20% faster than s2 and ~20% slower than s1.
> Did you use -O2 flag?
>
>
> 2012/6/25 Lorenzo Bolla <lbolla at gmail.com>
>
>> I wonder why this performs really badly, though (I would expect it to be
>> the same as s2):
>>
>> s3 :: Int -> Int
>> s3 n = sum [gcd x y | x <- [ 0 .. n-1 ], y <- [ 0 .. n-1 ]]
>>
>> From the links posted by Dmitry, it might be that the generated code
>> consists of two recursive calls: in fact, what I observe is a "stack
>> space overflow" error at runtime...
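
If the overflow comes from sum building a long chain of lazy thunks over the unfused comprehension, a strict left fold over the same list sidesteps it. A sketch (s3' is my name; this says nothing about what GHC actually generates for s3):

```haskell
import Data.List (foldl')

-- Same double loop as s3, but summed with a strict left fold so the
-- accumulator is forced at each step instead of accumulating thunks.
s3' :: Int -> Int
s3' n = foldl' (+) 0 [gcd x y | x <- [0 .. n-1], y <- [0 .. n-1]]

main :: IO ()
main = print (s3' 100)
```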
>>
>> L.
>>
>>
>>
>>
>> On Mon, Jun 25, 2012 at 10:09 AM, Dmitry Olshansky <olshanskydr at gmail.com> wrote:
>>
>>> s1 ~ sum $ map (sum . flip map [0..n] . gcd) [0..n]
>>> s2 ~ sum $ concatMap (flip map [0..n] . gcd) [0..n]
>>>
>>> There are some posts from Joachim Breitner investigating fusion for
>>> concatMap:
>>>
>>> http://www.haskell.org/pipermail/haskell-cafe/2011-December/thread.html#97227
>>>
>>>
>>>
>>> 2012/6/25 Johannes Waldmann <waldmann at imn.htwk-leipzig.de>
>>>
>>>> Dear all,
>>>>
>>>> while doing some benchmarking (*)
>>>> I noticed that function  s1  is considerably faster than  s2
>>>> (but I wanted  s2  because it looks more natural)
>>>> (for n = 10000,  s1 takes 13 s, s2 takes 20 s; compiled by ghc-7.4.2
>>>> -O2)
>>>>
>>>> s1 :: Int -> Int
>>>> s1 n = sum $ do
>>>>        x <- [ 0 .. n-1 ]
>>>>        return $ sum $ do
>>>>            y <- [ 0 .. n-1 ]
>>>>            return $ gcd x y
>>>>
>>>> s2 :: Int -> Int
>>>> s2 n = sum $ do
>>>>      x <- [ 0 .. n-1 ]
>>>>      y <- [ 0 .. n-1 ]
>>>>      return $ gcd x y
>>>>
>>>> I was expecting that in both programs,
>>>> all lists will be fused away (are they?)
>>>> so the code generator essentially can produce straightforward
>>>> assembly code (no allocations, no closures, etc.)
>>>>
>>>>
>>>> For reference, I also wrote the equivalent imperative program
>>>> (two nested loops, one accumulator for the sum)
>>>> (with the straightforward recursive gcd)
>>>> and runtimes are (for same input as above)
>>>>
>>>> C/gcc: 7.3 s , Java: 7.7 s, C#/Mono: 8.7 s
>>>>
>>>>
>>>> So, they sort of agree with each other, but disagree with ghc.
>>>> Where does the factor 2 come from? Lists? Laziness?
>>>> Does  ghc  turn the tail recursion (in gcd) into a loop? (gcc does).
>>>> (I am looking at  -ddump-asm  but can't quite see through it.)
>>>>
>>>>
>>>> (*) benchmarking to show that today's compilers are clever enough
>>>> such that the choice of paradigm/language does not really matter
>>>> for this kind of low-level programming.
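
For comparison, here is what full fusion would have to produce if written by hand: two nested tail-recursive loops with a strict accumulator and no intermediate lists, mirroring the imperative C/Java versions. This is a sketch of the hoped-for result, not a claim about what GHC emits for s1 or s2:

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hand-fused equivalent of s1/s2: strict nested loops, no lists,
-- one accumulator threaded through both levels.
sLoop :: Int -> Int
sLoop n = outer 0 0
  where
    outer !x !acc
      | x >= n    = acc
      | otherwise = outer (x + 1) (inner x 0 acc)
    inner !x !y !acc
      | y >= n    = acc
      | otherwise = inner x (y + 1) (acc + gcd x y)

main :: IO ()
main = print (sLoop 100)
```

With -O2, GHC compiles tail calls like these into jumps, so timing this against s1/s2 would show how much of the gap is due to list overhead rather than code generation for gcd.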
>>>>
>>>> _______________________________________________
>>>> Haskell-Cafe mailing list
>>>> Haskell-Cafe at haskell.org
>>>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>>>
>>>
>>>
>>>
>>
>

