[Haskell-cafe] More on performance

Loup Vaillant loup.vaillant at gmail.com
Wed Jun 4 05:48:18 EDT 2008


[Forgot to post to the list, sorry]

2008/6/4 Duncan Coutts <duncan.coutts at worc.ox.ac.uk>:
>
> On Wed, 2008-06-04 at 09:32 +0200, Henning Thielemann wrote:
>> On Tue, 3 Jun 2008, Don Stewart wrote:
>>
>> > I wrote up the second part of the tour of understanding low level
>> > performance in GHC here,
>> >
>> >     http://reddit.com/r/programming/info/6lx36/comments/
>> >
>> > Follows on from the discussion last week about various performance
>> > related things.
>>
>> Now the difficult question: How to write the 'mean' function in terms of
>> 'sum' and 'length' while getting the same performance?
>
> There's another rather harder fusion transformation that notices when
> two left folds are demanded in the same strictness context and they fold
> over the same input list then they can be performed together.
>
> sum    = foldl (\s x -> s + x) 0
> length = foldl (\l x -> l + 1) 0
>
> mean xs = sum xs / length xs
>
> So we must note that sum and length are demanded at the same time and
> since they are both foldl's will consume the whole of xs.
>
> So we can merge the two foldl's into one just by tupling them up:
>
> sumlength = foldl (\(s, l) x -> (s + x, l + 1)) (0, 0)
>
> mean xs = s / l
>  where (s, l) = sumlength xs

I see a problem with this particular fusion, though: It changes the
space complexity of the program, from linear to constant. Therefore,
with some programs, relying on this "optimization" can be a matter of
correctness, not just performance. Therefore, if it is not clear when
the compiler will optimize this, I'd rather not use it. (A counter
example is tail calls, which are rather easily deducible, at least in
a strict context)

At least, with more simple fusions, the difference was between
stressing the GC or not. The space and time complexities of the
problem didn't change at all. Only the constants did.

Loup


More information about the Haskell-Cafe mailing list