[Haskell-cafe] Re: Performance Help

Mon Mar 19 07:53:19 EDT 2007

>> Fusing the ws is trickier. Directly appealing to the fibonacci-number
>> example is not recommended because this would mean to keep the last 16
>> ws in memory and shifting them right to left by hand. But as the
> 
> Are you saying this because we don't want a 16-tuple?

Exactly.

>> "Alternate method of computation" on the website suggests, you can
>> delegate the shifting to an index that shifts around mod 16. Having a
>> mutable array is helpful, then.
> 
> Are you saying that because haskell doesn't really have mutable arrays, this 
> is ruled out?

Yes, a bit. But you can use ST-arrays if you want.

>> Of course, you can also fill a large static (boxed!) array of 80 Word8s via
>>
>>   ws :: Data.Array.IArray.Array Int Word8
>>   ws = accumArray 0 (0,80) (curry snd) $
>>        zip [0..15] xs ++ [(i, xxor i) | i<-[16..80]]
>>       where
>>       xxor i = ws ! (i-16) `xor`
>>            ws ! (i-3) `xor` ws ! (i-8) `xor` ws ! (i-14)
>>
>> or something like that (I didn't care about correct indices and bounds).
>> GHC can fuse such array accumulations.
> 
> Why is an array better than a list? Is it because (!) is O(1) and (!!) is 
> O(n)?

In this case, we don't need random access anyway because the 80th
element is only available once the previous ones have been calculated.
In a sense, lists are perfectly fine. The only point of introducing
arrays is that current CPU architecture is optimized to access
contiguous blocks of memory and performs much worse on linked lists
(besides the fact that lazy evaluation introduces another tiny overhead).

Note that the above immutable array has lazy elements as well, so I'm
unsure how much benefit they bring. You'd have to try.

Maybe it's best to separate the choice between lists and arrays from the
actual algorithm with the help of a higher order function that
encapsulates the essence of evaluating a recurrence equation. Something like

  recurrence :: Int -> (Int -> a) -> ((Int -> a) -> a) -> Int -> a

that satisfies the following laws (beware, they're not suitable for
direct implementation)

  recurrence w start = recurrence w (clip w start)

  recurrence w start f k
     | 1 <= k && k <= w = start k
     | otherwise = f (clip w (\i -> recurrence w start f (k-i)))

where

  clip w f = \x -> if 1 <= x && x <= w then f x else undefined

clips a function to accept only values between 1 and w. This clipping is
a bit ad-hoc and you can play various tricks with lightweight dependent
types to define functions that are statically known to be defined on the
interval [1..w].

For example, the Fibonacci numbers can then be defined as

  fib n = recurrence 2 (\i -> [1,1] !! i) (\g -> g 1 + g 2) n

In other words, the first higher order argument is the list of starting
values and the second higher order arguments is the recurrence equation
that defines how to calculate a new element from the 1..w previous ones.

Now, you can implement 'recurrence' with lists or arrays or mutable
arrays, whatever is fastest.

>> In general, keeping stuff in lists is not wrong, but ByteStrings are
>> more adapted to current CPU and RAM architecture.
> 
> I'm not clear how ByteStrings help here. We need to put bits into 32 words and 
> operate on these.

Ah yes. I think there was some semi-published library of "Byte"-Strings
for any instances of Storable including Word32. Dons knows more.

Regards,
apfelmus