# Parallel Haskell: 2-year project to push real world use

Christian Höner zu Siederdissen choener at tbi.univie.ac.at
Mon May 3 21:25:27 EDT 2010

```* Roman Leshchinskiy <rl at cse.unsw.edu.au> [04.05.2010 02:32]:
> On 04/05/2010, at 09:21, Christian Höner zu Siederdissen wrote:
>
> > Hi,
> >
> > on that topic, consider this (rather trivial) array:
> >
> > a = array (1,10) [ (i,f i) | i <-[1..10]] where
> >  f 1 = 1
> >  f 2 = 1
> >  f i = a!(i-1) + a!(i-2)
> >
> > (aah, school ;)
> >
> > Right now, I am abusing vector in ST by doing this:
> >
> > a <- new
> > a' <- freeze a
> > forM_ [3..10] \$ \i -> do
> >  write a (a'!(i-1) + a!(i-2))
> >
> > Let's say I wanted to do something like this in dph (or repa), does that
> > work? We are actually using this for RNA folding algorithms that are at
> > least O(n^3) time. For some of the more advanced stuff, it would be
> > really nice if we could "just" parallelize.
>
> Do you really just need a prefix sum? These are easily parallelisable if the operator is associative. For instance, you could implement the Fibonacci sequence as:
>
> mapP fst \$ scanP (\(a,b) _ -> (a+b,a)) (1,0) \$ replicateP n (0,0)
>
> and DPH would parallelise it. That's how I would write the above with vector as well.

That is, kind of, the fun part: you have

(1) a number of vectors whose values depend on each other (bad!)
(3) rather trivial calculations for each element (mostly:
sum, minimum, fold, map, backpermute) (good!), we have simple semiring
calculations here

>
> > To summarise: I need arrays that allow in-place updates.
>
> In-place updates + parallelism = bad! That's oversimplifying, of course. But the transformations underlying DPH, for instance, simply don't work in the presence of side effects.

The thing is, you can write the algorithm in a way such that each
operation on index "k" (whatever dimension k has) only requires access
to values <=k and those values will never change again. The problem is
that more than one vector is involved making it less fun to write code

>
> > Otherwise, most libraries that do heavy stuff (O(n^3) or worse) are
> > using vector right now. On a single core, it performs really great --
> > even compared to C-code that has been optimized a lot.
>
> That's great to know! Do you (or anyone else) by any chance have any benchmarks you could share? At the moment, I'm only benchmarking vector with a couple of rather simplistic algorithms which is a bit of a problem.

I can make my libraries available under GPLv3, they just need a bit of
love. This gives you a moderately complex algorithm for which there is,
too, a highly optimized C version (RNAfold -d2, in the vienna rna
package).
I am giving a talk wednesday, after that I'll prepare the libraries for
hackage.

Gruss,
Christian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
```