Arrays vs Lists and Specialization

Simon Peyton-Jones simonpj@microsoft.com
Wed, 22 Jan 2003 09:22:26 -0000


Matthew

| Many spectral estimation routines are defined in terms of special
| matrices (ie, Toeplitz, etc).  Arrays defined recursively by list
| comprehensions make it easy to implement algorithms like
Levinson-Durbin
| recursion, and they look very similar to the mathematical definitions:
|=20
| > levinson r p =3D (array (1,p) [ (k, a!(p,k)) | k <- [1..p] ], =
realPart
(rho!p))
| >     where a   =3D array ((1,1),(p,p)) [ ((k,i), ak k i) | k <- =
[1..p],
i <- [1..k] ]
| >	  rho =3D array (1,p) [ (k, rhok k) | k <- [1..p] ]
| >	  ak 1 1             =3D -r!1 / r!0
| >	  ak k i | k=3D=3Di      =3D -(r!k + sum [ a!(k-1,l) * r!(k-l) | l =
<-
[1..(k-1)] ]) /
| rho!(k-1)
| >		 | otherwise =3D a!(k-1,i) + a!(k,k) * (conjugate
(a!(k-1,k-i)))
| >	  rhok 1 =3D (1 - (abs (a!(1,1)))^2) * r!0
| >	  rhok k =3D (1 - (abs (a!(k,k)))^2) * rho!(k-1)
|=20
| OK, my question then has to do with the efficiency of lists versus
| arrays.  Do the latest compilers handle handle arrays efficiently, or
| are lists really the way to go?  If there is a performace difference,
is
| it generally big enough to warrant rewriting algorithms?

Do not rewrite it!  This is a textbook application of Haskell arrays,
and they should work beautifully.  If not, we should fix it.  Using
lists will probably be much *less* efficient than arrays.

People often use arrays with algorithms derived from imperative
programming, which assume update-in-place.   They use (\\) for each
element, which copies the entire array, and get terrible performance.

Haskell arrays are designed to be build en-bloc, as you are doing it.
Your programs are lovely, and will not do redundant array copying.

It is, however, possible that GHC fails to deforest away all the
intermediate lists in the array comprehensions, but I think it does a
reasonable job.  You can use -ddump-simpl to see.


| A related question is how is specilization handled in arrays with lazy
| evaluation.  In the definition of levinson above, the a array is
defined
| in terms of the ak function.  By doing this, you save some horizontal
| space, but it also unburdens the programmer from tracking the
recursive
| dependencies.  a!(k,k) is needed before a!(i,j) can be calculated, but
| lazy evaluation takes care of this.

Yes.  The efficiency penalty is that every element is built as a thunk
and only later evaluated on demand.

| If the above function is
| specialized for r::Array Int (Complex Double) and p::Int, would I be
| correct to say that the final value of the function would be unboxed,
| but all intermediate values wouldn't?=20

No, the function returns an array, which is always boxed.  Its
*elements* may be unboxed, but only if you, the programmer, specify them
to be unboxed.  GHC will not unbox array elements automatically, because
that makes them strict, and that changes the semantics.

To use unboxed arrays, you need the UArray library (documented in GHC's
libraries).  If the algorithm does not use laziness, you can often unbox
array elements just by changing "Array" to "UArray" in the types.  Easy.
But that makes it strict, and that may make your algorithm fail.  No
magic here.

| Now, in some cases, a user may
| need all of the model orders from 1..p.  This is handled easilly
enough
| by just changing the first line. to
|=20
| > levinson r p =3D (a, fmap realPart rho)
|=20
| Would the a matrix in the tuple be unboxed with specilization?

Not sure what you mean here, but I don't think so.

| If anyone is interesting in what I have put together, I will be making
| everything public sometime next week.  I have a lot of algorithms
| implemented, but I need to clean up the documentation a bit (well, a
| lot).

Would you like to write it up as a paper for the Journal of Functional
Programming?=20

By the way, do you know of David Goblirsch's work
http://delivery.acm.org/10.1145/330000/326801/p425-goblirsch.pdf?key1=3D3=
2
6801&key2=3D1086223401&coll=3DGUIDE&dl=3DGUIDE&CFID=3D7058054&CFTOKEN=3D5=
5810940

I'm not sure where he is now. =20

Simon