List representations (was: What does FP do well? (was ...))

Alastair Reid reid@cs.utah.edu
31 May 2002 19:55:19 +0100


[copied to Zhong Shao and Cordelia Hall]

> [snip - full text included at end]
>
> But the moral for the current discussion: a more intelligent list
> representation could have substantially more benefits for the
> average Haskell program than any compiler optimization twiddling,
> and I'd really like to see someone (PhD student?) investigating that
> topic seriously, as the answers are unlikely to be obvious.
>
> [snip]

I vaguely remember Zhong Shao (Yale) and Cordelia Hall (Glasgow?)
investigating list representations for ML and Haskell respectively.

If I recall correctly (and I'm not sure I do), Zhong and Cordy were
both looking at representations which merged adjacent Cons cells
(roughly) like so:

  data List a = Nil
              | Cons1 a (List a)
              | Cons2 a a (List a)
              | Cons4 a a a a (List a)

[This is very, very rough (if not downright wrong) - partly because I
last saw this about 8 years ago and partly because I'm summarizing two
different pieces of work.]

Zhong's work [1] was in the context of a strict language (SML), which
means that you can know how long a list is as you are building it, so
you can use the Cons4 cells a lot.
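
Something like the following (again just my illustration, not
necessarily Zhong's actual scheme) is easy when the spine of the
input already exists, as it does in a strict language: greedily pack
four elements per cell and fall back to the smaller cells only at the
end of the list:

  pack :: [a] -> List a
  pack (a:b:c:d:rest) = Cons4 a b c d (pack rest)
  pack (a:b:rest)     = Cons2 a b (pack rest)
  pack (a:rest)       = Cons1 a (pack rest)
  pack []             = Nil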

Cordy's work [2] was in the context of a lazy language (Haskell),
which means that you usually don't know the length of a list (or even
whether it is finite) as you are building it.  This requires a bit of
cunning to overcome.
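
For instance, the hypothetical pack above would force the spine of
its argument four cells at a time, well ahead of any demand for the
result, and that changes which programs terminate:

  head (1 : undefined)   -- fine: (:) is lazy, the tail is never demanded
  pack (1 : undefined)   -- undefined: the four-element pattern forces
                         -- the spine past the first cell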

IIRC, the key part of that cunning was that Cordy does the most
interesting work near the tail of the list, whereas Zhong does it
near the head.

Both implemented this partly by changing the representation and partly
by changing the code their compilers generated.  I think both used type
inference as part of their compiler modifications.


-- 
Alastair Reid        reid@cs.utah.edu        http://www.cs.utah.edu/~reid/

[1] Shao, Zhong; Reppy, John H.; Appel, Andrew "Unrolling Lists" ACM
    Conference on Lisp and Functional Programming, New York, June
    1994. ACM Press.

    http://flint.cs.yale.edu/flint/publications/listrep.pdf

[2] Hall, Cordelia V. "Using Hindley-Milner Type Inference to Optimise
    List Representation" ACM Conference on Lisp and Functional
    Programming, New York, June 1994. ACM Press.

    (not available online??)


> Long away and far ago (or something like that;), there was a
> discussion on Lists implemented as arrays rather than linked
> structures, during which Jerzy Karczmarczuk commented:
> 
> > What bothers me quite strongly is the algorithmic side of operations
> > upon such objects. 
> > 
> > Typical map- (or zip-) style iterations (do something with the head,
> > recurse on the tail) would demand "intelligent" arrays, with the
> > indexing header detached from the bulk data itself. The "consumed"
> > part could not be garbage collected. In a lazy language this might
> > produce a considerable amount of rubbish which would otherwise be
> > destroyed quite quickly. The concatenation of (parts of) such lists
> > might also behave very badly.
> > 
> > Can you calm my anxiety?
> > 
> > Jerzy Karczmarczuk
> 
> The reason I wanted to reply is that I can offer one data point on
> this. An even longer time ago, there were the various reduction
> systems developed in Kiel, implementing the Kiel Reduction Language
> KiR, a variant of Berkling's reduction languages (the Berkling
> pointed to in Backus' Turing Award Lecture). KiR lacked lots of
> useful features Haskell has, and Haskell's implementations still
> lack lots of useful features KiR's had. I dearly miss those
> features, but that is not the topic here (I don't know whether any
> of the systems still install or even run, but see the Manual at
> http://www.informatik.uni-kiel.de/~base/ for more info).
> 
> The topic here was list representations. KiR's implementations moved
> from interpreted graph-reduction, through compiled graph-reduction
> with a code interpreter, to compiled graph-reduction with compilation
> via C, all more or less with the same high-level front end. All of
> these represented lists as vectors (KiR was dynamically and
> implicitly typed, btw), and the memory was divided into an area for
> fixed-size descriptors pointing to each other or into the second
> area, the heap, for variable-sized data blocks. The descriptor area
> was reference-counted and simply reused free space (most KiR variants
> were call-by-value); the other area needed compaction when space grew
> fragmented.
> 
> A list's elements (or pointers to their descriptors) went into a
> contiguous block in the heap, and the descriptors made it possible
> to share subsequences of elements between different lists
> (descriptors were large enough to hold a pointer to the start of the
> sequence and its length). Quite as Jerzy suspected. Supported
> operations included both array and list operations, append required
> the allocation of a new heap block and copying of *both* lists, but
> was provided as a primitive (the standard approach for systems that
> started out as interpreters: a good mix of efficient primitives and
> interpreted user defined constructs).
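
For concreteness, here is a rough Haskell rendering of that kind of
descriptor/slice representation (my sketch of the shape of it, not
KiR's actual code, and all the names are mine): a descriptor holds a
backing block plus an offset and a length, so taking the tail or a
subsequence shares storage, while append has to allocate and copy
both operands.  And, as Jerzy worries above, nothing in a backing
block can be reclaimed while any descriptor still points into it.

  import Data.Array

  -- a "descriptor": a backing block plus start offset and length
  data Slice a = Slice { block :: Array Int a, off :: Int, len :: Int }

  fromL :: [a] -> Slice a
  fromL xs = Slice (listArray (0, length xs - 1) xs) 0 (length xs)

  hd :: Slice a -> a                     -- no emptiness checks; a sketch
  hd s = block s ! off s

  tl :: Slice a -> Slice a               -- shares the backing block
  tl s = s { off = off s + 1, len = len s - 1 }

  toL :: Slice a -> [a]
  toL s = [ block s ! i | i <- [off s .. off s + len s - 1] ]

  app :: Slice a -> Slice a -> Slice a   -- must copy *both* operands
  app s t = fromL (toL s ++ toL t)
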
> 
> As others have pointed out, this looks rather inefficient,
> especially for longer lists, so when we set out to make measurements
> for a JFP paper [1], comparing with the likes of ghc, we expected to
> be beaten, but hoped not to be too far away, at least with the
> latest via-C implementation.
> 
> Benchmarks are always difficult, but especially so between such
> different languages: in KiR, we could easily and correctly execute
> programs that, in Haskell, either wouldn't even compile, or wouldn't
> terminate, or wouldn't show any result (with similar problems the
> other way round). And even after adapting to the common subset of
> algorithms, a translation to Haskell might return immediately from a
> complex program execution, as the compiler and runtime system lazily
> determined that none of it was needed for the program output
> (compiled Haskell programs report almost no reduction results, only
> explicit program output, or show-able results).
> 
> With all these preliminaries and caveats, and the standard
> disclaimer that all benchmarks are useless, but interesting, the
> relevant benchmark is the infamous "pretty quicksort" for some 4000
> elements (qusort in the paper - lots of finite lists, traversals,
> fragments and appends, just like the typical Haskell program written
> without any concern for efficiency; Haskell programs concerned with
> efficiency tend to look rather different).
> 
> To our astonishment, even the code-interpreting implementation
> (which should otherwise be in the ballpark of Hugs) outperformed
> ghc, hbc, and Clean on this example (call-by-value also played a
> role: compiled SML was in the same area as compiled KiR, but both
> were only slightly faster than code-interpreted KiR, so data
> representation and primitives seemed to play the main role). This
> prompted us to
> include Augustsson's sanitized variant of quicksort (qusortbin in
> the paper - from the hbc libs) as well, which gave the results
> everyone expected (it substantially modifies the algorithm to a
> profile better supported by the current list representation, e.g.,
> no appends). [and before anyone accuses me of advocating functional
> quicksort: the naive quicksort is useless, and even the real one
> isn't the best choice in many cases;-]
> 
> But the moral for the current discussion: a more intelligent list
> representation could have substantially more benefits for the
> average Haskell program than any compiler optimization twiddling,
> and I'd really like to see someone (PhD student?) investigating that
> topic seriously, as the answers are unlikely to be obvious. 
> 
> The representation chosen in the reduction systems could be a first
> hint, but as Jerzy points out, things may be more complicated in the
> context of Haskell.  For comparison, Haskell array performance was
> somewhere between non-existent and terrible in those days (another
> clear win for both the compiled and the interpreted reduction
> systems) and has only recently improved somewhat. That needs to
> continue and, please, someone do the same for lists!
> 
> Just my old 2 Pfennige (former currency;-),
> Claus
> 
> [1] D. Gaertner, W. Kluge: pi-RED+: An Interactive Compiling Graph 
>     Reduction System for an Applied Lambda-Calculus 
>     Journal of Functional Programming, 6 (5), 1996.