[Haskell-cafe] Structural sharing in haskell data structures?

Wed May 13 18:58:00 EDT 2009

Jan-Willem Maessen wrote:
> I wanted to clear up one misconception here...
> 
> wren ng thornton wrote:
> > In heavily GCed languages like Haskell allocation and collection is 
> > cheap, so we don't mind too much; but in Java and the like, both 
> > allocation and collection are expensive so the idea of cheap throwaway 
> > objects is foreign.
> 
> Not true!

I was speaking of Java, not Clojure. I believe the costs in Java are 
well documented, though I don't know enough about the JVM to know where 
the blame belongs. (All I know of Clojure is that it's a Lisp-like on 
the JVM :)

> If you look at the internals of Clojure, you'll discover 
> they're using trees with *very* wide fanout (eg fanout-64 leaf trees for 
> lists).  Why?  Because it's so cheap to allocate and GC these 
> structures!  By using shallow-but-wide trees we reduce the cost of 
> indexing and accessing list elements.  I suspect you'd still be 
> hard-pressed to support this kind of allocation behavior in any of the 
> present Haskell implementations, and Haskell implementations of the same 
> kinds of structures have limited fanout to 2-4 elements or so.

I was under the impression that the reason datastructures in Haskell 
tend to be limited to 4-fanout had more to do with the cleanliness of 
the implementations--- pattern matching on 64-wide cells is quite ugly, 
as is dealing with the proliferation of corner cases for complex 
structures like finger trees, patricia trees, etc. The use of view 
patterns could clean this up significantly. On the other hand, we do 
have things like lazy ByteStrings and UVector which do have wide fanouts.

-- 
Live well,
~wren