[Haskell-beginners] Functional Data Structure with Cheapest Modification?

Daniel Fischer daniel.is.fischer at web.de
Tue Apr 6 15:51:47 EDT 2010


Am Dienstag 06 April 2010 21:31:30 schrieb Travis Erdman:
> In Real World Haskell, the authors state that "the attraction of a tree
> to a functional programmer is cheap modification."  They go on to
> explain that in a tree of say 10,000 nodes, approx 9,985 would be reused
> (on average) when updating a single node (rather than having to copy all
> 10k nodes, like a Haskell list would have to do).
>
> In my application, I have a collection of (only) about 200 objects.  So
> from the surface, not large at all.  However, each object in the
> collection is a large tree that might be up to 20 MB in size.  The
> application pulls objects one at a time from the collection, performs
> some processing, and delivers a new, updated object that needs to be
> inserted in the collection for future processing (the original object is
> now obsolete).  I'm guessing a Data.IntMap might involve copying over
> about 8 objects for each single object update, which while better than
> copying all 200, is still not perfect.
>
> What would be the most efficient Haskelly way to maintain my collection?
>
> I'm wondering also if there is some sort of "pointer list" I could use. 
> I think recreating a list of 200 pointers (199 of which are unchanged)
> might be a pretty efficient solution, if it exists.
>
> Sadly, I've tried to understand Mutable Arrays (of a few different
> flavors), but all the monadic stuff is over my head at the moment, and I
> couldn't find sufficient documentation/tutorial that I could understand
> to implement them.
>

An immutable array should be fine for this. When you modify one tree, you 
create a new array of 200 pointers to trees, 199 of them are just copied 
from the old array, one points to your new modified tree.

Of course, with STArrays, there would be no need to copy any pointers and 
allocate new arrays, so in principle it should be even more efficient, but 
boxed mutable arrays and the garbage collector don't play too well together
(http://hackage.haskell.org/trac/ghc/ticket/650), so it might not be 
better.


More information about the Beginners mailing list