[Haskell-cafe] Mutable data design question

ajb at spamcop.net ajb at spamcop.net
Sun Dec 5 23:55:25 EST 2004


G'day all.

Quoting Marcin 'Qrczak' Kowalczyk <qrczak at knm.org.pl>:

> 2. Use a persistent data structure with logarithmic cost of most
>    operations: a balanced tree of text fragments, called a rope
>    (Hans Boehm has made one for C). "Undo" can be made by simply
>    keeping old versions.
>
>    Hard to implement the core data structure. If done right, the rest
>    is easy, in particular "undo" handling is very robust. There are
>    some overheads for all operations, but the cost of operations
>    scales to extreme cases.
>
> The second way is more interesting, but I don't know how to implement
> a rope in details.

It's quite easy, actually.

Leaves of the tree are traditionally arrays of chars (with one big array
representing the "original" file), though you could also have something
that represents a lazily read file on disk.

Internal nodes come in two varieties:

    - "Restrictions" have one child.  A restriction represents a
      subsequence of the child node.
    - "Concatenations", which have two children.  A concatenation
      represents two children pasted together.

Using only these, you can implement any editing operation.  A deletion,
for example, is done by restricting the node twice and then pasting the
two restrictions together.

To keep the tree managable, you need to a) eagerly push restrictions as
far down the tree as possible, and b) rebalance.  Rather than using a
traditional balanced binary tree, it's easier to keep track of the
depth of the tree and restructure when it gets too deep.

Note that ropes are optimised for _large_ edits.  So if you're actually
writing a text editor, treating the line currently being edited differently
seems worth it.

Cheers,
Andrew Bromage


More information about the Haskell-Cafe mailing list