Ryan Ingram ryani.spam at gmail.com
Sun Nov 23 02:41:38 EST 2008

On Sat, Nov 22, 2008 at 1:20 PM, Jason Dusek <jason.dusek at gmail.com> wrote:
> Ryan Ingram <ryani.spam at gmail.com> wrote:
>> ...persistent data structures tend to have much worse constant
>> factors and those factors translate to a general 2x-3x
>> slowdown.
>
>  Can you explain why that is, or provide a citation for it?
>  It's not something I've found easy to Google.

Consider insertion into a simple binary tree (no balancing condition).

The persistent algorithm is:

data Tree = Tip | Node Key Tree Tree

insert :: Key -> Tree -> Tree
insert k Tip = Node k Tip Tip
insert k (Node k' l r)
  | k < k'    = Node k' (insert k l) r
  | otherwise = Node k' l (insert k r)

The ephemeral algorithm is:

data Tree = Tip | Node Key (IORef Tree) (IORef Tree)

insert :: Key -> IORef Tree -> IO ()
insert k p = do
  t <- readIORef p
  case t of
    Tip -> do
      l <- newIORef Tip
      r <- newIORef Tip
      writeIORef p (Node k l r)
    Node k' l r -> insert k $ if k < k' then l else r

The big difference between these two algorithms is the amount of
allocation and copying going on.  Both dereference basically the same
number of pointers.  The ephemeral algorithm allocates exactly one new
node and modifies exactly one pointer in memory.  The persistent
algorithm, on the other hand, copies the entire path being traversed
down the tree, allocating one new node per level of that path.  (All
of the "Tip" nodes can be shared; Tip plays the role of NULL in C.)
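To make the path copying concrete, here is a small sketch (my own, not from the post): a variant of the persistent insert that also returns how many Node constructors it allocated. The names (`insertCounting`, `Key = Int`) are illustrative assumptions; inserting into a spine-shaped tree shows the whole search path gets copied.

```haskell
type Key = Int
data Tree = Tip | Node Key Tree Tree

-- Persistent insert that also counts Node allocations: one fresh
-- node per level of the search path, plus the new leaf node.
insertCounting :: Key -> Tree -> (Tree, Int)
insertCounting k Tip = (Node k Tip Tip, 1)
insertCounting k (Node k' l r)
  | k < k'    = let (l', n) = insertCounting k l in (Node k' l' r, n + 1)
  | otherwise = let (r', n) = insertCounting k r in (Node k' l r', n + 1)

main :: IO ()
main = do
  -- Inserting 1..10 in order builds a right spine of depth 10;
  -- inserting 11 then copies all 10 nodes on the path and
  -- allocates 1 new leaf, for 11 allocations total.
  let t = foldl (\acc k -> fst (insertCounting k acc)) Tip [1..10]
  print (snd (insertCounting 11 t))  -- prints 11
```

The ephemeral version would have allocated exactly one node for the same insert; everything above the new leaf is mutated in place rather than copied.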

Unfortunately, I don't have any references; the 2-3x is an intuitive
number from my past experience.  It's worse for algorithms where you
need to explicitly simulate pointers with maps because the structure
is inherently ephemeral, and better for simple structures like the
aforementioned binary tree.
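As an illustration of the "simulate pointers with maps" case (a sketch of mine, not from the post; `Addr`, `poke`, `peek`, and `walk` are made-up names): a mutable doubly-linked structure can be encoded as a `Data.Map` from integer "addresses" to cells, so every pointer write becomes a map update and every dereference an O(log n) lookup.

```haskell
import qualified Data.Map as M

type Addr = Int
data Cell = Cell Char (Maybe Addr)  -- payload and "next pointer"
type Heap = M.Map Addr Cell

-- "Write through a pointer": replace the cell at an address.
poke :: Addr -> Cell -> Heap -> Heap
poke = M.insert

-- "Dereference a pointer": O(log n) instead of O(1).
peek :: Addr -> Heap -> Cell
peek a h = h M.! a

-- Follow next pointers from an address, collecting payloads.
walk :: Addr -> Heap -> String
walk a h = case peek a h of
  Cell c Nothing  -> [c]
  Cell c (Just n) -> c : walk n h

main :: IO ()
main = do
  let heap0 = M.fromList [(0, Cell 'a' (Just 1)), (1, Cell 'b' Nothing)]
      -- Splice a new cell 'x' at address 2 between 0 and 1.
      heap1 = poke 2 (Cell 'x' (Just 1))
            $ poke 0 (Cell 'a' (Just 2)) heap0
  putStrLn (walk 0 heap1)  -- prints "axb"
```

Every operation an ephemeral structure does in O(1) with a real pointer costs O(log n) here, which is where the larger constant factors for inherently ephemeral structures come from.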

-- ryan