[Haskell] Zipper as a delimited continuation

oleg at pobox.com oleg at pobox.com
Wed Apr 27 19:17:04 EDT 2005


This is the first part of a reply to a query about a zipper with two
foci, posted on this list by Oktaviandi Hadi Nugraha on Apr 13. In
this part we introduce the framework to answer the question.

Our treatment of zipper is quite different from that of Huet (JFP,
1997) and Hinze and Jeuring (JFP 2001). Our zipper is polymorphic over
the data structure to traverse, and the zipper creation procedure is
generic and does not depend on the data structure at all.  Different
data structures or different realizations of the same abstract data
structure can use the same zipper and the same zipper creation and
manipulation functions. Our zipper type depends only on the interface
(but not the implementation!) of a traversal function. Our zipper is a
derivative of a traversal function rather than that of a data
structure itself.

Zipper is a construction that lets us replace an item deep in a
complex data structure, e.g., a tree or a term, without any
mutation. The resulting data structure will share as much of its
components with the old structure as possible. The old data structure
is still available (which can be handy if we wish to 'undo' the
operation later on). Zipper is essentially an `updateable' and yet
pure functional cursor into a data structure.

Zipper is also a delimited continuation reified as a data
structure. In this message, we use delimited continuations directly to
derive the zipper. We will be relying on the delimited continuation
framework by R. Kent Dybvig, Simon Peyton-Jones and Amr Sabry. Their
framework supports shift/reset with multiple prompts and fully
polymorphic answertypes. We will be using a slightly modified version
of the framework, where the CC monad is a transformer rather than the
stand-alone monad.

> {-# OPTIONS -fglasgow-exts #-}
> module Zipper0 where
>
> -- Part of the Dybvig, Sabry, Peyton-Jones' library
> -- adjusted from being a monad to be a monad transformer
> import CC_2CPST
> import Control.Monad.Identity
> import Control.Monad.Trans

We will be using the familiar shift/reset control operators,
implemented as follows:

> promptP :: Monad m => (Prompt r a -> CC r m a) -> CC r m a
> promptP f = do p <- newPrompt; pushPrompt p (f p)
>
> shiftP :: Monad m => 
> 	  Prompt r b -> ((CC r m a -> CC r m b) -> CC r m b) -> CC r m a
> shiftP p f = letSubCont p $ \sk -> 
>                pushPrompt p (f (\c -> 
>                  pushPrompt p (pushSubCont sk c)))

We need multiple prompts in the second part to be posted later.


The derivation of the zipper starts with a term traversal function. The
zipper will be as good and powerful as the traversal function. Let us
adopt as a running example the familiar and dear data structure:

> data Term = Var String | A Term Term | L String Term 
>             deriving (Eq,Show)

In this message, we chose the following function to traverse the lambda-term:

> data Direction = Down | DownRight | Up | Next deriving (Eq, Show)
> traverse :: (Monad m)=> (Term -> m (Maybe Term, Direction)) -> Term -> m Term
> traverse tf term = do
> 		   (term', direction) <- tf term
> 		   let new_term = maybe term id term'
> 		   let select Up t = return t
> 		       select Next t@(Var _) = return t
> 		       select dir t@(L v t1) | dir == Next || dir == Down
> 			   = do
> 			     t1' <- traverse tf t1
> 			     return $ L v t1'
> 		       select DownRight t@(A t1 t2) =
> 			   do
> 			   t2' <- traverse tf t2
> 			   return $ A t1 t2'
> 		       select dir t@(A t1 t2) | dir == Next || dir == Down
> 			   = do
> 			     t1' <- traverse tf t1
> 			     t2' <- traverse tf t2
> 			     return $ A t1' t2'
> 		   select direction new_term

The function `traverse' receives the traversal function `tf' and the
lambda-term, and walks and _updates_ the term, as guided by `tf'.  The
traversal function receives the current subterm and should return a
pair. If the first component of a pair is (Just t'), then t' replaces
the current subterm. The second component of tf's result is the
direction to proceed. The direction Next means proceed in the
depth-first order. Other directions may be used to skip some parts of
the term and so avoid walking the whole term. The function is written
in a monadic style (for an arbitrary monad m). We shall need that
later.

We will be using the following term as a running example:

> -- P2 numeral
> term1 = L "f" (L "x" (A (A f (L "f" (A f (L "f" (L "x" x)))))
> 			  (A (A f (L "f" (L "x" x))) x)))
> 	 where [f,x] = map Var ["f","x"]

The first test simply traverses the whole term, makes no alterations
and returns the (copy of the) term:

> testt1 = runIdentity (traverse (\term -> return (Nothing,Next)) term1)

-- *Zipper0> testt1 == term1
-- True

To make sure that we really traverse the term, we can print out all
the encountered subterms:

> testt2 = traverse tf term1
>     where tf term = print term >> return (Nothing,Next)


We instantiate `traverse' for the IO monad this time. We can skip some
parts of the term during the traversal:

> testt3 = traverse tf term1
>     where 
>     tf term@(A (Var "f") _)  = do
> 			         print "cutting" >> print term
> 			         return (Nothing,Up)
>     tf term = print term >> return (Nothing,Next)

and we can modify the term

> testt4 = runIdentity (traverse tf term1)
>     where tf (L "x" (Var "x")) = return (Just (L "y" (Var "y")),Next)
> 	    tf _ = return (Nothing,Next)

which indeed returns term1 with all occurrences of (L "x" (Var "x"))
replaced with (L "y" (Var "y")).


Now we an introduce the zipper:

> data Zipper r m term dir = 
>     Zipper term (CC r m (Maybe term, dir) -> CC r m (Zipper r m term dir)) 
>   | ZipDone term

It is indeed polymorphic over the term to traverse (as well over the
source monad 'm'). We can use this Zipper, as it is, for _any_
data structure that can be traversed by a function that looks like
`traverse'.

Creating the zipper equally generic:

> zip'term :: (Monad (CC r m), Monad m) =>
> 	    ((term -> CC r m (Maybe term, dir)) -> term -> CC r m term)
> 	    -> term -> CC r m (Zipper r m term dir)
> zip'term trav term = promptP (\p -> trav (tf p) term >>= (return . ZipDone))
>     where tf p term = shiftP p (\k -> return (Zipper term k))

Both Zipper and its creation function depend neither on the
representation of the term nor on the traversal strategy.  All the
information about the data structure and its traversal is
encapsulated in one single function `trav'.

We can now examine term1, subterm by subterm, using the cursor,
zipper, rather than the enumerator, traverse:

> zip'through (ZipDone term) = liftIO (print "Done" >> print term)
> zip'through (Zipper term k) = do
> 			        liftIO $ print term
> 			        k (return (Nothing,Next)) >>= zip'through
> tz1 :: IO () = runCC (zip'term traverse term1 >>= zip'through)

In a sense, we have `inverted' the operation of term traversal. With
the cursor, we can keep arbitrary state from one traversal step to
another.

The cursor provided by zipper is updateable. We can now descend (or
walk) to the desired node, replace it, and zip up the result to yield
the updated term:

> zip'move dir (Zipper term k) = 
>     liftIO (print dir >> print term) >> k (return (Nothing,dir))

> zip'upr (Zipper term k) nt = do 
>        liftIO (print term >> print "replacing with" >> print nt)
>        k (return (Just nt,Up))
> zip'all'the'way'up (ZipDone term) = return term
> zip'all'the'way'up (Zipper term k) = 
>     k (return (Nothing,Up)) >>= zip'all'the'way'up
>
> tz2 :: IO () 
>     = runCC (
> 	 do
>	 z <- zip'term traverse term1
>	 z1 <- zip'move Next z
>	 z1 <- zip'move Next z1
>	 z2@(Zipper (A _ _) k) <- zip'move DownRight z1
>	 res <-  zip'upr z2 (A (Var "x") (Var "x")) >>= zip'all'the'way'up
>	 liftIO $ (print "Result" >> print res)
> 	 --zip'through z1
> 	 )

*Zipper0> tz2
Next
L "f" (L "x" (A (A (Var "f") (L "f" (A (Var "f") (L "f" (L "x" (Var "x"))))))
                (A (A (Var "f") (L "f" (L "x" (Var "x")))) (Var "x"))))
Next
L "x" (A (A (Var "f") (L "f" (A (Var "f") (L "f" (L "x" (Var "x"))))))
         (A (A (Var "f") (L "f" (L "x" (Var "x")))) (Var "x")))
DownRight
A (A (Var "f") (L "f" (A (Var "f") (L "f" (L "x" (Var "x"))))))
  (A (A (Var "f") (L "f" (L "x" (Var "x")))) (Var "x"))
A (A (Var "f") (L "f" (L "x" (Var "x")))) (Var "x")
"replacing with"
A (Var "x") (Var "x")
"Result"
L "f" (L "x" (A (A (Var "f") (L "f" (A (Var "f") (L "f" (L "x" (Var "x"))))))
                (A (Var "x") (Var "x"))))


We should point out that we are given an arbitrary number of cursors
over the same data structure: for example, in `tz2', the cursor 'z1'
is still valid at the end, and still points out to the `third'
subterm. The cursor 'z2' is valid as well. All these cursors are
isolated: the updates done with the cursor 'z2' and invisible to
the cursor 'z1' (as we can see if we uncomment the last statement in
tz2). Essentially, each cursor runs in its own transaction. In the
second part we will discuss how to make cursors that see the updates
of each other. Contrary to the traditional database wisdom, for zippers,
it is far harder to implement lower isolation modes than the higher
ones.


More information about the Haskell mailing list