[Haskell-cafe] Monad instance for Data.Set

Sun Mar 30 21:08:06 EDT 2008

On Sun, Mar 30, 2008 at 1:09 PM, Henning Thielemann
<lemming at henning-thielemann.de> wrote:
>  It's like working in the List monad mainly, collapsing duplicates from
>  time to time, right?

Sort of.  You can look at it that way and get a basic understanding of
what's going on.

A slightly more accurate analysis of what is going on is that it is
working in ContT Set for a variation of ContT that doesn't require the
underlying object to be a full monad, but only a restricted one.

In such a monad you could define
> mplus :: ContT Set a -> ContT Set a -> ContT Set a
> mplus x y = lift $ union (runContT x id) (runContT y id)
(not valid haskell code)

However, what is actually happening is that we are defining a set of
"side-effectful" computations using Prompt, and then observing those
computations in a "Set" environment.  With this definition you can
actually implement the interface for any monad you want; just define
the operations in your data type.  In this case:

> data OrdP m a where
>    PZero :: OrdP m a
>    PRestrict :: Ord a => m a -> OrdP m a
>    PPlus :: Ord a => m a -> m a -> OrdP m a

> type SetM = RecPrompt OrdP

Every monad provides at least the same operations as the Identity
monad; this definition says that SetM is a monad that provides those
operations, plus three additional operations: "prompt PZero", "prompt
$ PRestrict x", and "prompt $ PPlus x y" of the types shown in the
definition of "OrdP".

You can then interpret those operations however you want; runSetM
defines an observation function that runs the computation and returns
its results in a Set, given the restriction that the computation
itself returns an Ord type.

In order to really understand this, you need to understand the type of
"runPromptC":

runPromptC ::
    (r -> ans)  -- "pure" result handler
    -> (forall a. p a -> (a -> ans) -> ans) -- "side effect" handler
that gets a continuation
    -> Prompt p r -- computation to run
    -> ans

"runPromptC" is (almost) just the case operation for a structure of this type:

data Prompt p r =
    Return r
    | forall a. BindEffect (p a) (a -> Prompt p r)

except with the recursive call to runPromptC inlined within
BindEffect; given this data type you can define runPromptC easily:

runPromptC ret _ (Return r) = ret r
runPromptC _ prm (BindEffect p k) = prm p (\a -> runPromptC ret prm (k a))

This definition makes it obvious that the "pure" continuation "ret" is
called at the end of the computation, and the "effectful" continuation
prm is called to handle any side effects.

Exercise 1: Define the function "prompt :: p a -> Prompt p a" on this datatype.
Exercise 2: Define an instance of Monad for this datatype.

Now you should be able to understand the observation function "runSetM":

> runSetM :: Ord r => SetM r -> S.Set r
> runSetM = runPromptC ret prm . unRecPrompt where
>    -- ret :: r -> S.Set r
>    ret = S.singleton
>    -- prm :: forall a. OrdP SetM a -> (a -> S.Set r) -> S.Set r
>    prm PZero _ = S.empty
>    prm (PRestrict m) k = unionMap k (runSetM m)
>    prm (PPlus m1 m2) k = unionMap k (runSetM m1 `S.union` runSetM m2)

"ret" handles the result of pure computations; that is, those that
could have just as easily run in the Identity monad.  "prm" handles
any effects; in this case the three effects "PZero", "PRestrict" and
"PPlus".

You could write a different observation/interpretation function that
treated the elements as a List, or Maybe, or whatever.

Let me know if this makes sense, or if you have any other questions.

  -- ryan