[Haskell-cafe] Fwd: Semantics of iteratees, enumerators, enumeratees?

C. McCann cam at uptoisomorphism.net
Tue Aug 24 03:54:27 EDT 2010


On Mon, Aug 23, 2010 at 11:41 PM, wren ng thornton <wren at freegeek.org> wrote:
> I believe the denotation of an iteratee is the transition function for an
> automaton (or rather a transducer). I hesitate to speculate on the specific
> kind of automaton without thinking about it, so maybe finite, maybe
> deterministic, but then again maybe not.

An iteratee is indeed an automaton, specifically one in an unknown
(but non-terminal) state. Consider the types in the "iteratee"
package:

    newtype IterateeG c el m a = IterateeG (StreamG c el -> m (IterGV c el m a))

    data IterGV c el m a = Done a (StreamG c el)
					 | Cont (IterateeG c el m a) (Maybe ErrMsg)

    data StreamG c el = EOF (Maybe ErrMsg) | Chunk (c el)

Abbreviating a bit and inlining the auxiliary data types:

    Iteratee c e m a = Iteratee
        ( Either (Maybe ErrMsg) (c e)
          -> m ( Either (a, Either (Maybe ErrMsg) (c e))
                        (Iteratee c e m a, Maybe ErrMsg) ) )

Although the "stream" bits--which actually represent a single chunk of
input--are self-contained, so it might clarify things to parameterize
the iteratee over the entire chunk type, subsuming the "c" and "e"
parameters:

    Iteratee s m a = Iteratee (s -> m (Either (a, s) (Iteratee s m a,
Maybe ErrMsg)))

In practice you wouldn't want to do that because you want the "c" and
"e" parameters to be readily available. Perhaps a type family would
make more sense here for the type now called "s"? There's also the
matter of errors in the input stream, but that doesn't really impact
the underlying structure in a significant way.

We have a type parameter "m :: * -> *", which sounds suspiciously like
an intended monad. It's wrapping an Either value, which amounts to
just EitherT, like the one in category-extras.

    Iteratee s m a = Iteratee (s -> EitherT (a, s) m (Iteratee s m a,
Maybe ErrMsg))

A function to a monadic value is just a Kleisli arrow.

    Iteratee s m a = Iteratee (Kleisli (EitherT (a, s) m)) s (Iteratee
s m a, Maybe ErrMsg)

Which sets things up to use the Automaton transformer in the "arrows" package.

    Iteratee s m a = Automaton (Kleisli (EitherT (a, s) m)) s (Maybe ErrMsg)

The Automaton type describes a Mealy-style stream transducer where the
underlying arrow combines the transition function and state, and the
input and output to the arrow are the per-step input and output of the
automaton.

The iteratee automaton here produces only a stream of (Maybe ErrMsg)
as output, so it really isn't much of a transducer. EitherT describes
a computation that can be cut short, which in this case essentially
augments the automaton with an explicit "halt" state.

So, we have:

- An Iteratee describes a running state machine, paused at an outgoing
transition, awaiting another chunk of input.
- After receiving input, the Iteratee does one of two things:
    - Halt, returning unused input and a final result value.
    - Return an action in the underlying monad, containing the
post-transition state machine and an optional error message.
- The Iteratee type is parameterized by three types: a single chunk of
input, an underlying monad, and a final result value; generic iteratee
functions are thus independent of any of those.

The "Enumerator" types are the other half of the system: an arbitrary
data source that sits there and turns the crank, feeding in chunks of
data, until it decides to stop.

The types in the "enumerator" package follow almost the same scheme,
but with things rotated around a little bit: What it calls Iteratee is
a monadic action, representing a state machine paused at an ingoing
transition, which will yield either an outgoing transition function, a
halting state with a final result, or an error.

What sets an iteratee-style design apart from something conventional
based on a State monad is that the iteratee conceals its internal
state completely (in fact, there's no reason an iteratee even has to
be the "same" function step-to-step, or have a single consistent
"state" type--almost has an existential flavor, really), but is at
another function's mercy when it comes to actually doing anything.

All of which doesn't really shed too much light on the denotation of
these things, I suppose, as there's barely anything there to talk
about; the iteratee automaton itself is a terribly simple construct,
relying on an underlying monad to perform actions, on an external
"push" data source to recurse, and being given only bite-size chunks
of data at each step. It's little more than foldl with a "pause"
button attached.

- C.


More information about the Haskell-Cafe mailing list