*Ross Paterson,
10th September 1999 (revised 22nd May 2000).*

John Hughes
has defined a new abstract view of computation,
in his paper *Generalising Monads to Arrows*
(old draft).
That paper uses a point-free style.
This proposal introduces a notation for arrows that resembles
that used for monads.
The semantics is defined by translation into the core language.
Familiarity with the arrows paper is assumed.
Comments welcome.

**New:**
A prototype implementation exists.

This proposal supersedes the initial version, which was a bit lighter, but was unable to cope with operators. There is a paper discussing the notation presented here, including what the arrow laws look like. This and other stuff on arrows may be found on my arrows page.

The proposal is laid out in three stages. The first stage is the largest (and most fully described); it would be useful by itself. The others add further convenience.

In general, the proposal introduces new things called *commands*,
with a similar syntax to expressions,
but whose meaning is given by a translation to quite different expressions.
Thus inside an arrow expression there are seemingly familiar things like
`\`, `if`, `case` and `do`,
with similar, but different, meaning to what you expect.
It's a matter of opinion how much of this `abuse of notation' is helpful.

The first level of sugar gives something similar to the raw monad syntax.
For example,
here is a reformulation of the interpreter from the arrows paper
(think of ``proc`' as a variant of lambda,
and ``-<`' (arrow tail) as a sort of application):

eval (Var s) = proc env -> returnA -< fromJust (lookup s env) eval (Add e1 e2) = proc env -> (eval e1 -< env) `bind` \ ~(Num u) -> (eval e2 -< env) `bind` \ ~(Num v) -> returnA -< Num (u + v) eval (If e1 e2 e3) = proc env -> (eval e1 -< env) `bind` \ ~(Bl b) -> if b then eval e2 -< env else eval e3 -< env eval (Lam x e) = proc env -> returnA -< Fun (proc v -> eval e -< (x,v):env) eval (App e1 e2) = proc env -> (eval e1 -< env) `bind` \ ~(Fun f) -> (eval e2 -< env) `bind` \ v -> f -< v

The above code makes use of the following (standard Haskell) definitions:

returnA :: Arrow a => a b b returnA = arr id bind :: Arrow a => a b c -> a (b,c) d -> a b d e `bind` f = returnA &&& e >>> f

New reserved words:
`proc`, `-<` and `form`.
(Can anyone think of better names?)

Add the productions

exp-> ... | procapat->cmdcmd->fexp-<aexp(Arrow application) | \apat..._{1}apat->_{n}cmd(kappa-abstraction) | formaexpcmd..._{1}cmd(control operator, n >= 0) |_{n}cmd_{1}qopcmd(short for `_{2}form(qop)cmd_{1}cmd') | let_{2}declsincmd(local definitions) | ifexpthencmdelsecmd| caseexpof {pat->_{1}cmd; ... ;_{1}pat->_{n}cmd[;] } | (_{n}cmd) (same as `cmd')

In contrast to monads,
it is necessary to treat conditionals specially,
as will become clear in the translation.
(I have omitted guards from the `case` for simplicity.
Adding them doesn't raise any extra issues.)

The above grammar is vague about precedence.
I would expect arrow application to bind more tightly than
`let` and conditionals,
but perhaps more loosely than infix control operators
and control operator application.

The following equations define a naive translation into current Haskell.
There is plenty of scope for improvement,
for example by assuming that `arr` preserves composition,
dead variable elimination, simplifying patterns and so on.

The first step is to eliminate head patterns that could fail. These are replaced by failure-free patterns (defined as in Haskell 1.4):

procp->c= arr (\x -> case x of {p-> Left (Flatten(p)); _ -> Right x }) >>> (procFlatten(p)-> c) ||| zeroArrow

where *Flatten(p)* is a failure-free pattern with the same irrefutable
subpatterns as *p*.
One way to do this is by replacing all non-unique constructors with tuples,
e.g. *Flatten(y:ys)* could be `(y,ys)`.

(Note that this translation requires an arrow in both `ArrowChoice`
and `ArrowZero`.
A Haskell-98-style definition would only eliminate the `ArrowZero`,
while changing the `Arrow` class.)

From here on, the head pattern is assumed to be failure-free. The translation rules are

procp->f-<e= arr (\p->e) >>>fifVars(f)andVars(p)disjoint = arr (\p-> (f,e)) >>> app otherwise procp-> \p..._{1}p->_{n}c= proc (...(p,p), ..._{1}p) ->_{n}cprocp-> formec..._{1}c=_{n}e(procp->c) ... (proc_{1}p->c) proc_{n}p-> letdeclsinc= arr (\ p -> letdeclsin (p,p')) >>>Vars(p') = Defvars(decls)(proc (p,p') ->c) procp-> ifethencelse_{1}c= arr (\_{2}p-> ifethen Leftpelse Rightp) >>> (procp->c) ||| (proc_{1}p->c) proc_{2}p-> caseeof {p->_{1}c; ... ;_{1}p->_{n}c} = arr (\_{n}p-> caseeofp-> Left (_{1}p,Flatten(p) ..._{1})p-> Right_{n-1}^{n-2}(Left (p,Flatten(p))_{n-1})p-> Right_{n}^{n-1}(p,Flatten(p)) >>> (proc (_{n})p,Flatten(p) ->_{1})c) ||| ... ||| (proc (_{1}p,Flatten(p) ->_{n})c)_{n}

Some points to note:

- The first translation of arrow application should be used if possible,
as the other one requires that the arrow belong to
`ArrowApply`. - Control operators operate on commands (not values) to define new commands.
- Unfortunately, variables defined by
`let`are monomorphic. (This can be fixed with a messy translation using first class polymorphism.) - The conditionals
`if`and`case`require that the arrow belong to`ArrowChoice`. - In the translation of
`case`, I have assumed that`|||`is right-associative.

A control operator expression (the *aexp* after `form`)
must not mention variables defined within
the `proc` expression, and must have type

a_{1}s_{1}t-> ..._{1}a_{n}s_{n}t->_{n}ast(n >= 0)

where

- the
*a*'s are arrows (possibly different), - the
*s*'s are either of the form`(`or the same type variable*s*,*t*)*b*, - none of the
*t*'s mention*b*.

(They should also have the corresponding naturality property, but that's the programmer's responsibility. All this is required to enable certain transformations, including the optimizations mentioned above.)

For example, `bind` above has a type of this form.
Some other examples are

(&&&) :: Arrow a => a b c -> a b d -> a b (c,d) zero :: ArrowZero a => a b c (<+>) :: ArrowPlus a => a b c -> a b c -> a b c loop :: ArrowLoop a => a (b,d) (c,d) -> a b c

Many more can be defined.

The above proposals create a syntax similar to the raw monadic style.
Indeed, the rules defining `do`-notation can be readily adapted
to this syntax, which making it a little more pleasant.
Then the interpreter example becomes

eval (Var s) = proc env -> returnA -< fromJust (lookup s env) eval (Add e1 e2) = proc env -> do ~(Num u) <- eval e1 -< env ~(Num v) <- eval e2 -< env returnA -< Num (u + v) eval (If e1 e2 e3) = proc env -> do ~(Bl b) <- eval e1 -< env if b then eval e2 -< env else eval e3 -< env eval (Lam x e) = proc env -> returnA -< Fun (proc v -> eval e -< (x,v):env) eval (App e1 e2) = proc env -> do ~(Fun f) <- eval e1 -< env v <- eval e2 -< env f -< v

Add another reserved word `rec` and the productions

cmd-> ... | do {stmt; ..._{1}stmt;_{n}cmd[;] }stmt->cmd|pat<-cmd| letdecls| rec {stmt; ..._{1}stmt[;] }_{n}

The translations of the first three are much the same as ordinary
`do`-notation:

do {c;rest} =c`bind` \ _ -> do {rest} do {p<-c;rest} =c`bind` \p-> do {rest} do { letdecls;rest} = letdeclsin do {rest}

The above rules yield commands, which are then translated using the previous rules.

The final kind of statement has no counterpart in ordinary
`do`-notation,
though such an extension was proposed in
O'Haskell.
It is translated by

do { rec {ss};rest} = loop (\p-> do {ss; returnA -< (p,p)}) `bind` \p-> do {rest}

where *p* is a pattern collecting the variables defined by *ss*.
(I'm not sure whether variables defined by `let` definitions should
be included here.)

The duplication of *p* in the above rule looks redundant,
but in practice a more efficient translation will send different tuples
of variables to the feedback and output.

John Hughes has suggested that one allow definitions like

f p_{1}-< q_{1}= c_{1}... f p_{n}-< q_{n}= c_{n}

(though he uses a different symbol than `-<`.

These could be translated as

f x = proc y -> case (x,y) of (p_{1}, q_{1}) -> c_{1}... (p_{n}, q_{n}) -> c_{n}

(The generalizations to more arguments, and to guards, are straightforward.)
If there were only one clause, and all patterns were failure-free,
the `case` could be avoided.
Otherwise, the arrow must belong to `ArrowChoice`.
(One might think that discriminating between the *p*'s
could be moved out of the arrow.
But this transformation would be invalid for some arrows,
in particular, for stream processors.)

John gave the following example:

(f ||| g) << Left x = f << x (f ||| g) << Right y = g << y

which is certainly a big improvement over

f ||| g = proc z -> case z of Left x -> f x Right y -> g y

On the other hand, if all the *q _{i}* were failure-free,
the following translation would probably be preferable:

f p_{1}= proc q_{1}-> c_{1}... f p_{n}= proc q_{n}-> c_{n}

This discontinuity is a bit unpleasant.

John further suggests that `proc`
be replaced by ``\ -<`',
but maybe that's going too far.