[Haskell-cafe] Parsers are monadic?

Sat Jun 30 13:11:51 EDT 2007

> Have you used Parsec? 

i read about it when it came out, but i've always defined my own 
combinators. in case you wonder, there are two reasons for this: 

(a) the approximation of parsers as monads is close enough that a simple 

    type Parser m a = StateT String m a

    gives us the basic combinators (sequence,success,failure,alternative) for free.

(b) i like my combinator grammars to be reversible, so that a single grammar
    specification can be used for both parsing and unparsing/pretty-printing. 
    that means i have to define the details myself anyway.

about the only thing that spoils this nice setup is error handling. i've done 
some of the things that parsec does myself, such as labeling productions,
keeping positions and unexpected input, and merging of error info from 
branches, but it just doesn't seem to fit easily: what starts out as an 
elegant reuse of Monad/MonadPlus/StateT soon looks rather involved 
(compare also the implementation description in the parsec paper). 
parsec is a bit more systematic about these things, so it is easier to 
see where it is headed: a three-valued logic. 

if one has successful parses, "successful" errors, and failure to produce 
either as the third outcome, then one can try to merge the success and 
failure continuations into the single continuation provided by the monadic 
(>>=). still not very natural/simple, as one always has to deal with both
continuations at once, the same way, but perhaps workable.

> Negation pretty much just works. 

please keep in mind that i was asking for general monadic negation. 
if one introduces a bias towards the first successful parse (as parsec
does, apart from longest-match), one can model negation as failure:

    not m = (m >> return False) `mplus` return True

this only uses Monad and MonadPlus, but it only works as expected
for some MonadPlus (those which commit locally to the first success, 
instead of collecting multiple successes, such as lists, or searching for 
global success, such as amb, or longest-match parsers). 

alternatively, one could require an overloaded 'first' method, selecting 
the first successful branch in an ordered collection of solutions:

    not' m = first $ (m >> return False) `mplus` return True

but that wouldn't work for unordered collection monads.

in other words, one can define monadic negation for some specific
monads, but not for all monads. but coding implementation knowledge
about any specific monad into the source makes the code less general
than it could be. 

so i was wondering whether there should be a class such as 
MonadChoice (committed choice instead of collections; every
MonadChoice gives a MonadPlus, but not vice-versa), or 
MonadFirst (giving access to the first success), or MonadNegate
(providing monadic not by whatever means). then code depending
on monadic negation could still be reasonably general (not working 
with arbitrary monads, but also not restricted to a specific one).

btw, in an approach with two continuations, negation is as simple 
as switching the success and failure continuations. but the merging
of success and failure branches is perhaps no less involved than
what one gets when implementing the two-kinds-of-success 
approach. 

i was just hoping that someone had come up with something more 
elegant, and i was wondering whether there were other issues that 
people have to work around again and again.

claus