Happy and Macros (was Re: ANNOUNCE: Happy 1.10 released)

Brian Boutel [email protected]
Sat, 12 May 2001 16:57:34 +1200


"Carl R. Witty" wrote:
> 
> "Manuel M. T. Chakravarty" <[email protected]> writes:
> 
> > I don't think, the point is the test for non-ambiguity.  At
> > least, Doitse's and my self-optimising parser combinator
> > library will detect that a grammar is ambigious when you
> > parse a sentence involving the ambiguous productions.  So,
> > you can check that by parsing a file involving all grammar
> > constructs of the language.
> 
> Sorry, doesn't work.  Where do you get this "file involving all
> grammar constructs of the language"?
> 
> If such an approach worked, you could use it to determine whether an
> arbitrary context-free grammar was ambiguous; but this problem is
> undecidable.
> 

This illustrates the difference between generality and usefulness.

Often, one is less interested in learning that a grammar is ambiguous
than learning that it is not. 

If you have a parser generator for a class of grammars, it will tell you
if (and probably why) a candidate grammar you feed to it is not a member
of that class. If it is accepted by, for example, a parser generator for
LR(1) grammars, then it is a DCFG and therefore unambiguous.

If you start with a "natural" grammar for the language, i.e. one that
corresponds to the abstract syntax, and use a generator for a broad
class of grammars (e.g LR(1)) then failure will give a good hint that
the only way to get an unambiguous grammar in that class is to restrict
the language, and you can use that information to make design decisions.

For example, although Haskell has both 'let' and 'where' for introducing
local definitions, thereby reflecting the typical committee decision
when there are varying stylistic preferences, 'where' is not part of the
expression syntax. If you write a grammar which includes the "natural"
productions

exp -> let defs in exp
exp -> exp where defs

and put this through a suitable generator, you will see why not.

Of course, it is possible that an unambiguous grammar will fail to be
LR(1) - by being non-deterministic, so the proper conclusion is that "G
is ambiguous or non-deterministic". But that is close enough for most
purposes.

Early versions of Hope had both 'let' and 'where' as expressions, and
had three different forms of condtional. Most combinations of these
constructs would interract to create ambiguity. The hand-coded parsers
in use had not picked this up. That shows the advantage of using a
generator - you get a constructive proof that the grammar is in the
desired class.

--brian