---> being mislexed at the start of a line?

Simon Marlow simonmar at microsoft.com
Tue Feb 24 10:31:02 EST 2004


> Now, this should be lexed as a varsym, and looking through
> compiler/parser/Lexer.x, you see the single line comment parsing code:
> 
> line 111:
>         "--"\-* ([^$symbol] .*)?        ;
> 
> The associated comment:
>         The regex says: "munch all the characters after the dashes, as
>         long as the first one is not a symbol".
> 
> But I read the whole regex as:
>         "Munch dashes. Also munch all characters after the dashes, as
>         long as the first one is not a symbol".
> 
> There's the bug, I think. So none of these will work either: --> --<
> -->=  etc. So the regex might need to explicitly match varsyms, and
> rule them out first. A quick hack:
> 
>         "--"\-* $symbol+        { goto varsym code }
>         "--"\-* .*              ;

Remember that Alex always picks the longest match.  So the current regex
should be correct:

         "--"\-* ([^$symbol] .*)?        ;

If the dashes are followed by a symbol, the lexeme is matched by the
varsym rule instead, because that rule matches more of the input than
this regex, which matches only the dashes.
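
To make the longest-match argument concrete, here is a minimal Haskell
sketch. It is not the code Alex generates: the two rules are modelled by
hand, the symbol set is an assumed approximation of $symbol, and the
names (isSym, commentLen, varsymLen, lexOne) are made up for
illustration. Ties go to the earlier rule, as in Alex.

```haskell
data Token = Comment | VarSym deriving (Eq, Show)

-- Assumed approximation of the lexer's $symbol character set.
isSym :: Char -> Bool
isSym c = c `elem` "!#$%&*+./<=>?@\\^|-~:"

-- Length matched by  "--"\-* ([^$symbol] .*)?  or Nothing if no match.
commentLen :: String -> Maybe Int
commentLen s
  | length dashes < 2 = Nothing
  | otherwise = case rest of
      (c:_) | not (isSym c) && c /= '\n' ->
                Just (length dashes + length (takeWhile (/= '\n') rest))
      _ -> Just (length dashes)   -- optional part fails: dashes only
  where (dashes, rest) = span (== '-') s

-- Length matched by a simplified varsym rule, $symbol+ .
varsymLen :: String -> Maybe Int
varsymLen s = case length (takeWhile isSym s) of
  0 -> Nothing
  n -> Just n

-- Pick the longest match; on a tie the earlier rule (Comment) wins.
lexOne :: String -> Maybe (Token, Int)
lexOne s = foldl better Nothing
  [ (t, n) | (t, Just n) <- [(Comment, commentLen s), (VarSym, varsymLen s)] ]
  where
    better Nothing x = Just x
    better (Just (t, n)) (t', n')
      | n' > n    = Just (t', n')
      | otherwise = Just (t, n)
```

With these definitions, lexOne "--> x" gives Just (VarSym,3), because
the comment rule matches only the two dashes, while lexOne "-- hello"
gives Just (Comment,8).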

The problem is that the varsym rule isn't always enabled: we might be
in one of the other lexer states (bol or layout), where it doesn't
apply.  This is why the bug only happens at the beginning of a line.

I'm working on a fix now.

Cheers,
	Simon

