H98: unicode

Simon Peyton-Jones [email protected]
Fri, 31 Aug 2001 01:21:40 -0700


Thanks to Mark and Marcin for reminding me of one other H98 point.
The proposal is:

     In the lexical syntax for 'symbol', replace

        symbol  -> ascSymbol | uniSymbol

     by

        symbol  -> ascSymbol | uniSymbol_<special | _ | : | " | '>

The reasoning is in the message below.  Marcin agreed that the change is
correct.  I'll adopt it unless anyone yells.


-----Original Message-----
From: [email protected] [mailto:[email protected]] On
Behalf Of Mark P Jones
Sent: 23 July 2001 19:24
To: [email protected]
Cc: Marcin 'Qrczak' Kowalczyk; Mark P. Jones
Subject: Picky details about Unicode (was RE: Haskell 98 Report possible
errors, part one)

| 2.2. Identifiers can use small and large Unicode letters ...

If we're picking on the report's handling of Unicode, here's another
minor quibble to add to the list.  In describing the lexical syntax of
operator symbols, the report uses:

   varsym    -> (symbol {symbol | :})_<reservedop>
   symbol    -> ascSymbol | uniSymbol
   uniSymbol -> any Unicode symbol or punctuation

The last line seems to include more characters than I'd expect.

  ()[]{}  are punctuation (Unicode type Pe, Ps)
  `       is a symbol, modifier (Unicode type Sk)
  "':;,   are punctuation, other (Unicode type Po)
  _       is punctuation, connector (Unicode type Pc)
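
The classification above can be checked mechanically. The following is only a sketch, assuming GHC's Data.Char as the source of Unicode data; the names quibbleChars and categories are made up for illustration and appear nowhere in the report.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- The characters singled out above (hypothetical name, illustration only).
quibbleChars :: String
quibbleChars = "()[]{}`\"':;,_"

-- Pair each character with its Unicode general category, as reported by
-- Data.Char.generalCategory.
categories :: [(Char, GeneralCategory)]
categories = [(c, generalCategory c) | c <- quibbleChars]
```

Loading this in GHCi and evaluating `mapM_ print categories` lists one character per line with its category; the constructors OpenPunctuation, ClosePunctuation, ModifierSymbol, OtherPunctuation and ConnectorPunctuation correspond to Ps, Pe, Sk, Po and Pc.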

And, so, if I read the report correctly, I should be able to define :-)
as a consym and `div`, [], and "hello" as varsyms! (Not to mention some
altogether more bizarre choices!)

I guess the intention here is that:

  symbol  -> ascSymbol | uniSymbol_<special | _ | : | " | '>

In fact, since all the characters in ascSymbol are either punctuation or
symbols in Unicode, the inclusion of ascSymbol is redundant, and a
better specification might be:

  symbol  -> uniSymbol_<special | _ | : | " | '>
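
As a rough illustration of the production just given, here is one way it might be written as a character predicate. This is a sketch under assumptions: it takes Data.Char.isSymbol and Data.Char.isPunctuation as the meaning of "any Unicode symbol or punctuation", and the names isSpecial, isUniSymbol, isProposedSymbol and ascSymbols are invented for the example, not taken from the report or any implementation.

```haskell
import Data.Char (isPunctuation, isSymbol)

-- 'special' from the Haskell 98 lexical syntax: ( ) , ; [ ] ` { }
isSpecial :: Char -> Bool
isSpecial c = c `elem` "(),;[]`{}"

-- uniSymbol: any Unicode symbol or punctuation (assumed reading).
isUniSymbol :: Char -> Bool
isUniSymbol c = isSymbol c || isPunctuation c

-- symbol -> uniSymbol_<special | _ | : | " | '>
isProposedSymbol :: Char -> Bool
isProposedSymbol c =
  isUniSymbol c && not (isSpecial c) && c `notElem` "_:\"'"

-- The ascSymbol characters from the report, used to check the claim that
-- listing ascSymbol separately is redundant.
ascSymbols :: String
ascSymbols = "!#$%&*+./<=>?@\\^|-~"
```

In GHCi, `all isProposedSymbol ascSymbols` tests the redundancy claim, and `filter isProposedSymbol "()[]{}`\"':;,_"` shows which of the problem characters, if any, would still be accepted.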

All the best,

P.S.  A caveat: I'm not a Unicode expert!  Perhaps Marcin can advise ...
