Unicode support

Dylan Thurston dpt@math.harvard.edu
Sat, 6 Oct 2001 00:47:39 +0900


On Sun, Sep 30, 2001 at 11:01:38AM -0700, John Meacham wrote:
> seeing as how the Haskell standard is horribly vague when it comes to
> character set encodings anyway, I would recommend that we just omit any
> reference to the bit size of Char, and just say abstractly that each
> Char represents one Unicode character, but the entire range of Unicode
> is not guaranteed to be expressible, which must be true, since Haskell
> 98 implementations can be written now, but Unicode can change in the
> future. The only range guaranteed to be expressible in any
> representation is the values 0-127 US ASCII (or perhaps Latin-1)

I agree about the vagueness, but I believe the Unicode consortium has
explicitly limited itself to 21 bits; if they turn out to have been
lying about that (which seems unlikely in this millennium), we can
hardly be blamed for believing them.  I think all that should be
required of implementations is that they support 21 bits.
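
To make the 21-bit figure concrete, here is a minimal sketch that
counts the bits needed for the highest Unicode code point, U+10FFFF,
and checks whether a given implementation's Char reaches that far.
The names maxCodePoint, bitsNeeded and charCoversUnicode are
hypothetical helpers made up for this illustration, and the sketch
assumes a reasonably recent base library (Data.Bits with
countLeadingZeros), not anything Haskell 98 itself promises.

```haskell
import Data.Bits (countLeadingZeros, finiteBitSize)
import Data.Char (ord)

-- Highest code point Unicode defines (U+10FFFF).
maxCodePoint :: Int
maxCodePoint = 0x10FFFF

-- Number of bits needed to write a non-negative Int in binary.
bitsNeeded :: Int -> Int
bitsNeeded n = finiteBitSize n - countLeadingZeros n

-- True when this implementation's Char reaches U+10FFFF.
-- This is an implementation-dependent check, not a language guarantee.
charCoversUnicode :: Bool
charCoversUnicode = ord (maxBound :: Char) >= maxCodePoint
```

In GHCi, bitsNeeded maxCodePoint evaluates to 21, since U+10FFFF lies
between 2^20 and 2^21. Whether charCoversUnicode holds depends on the
implementation, which is exactly the requirement proposed above.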

Best,
	Dylan Thurston