UTF-8 library

Fergus Henderson fjh@cs.mu.OZ.AU
Fri, 9 Aug 2002 18:12:47 +1000


On 06-Aug-2002, George Russell <ger@tzi.de> wrote:
> 
> Converting CStrings to [Word8] is probably a bad idea anyway, since there is
> absolutely no reason to assume a C character will be only 8 bits long, and
> under some implementations it isn't. 

That's true in general; the C standard only guarantees that a C character
will be at least 8 bits long.

But Posix now guarantees that C's `char' is exactly 8 bits (that is,
CHAR_BIT is required to be 8).

Posix hasn't taken over the world yet, and doesn't look like doing so
in the near future.  So Haskell should not limit itself to being only
implementable on Posix systems.  However, systems which don't have 8-bit
bytes are getting very rare nowadays -- it might well be reasonable
for Haskell, like Posix, to limit itself to only being implementable
on systems where C's `char' is exactly 8 bits.
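
As a rough illustration of what such a restriction could look like on the
C side of an FFI binding, here is a minimal sketch.  It assumes only the
standard <limits.h> macros CHAR_BIT and UCHAR_MAX; the helper name
char_is_octet is made up for this example and is not part of any existing
library.

```c
#include <limits.h>   /* CHAR_BIT, UCHAR_MAX */

/* Compile-time guard: refuse to build on a platform where `char' is
   not exactly 8 bits wide.  The C standard only promises
   CHAR_BIT >= 8, so this check is not vacuous in general. */
#if CHAR_BIT != 8
#error "this code assumes that char is exactly 8 bits (CHAR_BIT == 8)"
#endif

/* Hypothetical helper: returns 1 if every unsigned char value fits in
   an octet, i.e. if treating a C string as a sequence of 8-bit words
   loses no information; returns 0 otherwise. */
int char_is_octet(void)
{
    return CHAR_BIT == 8 && UCHAR_MAX == 255;
}
```

With the guard in place the helper can only ever return 1; it is kept
separate so that code which drops the #if can still make the same check
at run time.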

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.