UTF-8 library

Manuel M T Chakravarty chak@cse.unsw.edu.au
Thu, 08 Aug 2002 22:18:32 +1000 (EST)


Ashley Yakeley <ashley@semantic.org> wrote,

> At 2002-08-08 02:28, Manuel M T Chakravarty wrote:
> 
> >ANSI C guarantees that char is 1 byte (more precisely that
> >"sizeof (char)" == 1).
> 
> That's also what the C++ ARM says (which I have to hand). Unfortunately, 
> 
>     "a byte is undefined by the language except in terms of 
>     sizeof; sizeof(char) is 1." [sec. 5.3.2]
> 
> Maybe ANSI C is different?

As I understand it, ANSI C leaves it to the implementation
whether plain "char" is signed or unsigned.  The standard
also requires CHAR_BIT in <limits.h> to be at least 8, so a
char has 8 bits on most platforms, but 8 is a minimum and
not a guaranteed exact width.
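
For anyone who wants to check this on their own compiler, here
is a minimal sketch, not a definitive test.  The function names
are made up for illustration; only sizeof, CHAR_BIT and CHAR_MIN
come from the language and <limits.h>.  The standard promises
CHAR_BIT >= 8, so the code reports the actual value instead of
assuming 8.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: number of bits in one char (one byte in C terms). */
int char_bits(void)
{
    return CHAR_BIT;
}

/* Hypothetical helper: 1 if plain char is signed here, 0 if unsigned. */
int char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: print what this implementation chose. */
void report_char_properties(FILE *out)
{
    /* sizeof(char) is 1 by definition, whatever the bit width is. */
    assert(sizeof(char) == 1);

    fprintf(out, "sizeof(char) = %lu\n", (unsigned long) sizeof(char));
    fprintf(out, "CHAR_BIT     = %d\n", char_bits());
    fprintf(out, "plain char   = %s\n",
            char_is_signed() ? "signed" : "unsigned");
}
```

Calling report_char_properties(stdout) from main prints the three
values.  Code that handles UTF-8 octets can use "unsigned char"
and, if it really depends on 8-bit bytes, check CHAR_BIT == 8 up
front.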

Manuel