UTF-8 library

Ashley Yakeley ashley@semantic.org
Thu, 8 Aug 2002 02:59:33 -0700


At 2002-08-08 02:56, Patryk Zadarnowski wrote:

>1990 ANSI/ISO C requires chars to be at least 8 bits wide in section
>5.2.4.2.1. This extends to ANSI/ISO C++, which cites ISO C for its
>definition of <limits.h>. Haven't got C'99 handy, but it'll be a
>similar story - char is *at least* 8 bits wide.
>
>Hope that satisfies the pedants.

But there's no guarantee that char is exactly 8 bits wide, is there? So 
it's appropriate to have separate types, CChar/CSChar/CUChar and 
Word8/Int8.
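
A minimal sketch of that distinction, assuming a present-day GHC `base` (where both types have `FiniteBits` instances; the names `word8Bits` and `ccharBits` are made up for illustration). Word8 is fixed at 8 bits by definition, while CChar follows whatever the platform's C `char` is:

```haskell
import Data.Bits (finiteBitSize)
import Data.Word (Word8)
import Foreign.C.Types (CChar)

-- Word8 is exactly 8 bits wide on every platform.
word8Bits :: Int
word8Bits = finiteBitSize (0 :: Word8)

-- CChar mirrors the C compiler's char, so its width is the platform's
-- CHAR_BIT: at least 8 per ISO C, but not guaranteed to be exactly 8.
ccharBits :: Int
ccharBits = finiteBitSize (0 :: CChar)

main :: IO ()
main = do
  putStrLn ("Word8 bits: " ++ show word8Bits)
  putStrLn ("CChar bits: " ++ show ccharBits)
```

On common platforms both numbers come out the same, which is exactly why the difference is easy to overlook.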

I submit that interfaces to network handles should all use Word8, because 
the standards all speak unambiguously of "octets". For files it is 
perhaps debatable, but I lean strongly towards Word8 for any "binary" 
(i.e. direct) file access. CChar/CSChar/CUChar are for FFI to C and C++.
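
As a rough sketch of what an octet-based handle interface could look like, assuming the `hPutBuf`/`hGetBuf` primitives from `System.IO` in current `base`: the helper names `hPutOctets` and `hGetOctets` are hypothetical, not an existing library API, and the `main` below only round-trips a few octets through a temporary file.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Array (allocaArray, peekArray, withArrayLen)
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- Hypothetical helper: write a list of octets to a handle.
-- A Word8 occupies one byte, so the element count is also the byte count.
hPutOctets :: Handle -> [Word8] -> IO ()
hPutOctets h octets =
  withArrayLen octets $ \len ptr -> hPutBuf h ptr len

-- Hypothetical helper: read up to n octets from a handle.
-- Fewer than n octets are returned if end of file is reached first.
hGetOctets :: Handle -> Int -> IO [Word8]
hGetOctets h n =
  allocaArray n $ \ptr -> do
    got <- hGetBuf h ptr n
    peekArray got ptr

main :: IO ()
main = do
  dir <- getTemporaryDirectory
  (path, h) <- openBinaryTempFile dir "octets.bin"
  hPutOctets h [0x48, 0x69, 0xFF]
  hSeek h AbsoluteSeek 0          -- rewind to read back what was written
  octets <- hGetOctets h 16
  hClose h
  removeFile path
  print octets
```

Nothing here mentions CChar at all; that type would only appear at the point where the buffer is handed to a C function through the FFI.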

-- 
Ashley Yakeley, Seattle WA