[newbie] UTF-8

Marcin 'Qrczak' Kowalczyk qrczak@knm.org.pl
Mon, 11 Aug 2003 08:41:53 +0200


On Mon, 11 August 2003 00:49, Wolfgang Jeltsch wrote:

> The main problem is that you need binary I/O. Haskell 98 only provides text
> I/O.

You don't need binary I/O for UTF-8 for now: because current implementations
treat text I/O as ISO-8859-1, each UTF-8 octet can be faked as a character
with (chr . fromIntegral).
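
A minimal sketch of that trick, assuming the output handle writes each
character in the range '\0'..'\255' as a single byte (the ISO-8859-1
behaviour described above). The names encodeChar and toFakeString are made
up for illustration. With a current GHC the default encoding follows the
locale, so the sketch selects Latin-1 explicitly with hSetEncoding:

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)
import System.IO (hSetEncoding, latin1, stdout)

-- Encode one code point as UTF-8 octets.
-- Sketch only: surrogate code points are not rejected.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [lead 0xC0 6, cont 0]
  | n < 0x10000 = [lead 0xE0 12, cont 6, cont 0]
  | otherwise   = [lead 0xF0 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading octet: length tag plus the topmost payload bits.
    lead tag s = fromIntegral (tag .|. (n `shiftR` s))
    -- Continuation octet: 10xxxxxx carrying six payload bits.
    cont s = fromIntegral (0x80 .|. ((n `shiftR` s) .&. 0x3F))

-- Fake every octet as a Char so that ordinary text I/O can carry it.
toFakeString :: String -> String
toFakeString = map (chr . fromIntegral) . concatMap encodeChar

main :: IO ()
main = do
  -- Assumption: the handle must map Chars 0..255 to bytes one-to-one.
  hSetEncoding stdout latin1
  putStr (toFakeString "caf\233 \8364\n")
```

Reading works the same way in reverse: map (fromIntegral . ord) over the
input string and decode the resulting octets.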

> The other point with text I/O is that under Windows the EOF character ^Z is
> treated specially and a conversion between Windows EOLs (^M^J) and Haskell
> EOLs (^J) takes place.

UTF-8 preserves ASCII and never uses ASCII bytes to encode non-ASCII
characters, so the situation is the same as with other ASCII-compatible
encodings and text mode is usually fine. It would not be OK for UTF-16,
whose code units can contain those bytes.
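
To illustrate the UTF-16 problem, here is a small sketch that looks for the
octets text mode may treat specially (LF, CR and ^Z) inside the UTF-16BE
form of a non-ASCII character. The names utf16be, special and collides are
hypothetical, and the encoder only handles characters below U+10000:

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (ord)
import Data.Word (Word8)

-- UTF-16BE octets of a BMP character (no surrogate pairs in this sketch).
utf16be :: Char -> [Word8]
utf16be c = [fromIntegral (n `shiftR` 8), fromIntegral (n .&. 0xFF)]
  where n = ord c

-- Octets that text mode may interpret: LF (^J), CR (^M), ^Z.
special :: [Word8]
special = [0x0A, 0x0D, 0x1A]

-- True if a non-ASCII character produces one of those octets.
collides :: Char -> Bool
collides c = ord c >= 0x80 && any (`elem` special) (utf16be c)

main :: IO ()
main = print (map collides "\266\233")  -- U+010A and U+00E9
```

U+010A becomes the octets 01 0A, so a text-mode handle could mistake its
second octet for a line feed. In UTF-8 every octet of a non-ASCII character
is 0x80 or above, so no such collision can occur.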

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/