darcs patch: Add UTF8 converting and outputing functions

Ross Paterson ross at soi.city.ac.uk
Sun Oct 22 18:50:46 EDT 2006


On Sun, Oct 15, 2006 at 01:56:07AM +0900, mukai at jmuk.org wrote:
> In GHC 6.6, the source code can include UTF-8 characters and converted
> them to Unicode chars.  However, there are no (easy) way to convert
> them from/to UTF-8 in standard libraries of GHC.

As others have pointed out, the conversion functions are not GHC-specific:

	charToUTF8Chars :: Char -> [Word8]
	toUTF8String :: String -> [Word8]
	fromUTF8String :: [Word8] -> String

The fromUTF8String side probably also needs a way to report illegal
and incomplete encodings.
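
To make the error-reporting point concrete, here is a minimal sketch of the three functions, with the decoder returning an Either instead of a bare String. The UTF8Error type, its constructor names, and the Either result are assumptions made for illustration; they are not part of the submitted patch. The byte offsets in the errors refer to the start of the offending sequence.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical error type: not part of the patch under discussion.
data UTF8Error
  = IllegalEncoding Int     -- offset of a malformed or overlong sequence
  | IncompleteEncoding Int  -- offset of a sequence cut off by end of input
  deriving (Eq, Show)

-- Encode one character as 1 to 4 bytes.
-- Note: this does not reject surrogate code points (U+D800..U+DFFF).
charToUTF8Chars :: Char -> [Word8]
charToUTF8Chars c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- leading payload bits, to be combined with the length marker
    hi :: Int -> Word8
    hi s = fromIntegral (n `shiftR` s)
    -- continuation byte carrying six payload bits
    cont :: Int -> Word8
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

toUTF8String :: String -> [Word8]
toUTF8String = concatMap charToUTF8Chars

-- Decode, reporting the first problem found instead of failing silently.
fromUTF8String :: [Word8] -> Either UTF8Error String
fromUTF8String = go 0
  where
    go :: Int -> [Word8] -> Either UTF8Error String
    go _ [] = Right []
    go off (b : bs)
      | b < 0x80           = (chr (fromIntegral b) :) <$> go (off + 1) bs
      | b .&. 0xE0 == 0xC0 = multi 1 0x80 (fromIntegral (b .&. 0x1F))
      | b .&. 0xF0 == 0xE0 = multi 2 0x800 (fromIntegral (b .&. 0x0F))
      | b .&. 0xF8 == 0xF0 = multi 3 0x10000 (fromIntegral (b .&. 0x07))
      | otherwise          = Left (IllegalEncoding off)
      where
        -- k continuation bytes expected; minCode rejects overlong forms
        multi :: Int -> Int -> Int -> Either UTF8Error String
        multi k minCode lead
          | any (\x -> x .&. 0xC0 /= 0x80) cs = Left (IllegalEncoding off)
          | length cs < k                     = Left (IncompleteEncoding off)
          | code < minCode || code > 0x10FFFF = Left (IllegalEncoding off)
          | code >= 0xD800 && code <= 0xDFFF  = Left (IllegalEncoding off)
          | otherwise = (chr code :) <$> go (off + k + 1) rest
          where
            (cs, rest) = splitAt k bs
            code = foldl (\acc x -> (acc `shiftL` 6) .|. fromIntegral (x .&. 0x3F))
                         lead cs
```

Whether the error should be an Either, a Maybe, or a replacement character is exactly the kind of design question that would need to be settled; the sketch only shows that illegal and incomplete input can be told apart.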

As for the I/O part, your implementation assumes that hPutChar writes a
byte to a Handle.  That is currently the case in GHC, but it is arguably
a bug, and it is not the case in Hugs and Jhc.  I think we need to work
out a plan for Unicode I/O in Haskell, and then work towards that.
For the current state, see

http://haskell.galois.com/cgi-bin/haskell-prime/trac.cgi/wiki/CharAsUnicode
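
One way to avoid depending on what hPutChar does is to write the already-encoded bytes through a byte-oriented interface. The following is only a sketch under that assumption: it uses Data.ByteString, and the helper names hPutBytes and writeBytesFile are hypothetical, not taken from the patch.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)
import System.IO (Handle, IOMode (WriteMode), withFile)

-- Hypothetical helper: write already-encoded bytes to a handle.
-- B.hPut writes the bytes as they are, so the result does not depend
-- on how the implementation maps Char to bytes in hPutChar.
hPutBytes :: Handle -> [Word8] -> IO ()
hPutBytes h = B.hPut h . B.pack

-- Hypothetical convenience wrapper: write the bytes to a file.
writeBytesFile :: FilePath -> [Word8] -> IO ()
writeBytesFile path bytes = withFile path WriteMode (\h -> hPutBytes h bytes)
```

For example, hPutBytes h [0xE2, 0x82, 0xAC] would emit the three bytes of the UTF-8 encoding of U+20AC, whatever the implementation.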


