[Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

Malte Milatz malte at gmx-topmail.de
Sun Jul 8 10:38:19 EDT 2007


Tillmann Rendel:
> As I understand it (which may or may not be correct):
> 
> A normal Haskell string is basically  [Word8]

Hm, let's see whether I understand it better or worse.  Actually it is
[Char], and Char is a Unicode code point in the range 0..1114111 (at
least in GHC).  Compare:

	Prelude Data.Word> fromEnum (maxBound :: Char)
	1114111
	Prelude Data.Word> fromEnum (maxBound :: Word8)
	255

So it seems that the Char type abstracts the encoding away.  I'm
actually a little confused by this, because I haven't found any means to
make the I/O functions of the Prelude (getContents etc.) encoding-aware:
The string "ä", when read from a UTF-8-encoded file via readFile, has a
length of 2, so each of the two UTF-8 bytes (0xC3 0xA4) apparently ends
up as a Char of its own.  Anyone with a URI to enlighten me?
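
To make the length-2 observation concrete, here is a minimal sketch of
a decoding step done by hand.  It assumes that readFile delivers one
Char per byte, as the behaviour above suggests.  The names decodeUtf8
and fromByteChars are hypothetical helpers written for this
illustration, not library functions, and the error handling is
deliberately crude: anything malformed becomes U+FFFD, and overlong
encodings are not rejected.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper: decode a list of UTF-8 bytes into Chars.
-- Malformed or truncated sequences yield U+FFFD for the offending
-- lead byte; decoding then resumes with the byte that follows it.
decodeUtf8 :: [Word8] -> String
decodeUtf8 [] = []
decodeUtf8 (b:bs)
  | b < 0x80              = chr (fromIntegral b) : decodeUtf8 bs
  | b >= 0xC0 && b < 0xE0 = multi 1 (fromIntegral b .&. 0x1F)
  | b >= 0xE0 && b < 0xF0 = multi 2 (fromIntegral b .&. 0x0F)
  | b >= 0xF0 && b < 0xF8 = multi 3 (fromIntegral b .&. 0x07)
  | otherwise             = '\xFFFD' : decodeUtf8 bs
  where
    -- n continuation bytes are expected after the lead byte.
    multi :: Int -> Int -> String
    multi n lead =
      let (cont, rest) = splitAt n bs
          cp           = foldl step lead cont
      in if length cont == n && all isCont cont && cp <= 0x10FFFF
           then chr cp : decodeUtf8 rest
           else '\xFFFD' : decodeUtf8 bs

    -- Append the low six bits of a continuation byte.
    step :: Int -> Word8 -> Int
    step acc c = (acc `shiftL` 6) .|. (fromIntegral c .&. 0x3F)

    -- Continuation bytes have the bit pattern 10xxxxxx.
    isCont :: Word8 -> Bool
    isCont c = c .&. 0xC0 == 0x80

-- Hypothetical helper: reinterpret a String whose Chars are really
-- bytes (every Char below 256, an assumption) as UTF-8 and decode it.
fromByteChars :: String -> String
fromByteChars = decodeUtf8 . map (fromIntegral . ord)
```

Loaded into GHCi, the two bytes of "ä" collapse into a single Char:

	*Main> length (decodeUtf8 [0xC3, 0xA4])
	1
	*Main> length (fromByteChars "\195\164")
	1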

Malte


