[Haskell-cafe] Re: Strings and utf-8

Maurício briqueabraque at yahoo.com
Wed Nov 28 14:38:31 EST 2007


 >>(...)  When it's phrased as "truncates to 8
 >> bits" it sounds so simple, surely all we need
 >> to do is not truncate to 8 bits right?
 >>
 >> The problem is, what encoding should it pick?
 >> UTF8, 16, 32, EBCDIC? (...)
 >>
 >> One sensible suggestion many people have made
 >> is that H98 file IO should use the locale
 >> encoding and do Unicode/String <-> locale
 >> conversion. (...)

I'm really afraid of solutions where the behavior
of your program changes with an environment
variable that not everybody has configured
properly, or even knows exists.

 > Wouldn't it be sensible not to use the H98 file
 > I/O operations at all anymore with binary files?
 > A Char represents a Unicode code point value and
 > is not the right data type to use to represent a
 > byte from a binary stream.

That seems nice: we would not have to create a
"wide char" type just for Unicode.
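
A minimal sketch of that separation, assuming the
bytestring and text packages (neither is part of
H98). The helper names are hypothetical. Bytes stay
bytes, and the encoding is named in the code rather
than taken from the locale:

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8', encodeUtf8)
import Data.Char (ord)
import Data.Word (Word8)

-- Illustration only: keep the low 8 bits of each code point.
-- Anything above U+00FF is silently damaged.
truncateTo8 :: String -> [Word8]
truncateTo8 = map (fromIntegral . ord)

-- Hypothetical helper: encode a String as UTF-8 bytes.
-- The encoding is chosen here, not by an environment variable.
encodeUtf8Bytes :: String -> B.ByteString
encodeUtf8Bytes = encodeUtf8 . T.pack

-- Hypothetical helper: decode UTF-8 bytes back to a String,
-- reporting invalid input instead of guessing.
decodeUtf8Bytes :: B.ByteString -> Either String String
decodeUtf8Bytes bytes =
  case decodeUtf8' bytes of
    Left err   -> Left (show err)
    Right text -> Right (T.unpack text)
```

With these, binary data would be read and written
through B.readFile and B.writeFile, and Char would
only appear after an explicit decoding step.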

This topic made me search the net for that nice
quote:

"Explanations exist: they have existed for all
times, for there is always an easy solution to
every problem — neat, plausible and wrong."

(See: en.wikiquote.org/wiki/H._L._Mencken
That guy has many quotes worth reading.)

Strings as char lists is a very good example of
that. It's simple and clean, but strings are not
char lists in any reasonable sense.

Best,
Maurício
