[Haskell-cafe] Re: String vs ByteString

Donn Cave donn at avvanta.com
Sat Aug 14 11:49:02 EDT 2010


Quoth Brandon S Allbery KF8NH <allbery at ece.cmu.edu>,
> On 8/14/10 01:29 , Kevin Jardine wrote:
>> I think that this kind of programming detail should be handled
>> internally (even if necessary by switching automatically from UTF-8 to
>> UTF-16 depending upon the language).

It seems like the right thing, described in the wrong words. Wouldn't
it be a more sensible ideal to simply `switch' depending on the
character encoding of the data, rather than on the language?

I mean, to start with, you'd surely wish for some standardization,
so that the difference between UTF-8 and UTF-16 is essentially internal,
while you use the same API indifferently.

Second, a key requirement to effectively work with external data is
support for multiple character encodings.  E.g., if Text is internally
UTF-16, it still must be able to input and output UTF-8, and presumably
also UTF-16 where appropriate.
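To make the second point concrete, here is a minimal sketch using the
decoders and encoders in Data.Text.Encoding from the text package.  The
Encoding type and the decodeAs, encodeAs and transcode names are made up
for illustration and are not part of any library; UTF-16 is assumed to be
little-endian here.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical tag for the encoding of external data (not in the text package).
data Encoding = UTF8 | UTF16LE
  deriving (Eq, Show)

-- Decode external bytes into Text, whatever Text uses internally.
-- These decoders throw an exception on malformed input.
decodeAs :: Encoding -> B.ByteString -> T.Text
decodeAs UTF8    = TE.decodeUtf8
decodeAs UTF16LE = TE.decodeUtf16LE

-- Encode Text back into the requested external encoding.
encodeAs :: Encoding -> T.Text -> B.ByteString
encodeAs UTF8    = TE.encodeUtf8
encodeAs UTF16LE = TE.encodeUtf16LE

-- Re-encode bytes from one external encoding to another, going through Text.
transcode :: Encoding -> Encoding -> B.ByteString -> B.ByteString
transcode from to = encodeAs to . decodeAs from
```

For instance, transcode UTF8 UTF16LE applied to the two bytes of "hi"
(0x68 0x69) should give the four bytes 0x68 0x00 0x69 0x00.  The caller
never sees which encoding Text uses internally.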

So given full support for _both_ encodings (for example, a Text
implementation with `native' UTF-8 alongside the UTF-16 one), and support
for input data of _either_ encoding as encountered at run time ... then
the internal implementation choice should simply follow the external data.
For Chinese inputs you'd be running the UTF-16 functions, for French the
UTF-8 ones.
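A rough sketch of what `following the external data' might look like: one
type, one API, and a representation chosen by whichever encoding the input
arrived in.  Everything here (Str, lengthStr, storedBytes) is hypothetical
and only illustrates the idea; it is not how the text package works.  For
brevity the length function decodes through Data.Text, where a real
implementation would presumably work on the stored bytes directly.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical string type that keeps the bytes in their external encoding.
-- Validation of the bytes is omitted in this sketch.
data Str
  = U8  B.ByteString   -- bytes assumed to be valid UTF-8
  | U16 B.ByteString   -- bytes assumed to be valid UTF-16LE

-- Convert to Text; used here only to keep the sketch short.
toText :: Str -> T.Text
toText (U8 b)  = TE.decodeUtf8 b
toText (U16 b) = TE.decodeUtf16LE b

-- The same API function for both representations: length in characters.
lengthStr :: Str -> Int
lengthStr = T.length . toText

-- Number of bytes actually held, which depends on the representation.
storedBytes :: Str -> Int
storedBytes (U8 b)  = B.length b
storedBytes (U16 b) = B.length b
```

The character U+4E2D is the three bytes 0xE4 0xB8 0xAD in UTF-8 and the two
bytes 0x2D 0x4E in UTF-16LE.  lengthStr gives 1 for both forms, while
storedBytes gives 3 and 2 respectively; for ASCII text the sizes go the
other way round.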

	Donn Cave, donn at avvanta.com

