[Haskell-cafe] Efficient string construction

Ketil Malde ketil at malde.org
Fri Jun 4 02:28:16 EDT 2010


Daniel Fischer <daniel.is.fischer at web.de> writes:

>> So why is there a UTF8 implementation for bytestrings? Does that not
>> duplicate what Text is trying to do? If so, why the duplication?

> I think Data.ByteString.UTF8 predates Data.Text.

One difference is that Data.Text uses UTF-16 internally, not UTF-8.

>> When is each library more appropriate?

Much data is overwhelmingly ASCII, but with an option for non-ASCII in
comments, labels, or similar.  E.g., for biological sequence data, files
can be large (the human genome is about 3 GB) and non-ASCII characters
can only occur in sequence headers, which constitute a minuscule fraction
of the total data.  So I use ByteString for this.
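
A rough sketch of that approach, assuming a FASTA-like layout (a '>'
header line followed by sequence lines).  The Record type and
parseRecord are hypothetical names made up for illustration, not part of
any library.  Only the header is decoded as UTF-8; the sequence is left
as raw bytes.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical record type: a decoded header plus the raw sequence bytes.
data Record = Record
  { recHeader   :: T.Text        -- may contain non-ASCII, so decode it
  , recSequence :: B.ByteString  -- ASCII only, kept at one byte per symbol
  } deriving (Show, Eq)

-- Parse one record: a line starting with '>' followed by sequence lines.
-- Invalid UTF-8 in the header is replaced rather than raising an error.
parseRecord :: B.ByteString -> Maybe Record
parseRecord input =
  case B.lines input of
    (h : rest)
      | Just ('>', hdr) <- B.uncons h ->
          Just (Record (TE.decodeUtf8With lenientDecode hdr) (B.concat rest))
    _ -> Nothing

main :: IO ()
main = do
  -- Build the input as UTF-8 bytes; B.pack alone would truncate non-ASCII.
  let raw = TE.encodeUtf8 (T.pack ">seq1 Malm\246 sample\nACGT\nTTGA\n")
  case parseRecord raw of
    Nothing  -> putStrLn "no record"
    Just rec -> do
      print (recHeader rec)               -- header as Text
      print (B.length (recSequence rec))  -- number of sequence bytes
```

The bulk of the data never passes through a decoder, so the cost of
handling non-ASCII text is paid only on the short header lines.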

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants

