[Haskell-cafe] Strings - why [Char] is not nice

ajb at spamcop.net
Mon Sep 20 21:09:30 EDT 2004


G'day all.

Quoting Henning Thielemann <iakd0 at clusterf.urz.uni-halle.de>:

> Efficiency is always a reason to mess everything.

OTOH, when efficiency matters, it REALLY matters.  (The flip side of
this is that "efficiency" doesn't always mean what you think it means.)

The trouble is that the current representation, String = [Char], is
costly for text-heavy applications.  Assume for a moment that Char is a
16-bit code point.  Then under GHC, on a 32-bit architecture, [Char] uses
four times as much memory as an unboxed array of Char.  While I'm not for
scrounging cycles when it's not necessary to do so, if a text-heavy
Haskell program can only handle 25% of the data of the equivalent
program on the same computer written in another language (say, ML),
then that's a problem for Haskell.

And text processing is a very large, important area these days.

> But the inefficiency
> applies to lists of every data type, so why optimizing only Strings, why
> not optimizing Lists in general, or better all similar data structures, as
> far as possible?

That's an excellent question, and it is one of the motivations behind
the various abstract container projects, which appear to have stalled.
There's something to be said for the philosophy that a String is a kind
of container, and you shouldn't care what kind of container, so long as
it obeys these axioms.
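
As a rough sketch of what I mean (the class StringLike, its method
names and the ListString wrapper are all made up for illustration and
are not taken from any existing library), the "container" view could
look something like this:

```haskell
import Data.List (unfoldr)

-- Hypothetical interface: any representation that can build and take
-- apart a sequence of characters counts as a string.
class StringLike s where
  empty  :: s
  cons   :: Char -> s -> s
  uncons :: s -> Maybe (Char, s)

-- The familiar list representation is just one instance.
newtype ListString = ListString [Char]

instance StringLike ListString where
  empty = ListString []
  cons c (ListString cs) = ListString (c : cs)
  uncons (ListString [])       = Nothing
  uncons (ListString (c : cs)) = Just (c, ListString cs)

-- Written once against the interface, usable with every instance.
toList :: StringLike s => s -> [Char]
toList = unfoldr uncons

fromList :: StringLike s => [Char] -> s
fromList = foldr cons empty
```

One axiom you would want every instance to obey is
toList (fromList cs) == cs for all finite cs.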

> I very like to apply List functions to
> Strings, so the definition String = [Char] seems to me the most natural
> definition.

So why would a function converting a String to a (lazy) list be
inappropriate?
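
To make that concrete, here is a minimal sketch, assuming the array
package that ships with GHC.  The type Packed and the functions pack
and unpack are hypothetical names chosen for this example.  Note also
that GHC's UArray stores a full-width Char per element rather than the
16-bit code point assumed above, so this shows the shape of the idea
and not the exact memory figures.

```haskell
import Data.Array.Unboxed (UArray, elems, listArray)

-- Hypothetical packed representation: one flat unboxed array,
-- with no per-character cons cell.
newtype Packed = Packed (UArray Int Char)

-- Build the packed form from an ordinary (finite) list of characters.
pack :: String -> Packed
pack cs = Packed (listArray (0, length cs - 1) cs)

-- Convert back to a list; consumers see a plain [Char] and can use
-- the usual list functions on it.
unpack :: Packed -> String
unpack (Packed arr) = elems arr
```

With something like this, take 3 (unpack s) or map f (unpack s) still
reads like ordinary list code, while the text itself is stored
compactly.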

Cheers,
Andrew Bromage

