Working character by character in Haskell

Malcolm Wallace Malcolm.Wallace@cs.york.ac.uk
Fri, 19 Oct 2001 10:44:55 +0100


"Simon Marlow" <simonmar@microsoft.com> writes:

> > Well, in Haskell each character of the string takes 20 bytes: 12 bytes
> > for the list cell, and 8 bytes for the character itself 

Ahem, _Haskell_ mandates no such thing.  Perhaps you are talking
about a specific implementation?  Probably ghc.

> Isn't it possible to optimize this, e.g. by embedding small data
> directly in the cons cell?

I believe Hugs does at least part of this: ASCII characters and small
ints < 256 are embedded in the tag word, so a character takes only 4
bytes, not 8.  In addition, some implementations distinguish pointers
and small values by tags, so the character can appear literally in
the list cell, rather than needing a pointer and separate allocation.
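
To make the tagging idea concrete, here is a rough model of it in
Haskell.  This is only a sketch, not how any particular implementation
does it: the names Field, encode and decode are made up for
illustration, and the choice of the low bit as the tag (1 for an
immediate character, 0 for a word-aligned pointer) is an assumption.

```haskell
import Data.Bits (shiftL, shiftR, testBit, (.|.))
import Data.Char (chr, ord)
import Data.Word (Word32)

-- Hypothetical model of one field of a list cell: either a character
-- stored literally, or the address of a separately allocated object.
data Field = Immediate Char | Pointer Word32
  deriving (Eq, Show)

-- Assumed encoding: low bit 1 marks an immediate character,
-- low bit 0 marks a pointer.
encode :: Field -> Word32
encode (Immediate c) = (fromIntegral (ord c) `shiftL` 1) .|. 1
encode (Pointer p)   = p  -- assumed word-aligned, so the low bit is already 0

-- Inspect the tag bit to recover which kind of value the word holds.
decode :: Word32 -> Field
decode w
  | testBit w 0 = Immediate (chr (fromIntegral (w `shiftR` 1)))
  | otherwise   = Pointer w
```

With this encoding, decode (encode (Immediate 'a')) gives back
Immediate 'a', and no separate allocation is needed for the character.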

Hence some implementations need only 12 bytes in total for a list cell
together with its character, not 20.
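
The arithmetic can be written out as below.  The byte counts are just
the ones quoted in this thread; the Layout type and the function names
are invented for illustration, and the 16-byte middle case assumes the
4-byte character is still a separate object pointed to from the cell.

```haskell
-- Byte counts as quoted in this thread.
cellBytes, boxedCharBytes, tagWordCharBytes :: Int
cellBytes        = 12  -- the list cell
boxedCharBytes   = 8   -- a separately allocated character
tagWordCharBytes = 4   -- a character embedded in its tag word

-- Hypothetical names for the three layouts discussed.
data Layout = Boxed | TagWord | Inline
  deriving (Eq, Show)

-- Heap cost of one list element under each layout.
bytesPerChar :: Layout -> Int
bytesPerChar Boxed   = cellBytes + boxedCharBytes    -- 12 + 8
bytesPerChar TagWord = cellBytes + tagWordCharBytes  -- 12 + 4 (assumed)
bytesPerChar Inline  = cellBytes                     -- character lives in the cell

-- Cost of a fully evaluated String, ignoring the final [].
stringBytes :: Layout -> String -> Int
stringBytes layout s = bytesPerChar layout * length s
```

So stringBytes Boxed "hello" comes to 100 bytes, against 60 for
stringBytes Inline "hello".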

Regards,
    Malcolm