Haskell Platform Proposal: add the 'text' library

Brandon S Allbery KF8NH allbery at ece.cmu.edu
Wed Oct 20 01:10:36 EDT 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/19/10 22:36 , wren ng thornton wrote:
> <musing>
> I almost wonder if it would be worth it to define a new type, Character,
> which does correspond 1:1 to the human notion of a "character" (being
> intentionally vague about what exactly that means). Then we could have that
> Text is a vector/list/sequence of Characters, and give it the appropriate
> interface for being thought of that way.

I believe Perl 6 is going this way: while there is a single base type Str
and role String, there are three different things it can "mean" (call them
subtypes): bytes, Unicode code points, and graphemes (the last corresponding
to the proposed Character).  Or possibly only two of those; IIRC it was
recently proposed that the byte version be moved to the already existing Buf
type/Buffer role intended for binary data, roughly equivalent to ByteString.
If a given string is accessed as code points, it can't then be treated as
graphemes unless it is assigned to again, and vice versa, but assigning it to
another Str allows that Str to be accessed as graphemes instead.

(I think.  The Perl 6 spec is still a moving target, as evidenced by the
thing about byte access; it's entirely possible that this has changed again
and I missed it.  But there was definitely thought put into the distinction
between bytes, code points, and graphemes.)
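
To make the three levels concrete in Haskell terms, here is a rough sketch
that counts the same string as bytes, as code points, and as something
approximating graphemes.  The Character newtype and the characters,
byteCount, codePointCount and characterCount functions are hypothetical
names made up for illustration; none of them is part of the text library or
of the Perl 6 design.  The grouping rule (glue combining marks onto the
preceding code point) is only a crude stand-in for real grapheme
segmentation, which has many more rules.

```haskell
import Data.Char (isMark)
import Data.List (groupBy)
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical stand-in for the proposed Character type: a base code
-- point together with any combining marks that follow it.
newtype Character = Character String
  deriving (Eq, Show)

-- Crude approximation (an assumption, not full Unicode segmentation):
-- each combining mark is attached to the code point before it.
characters :: T.Text -> [Character]
characters = map Character . groupBy (\_ c -> isMark c) . T.unpack

-- Size of the UTF-8 encoding, in bytes.
byteCount :: T.Text -> Int
byteCount = B.length . TE.encodeUtf8

-- Number of Unicode code points (Haskell Chars).
codePointCount :: T.Text -> Int
codePointCount = T.length

-- Number of approximate graphemes.
characterCount :: T.Text -> Int
characterCount = length . characters

-- 'e' followed by COMBINING ACUTE ACCENT, then 't', then precomposed
-- LATIN SMALL LETTER E WITH ACUTE.
sample :: T.Text
sample = T.pack "e\x0301t\x00e9"

main :: IO ()
main = print (byteCount sample, codePointCount sample, characterCount sample)
-- prints (6,4,3)
```

The two accented letters look the same to a reader but differ at the byte
and code point levels; only the last count treats them alike, which is the
distinction the proposed Character type is after.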

- -- 
brandon s. allbery     [linux,solaris,freebsd,perl]      allbery at kf8nh.com
system administrator  [openafs,heimdal,too many hats]  allbery at ece.cmu.edu
electrical and computer engineering, carnegie mellon university      KF8NH
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAky+ecwACgkQIn7hlCsL25USGgCeOQZdx4PBCjc7yF0LwSRdyYEp
E1IAniYszij4vGohwPtGOkB/weNB6TEF
=NhB/
-----END PGP SIGNATURE-----

