[Haskell-i18n] Surrogate pairs?

Ketil Z. Malde ketil@ii.uib.no
21 Aug 2002 09:17:11 +0200


Ashley Yakeley <ashley@semantic.org> writes:

> That's not quite correct. Every code point is exactly one Char, but some
> characters may be composed of more than one code point. For instance, 'á'
> might be represented as
>
>   \#00E1 [LATIN SMALL LETTER A WITH ACUTE]
>   \#00E1 [LATIN SMALL LETTER A WITH ACUTE]

> or

>   \#0061 [LATIN SMALL LETTER A] + \#0301 [COMBINING ACUTE ACCENT]

I guess they must be treated the same, too?  That is, the length of
the strings should be the same, they should compare equal, etc etc.

Or is it acceptable to just ignore the issue, and simply treat the
latter as two characters?
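
To make the question concrete, here is a minimal sketch of how the two
representations behave as plain Haskell Strings, assuming Char holds one
Unicode code point. The names precomposed, decomposed and composeAcute
are made up for illustration; composeAcute handles only this single pair
and is not a real normalization routine.

```haskell
module Main where

-- 'á' as one precomposed code point: U+00E1
precomposed :: String
precomposed = "\x00E1"

-- 'á' as base letter plus combining mark: U+0061 U+0301
decomposed :: String
decomposed = "\x0061\x0301"

-- Hypothetical helper: folds the pair U+0061 U+0301 into U+00E1.
-- It knows about this one pair only; full Unicode normalization
-- covers far more cases.
composeAcute :: String -> String
composeAcute ('\x0061' : '\x0301' : rest) = '\x00E1' : composeAcute rest
composeAcute (c : rest)                   = c : composeAcute rest
composeAcute []                           = []

main :: IO ()
main = do
  print (length precomposed)                       -- 1
  print (length decomposed)                        -- 2
  print (precomposed == decomposed)                -- False
  print (composeAcute decomposed == precomposed)   -- True
```

So with the default list operations the second form really is two
characters, and the strings only compare equal after some explicit
normalization step.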

-kzm
--
If I haven't seen further, it is by standing in the footprints of giants