More Unicode nit-picking

Marcin 'Qrczak' Kowalczyk qrczak@knm.org.pl
19 Oct 2001 11:33:15 GMT


19 Oct 2001 06:09:09 +0100, Colin Paul Adams <colin@colina.demon.co.uk> writes:

> But this seems to assume there is a one-to-one mapping of upper-case
> to lower-case equivalent, and vice-versa. Apparently this is not so.

Indeed, but there exists a default locale-independent case mapping.
Language-specific mappings and irregular cases are described as
deviations from it.
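
A small sketch of why the default mapping is not one-to-one. It assumes
a Data.Char whose toUpper and toLower follow the simple mappings of the
Unicode character database (as the base library shipped with current GHC
does); the names kelvinRoundTrip and dottedAndDotless are made up for
this illustration.

```haskell
import Data.Char (toLower, toUpper)

-- KELVIN SIGN (U+212A) lowercases to plain 'k', and 'k' uppercases to
-- 'K', so a lower/upper round trip does not return the original.
kelvinRoundTrip :: Char
kelvinRoundTrip = toUpper (toLower '\x212A')

-- Both 'i' and LATIN SMALL LETTER DOTLESS I (U+0131) uppercase to 'I'
-- under the default mapping, so two characters share one upper-case form.
dottedAndDotless :: (Char, Char)
dottedAndDotless = (toUpper 'i', toUpper '\x0131')
```

Under that assumption kelvinRoundTrip is 'K' rather than U+212A, and
both components of dottedAndDotless are 'I'.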

> It seems that whilst the Unicode database's definitions of whether or
> not a character is upper/lower/title case are normative, the mappings
> from upper to lower case are only suggestive.

Yes.

> what should the primUnicodeToLower/ToUpper operations actually do?
> Should they be locale sensitive?

I don't think so: they are pure functions, so IMHO they should have
a definition independent of the environment. Moreover, a proper case
mapping sometimes has to map one character to several characters,
so a more exact definition would require a different interface: either
String -> IO String or IO (String -> String), perhaps taking the locale
into account explicitly.
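
To make the interface question concrete, here is a rough sketch of what
such functions could look like. Everything in it is hypothetical: the
Locale type, the tiny exception table, and the use of the LANG
environment variable are assumptions for illustration only, not an
existing library API and certainly not a complete case mapping.

```haskell
import Data.Char (toUpper)
import Data.List (isPrefixOf)
import System.Environment (lookupEnv)

-- Hypothetical locale tag; a real design would need far more cases.
data Locale = DefaultLocale | Turkish
  deriving (Eq, Show)

-- One character may map to several, hence Char -> String.
-- Only three well-known exceptions are listed; everything else
-- falls back to the default one-to-one mapping from Data.Char.
upperChar :: Locale -> Char -> String
upperChar Turkish 'i'      = "\x0130"  -- LATIN CAPITAL LETTER I WITH DOT ABOVE
upperChar _       '\x00DF' = "SS"      -- LATIN SMALL LETTER SHARP S
upperChar _       '\xFB01' = "FI"      -- LATIN SMALL LIGATURE FI
upperChar _       c        = [toUpper c]

-- Pure variant: the locale is an explicit argument.
toUpperString :: Locale -> String -> String
toUpperString loc = concatMap (upperChar loc)

-- Assumption: guess the locale from LANG; real code would consult
-- the platform's locale facilities instead.
currentLocale :: IO Locale
currentLocale = do
  lang <- lookupEnv "LANG"
  return $ case lang of
    Just l | "tr" `isPrefixOf` l -> Turkish
    _                            -> DefaultLocale

-- The two environment-dependent shapes mentioned above.
toUpperIO :: String -> IO String
toUpperIO s = fmap (\loc -> toUpperString loc s) currentLocale

getToUpper :: IO (String -> String)
getToUpper = fmap toUpperString currentLocale
```

With these definitions toUpperString DefaultLocale "weiß" gives "WEISS",
which a Char -> Char function cannot express. The IO (String -> String)
shape reads the environment once and hands back a pure function, while
String -> IO String consults it on every call.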

-- 
 __("<  Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/
 \__/
  ^^                      SUBSTITUTE SIGNATURE
QRCZAK