Unicode support in Hugs - alpha-patch available

Simon Marlow simonmar@microsoft.com
Tue, 26 Aug 2003 17:10:22 +0100


> How do we implement the conversion functions?  The approach recently
> added to the CVS version of GHC is to use the native libraries, which
> requires the user to set the locale appropriately.  More generally,
> should these functions be locale-dependent at all?

No, they shouldn't be locale-dependent, but unfortunately that's what the
C library gives us at the moment.  wchar_t should ideally represent
Unicode, but sadly it doesn't (with glibc) unless you set the locale to
anything other than "C" or "POSIX".

The situation is worse on Solaris: I had to set the locale to
en_US.UTF-8 before I got correct results.  I haven't tried FreeBSD yet.
Windows gives the correct results without having to muck around with
locales (but not if you use Cygwin).
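
To illustrate the locale dependence described above, here is a minimal C
sketch.  The helper name upper_in_locale is hypothetical and is not taken
from GHC or Hugs; it only wraps the standard setlocale() and towupper()
calls so that the same wide character can be converted under different
LC_CTYPE locales.  What it returns for non-ASCII characters varies by
platform and by which locales are installed, so treat it as a probe rather
than a definitive test.

```c
#include <locale.h>
#include <stddef.h>
#include <wctype.h>

/* Hypothetical helper (an assumption for illustration only): upper-case
 * the wide character c under the LC_CTYPE locale called name.
 * Returns 1 and stores the result in *out on success, or 0 if the
 * C library does not accept that locale name. */
int upper_in_locale(const char *name, wint_t c, wint_t *out)
{
    if (setlocale(LC_CTYPE, name) == NULL)
        return 0;                  /* locale not available here */

    *out = towupper(c);            /* result depends on the locale just set */

    setlocale(LC_CTYPE, "C");      /* leave the process in the "C" locale */
    return 1;
}
```

Calling the helper with 0x00E9 (LATIN SMALL LETTER E WITH ACUTE) once under
"C" and once under a locale such as en_US.UTF-8 shows whether a given C
library treats wchar_t as Unicode regardless of the locale setting.  Note
that the helper resets LC_CTYPE to "C" instead of restoring whatever locale
was active before the call.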

So GHC's solution is patchy at the moment, but I hope the situation will
get better in the future as more OSs jump on the wchar_t==Unicode
bandwagon.

Cheers,
	Simon