CWString

Simon Marlow simonmar at microsoft.com
Tue Aug 26 07:48:03 EDT 2003


 
> Attached is a properly internationalized implementation of
> Foreign.C.String, along with some other routines which I feel would be
> very at home in the FFI standard.

I support this proposal.

> Note that I am trying to solve a simpler problem than full generic i18n.
> I just want the ability to work within the current locale, whatever it
> might be. I have tested these routines in utf8, latin1, greek, korean
> and a few other locales. They seem to work well.
> 
> In addition to properly localizing withCString, peek/pokeCString and
> friends, I feel it is important to provide routines to work on wchar_t *
> strings. There are a number of reasons:
> 
>  * if __STDC_ISO_10646__ is defined (which is almost always),
>    conversions can be incredibly optimized, in particular an array 
>    of Chars can be implemented directly as an array of wchar_t's

In our new implementation of Data.Char.isUpper and friends, I made the
simplifying assumption that Char==wchar_t==Unicode.  With glibc, this
appears to be valid as long as (a) you set LANG to something other than
"C" or "POSIX", and (b) you call setlocale() first.   We now call
setlocale() in the RTS startup code.
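
To make the assumption above concrete, here is a minimal C sketch of the
sequence described: adopt the locale from the environment, check whether the
C library advertises wchar_t as ISO 10646, and then classify a code point by
passing it straight through as a wide character. The function names are
hypothetical and chosen for illustration only; this is not the RTS code.

```c
#include <locale.h>
#include <wchar.h>
#include <wctype.h>

/* Call once at startup, before any classification.
   Returns 1 if the locale named by the environment (LANG, LC_ALL, ...)
   was adopted, 0 if setlocale() failed and the "C" locale is still active. */
int init_locale_from_env(void)
{
    return setlocale(LC_ALL, "") != NULL;
}

/* Returns 1 if the C library declares that wchar_t values are
   ISO 10646 (Unicode) code points, 0 otherwise. */
int wchar_is_unicode(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Classify a code point by treating it directly as a wide character.
   Only meaningful when wchar_is_unicode() returns 1; results for
   non-ASCII code points depend on the active locale. */
int codepoint_is_upper(unsigned long cp)
{
    return iswupper((wint_t)cp) != 0;
}
```

A caller would invoke init_locale_from_env() once and then use
codepoint_is_upper() freely. Under the "C" or "POSIX" locale only ASCII
letters can be expected to classify correctly, which is why condition (a)
matters.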

I did try using libunicode, but it appears that libunicode only
understands Unicode characters up to 0xffff.  That was version 0.4;
perhaps more recent versions are better.

There's a typo in the !GLASGOW_HASKELL case for peekCWString.  Also, you
can't use #def in library code (for boring reasons); use a separate C
file instead.
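
As a rough illustration of the separate-C-file approach, assuming the #def
in question is the hsc2hs directive for inlining a C definition, the helpers
could live in an ordinary C file that is compiled and linked with the
library. The file name and both helper names below are hypothetical, not
taken from the attached code.

```c
/* cwstring_helpers.c (hypothetical file name) */
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: size of wchar_t in bytes on this platform,
   useful when marshalling arrays of wide characters. */
size_t hs_sizeof_wchar(void)
{
    return sizeof(wchar_t);
}

/* Hypothetical helper: length of a NUL-terminated wide string,
   a thin wrapper over the standard wcslen(). */
size_t hs_wcslen(const wchar_t *s)
{
    return wcslen(s);
}

/* The Haskell side would then bind to it with an ordinary foreign
   import, for example:

     foreign import ccall unsafe "hs_wcslen"
       c_wcslen :: Ptr CWchar -> IO CSize
*/
```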

Cheers,
	Simon



