Marshalling Haskell String <-> UTF-8

Simon Marlow simonmar at microsoft.com
Wed Sep 1 06:09:50 EDT 2004


On 01 September 2004 10:16, Bayley, Alistair wrote:

> I want to call a foreign C function that takes a UTF-8 encoded string
> as one of its arguments (and there's also a version of the function
> that receives UTF-16). Can someone point me to documentation or
> examples of how this would be done? AFAICT (reading the FFI spec)
> marshalling a String to a CString is locale-dependent, whereas I know
> that I want UTF-8/16. 
> 
> Also, if a C function returns a UTF-8 (or UTF-16) encoded string, how
> do I marshal this reliably into a Haskell String?
> 
> Can I use the UTF-16 functions directly with CWStrings? (I'm not sure
> exactly what wchar_t is, as it's apparently dependent on the locale at
> compile-time, and could be 8, 16, or 32 bits).

Your best bet is to marshal it yourself.  We're a bit behind in this
area: GHC 6.2.x doesn't have CAString and CWString, and CString is just
char*.  The HEAD has CAString and CWString, and will hopefully follow
the FFI spec by the time we release 6.4 (we still have to do the locale
encoding/decoding between CString and String, IIRC).

In any case, none of this allows you to specify a UTF-8 conversion.
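
As a rough illustration of "marshal it yourself", here is a minimal sketch
that encodes a String to UTF-8 bytes by hand and passes them to C as a
NUL-terminated buffer, plus the reverse direction. The names encodeUtf8,
decodeUtf8, withUtf8CString and peekUtf8CString are made up for this example
and are not part of GHC or the FFI libraries; only withArray0, peekArray0 and
castPtr come from the standard Foreign modules. The decoder is deliberately
lenient: it substitutes U+FFFD for malformed sequences but does not reject
overlong encodings or surrogates.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)
import Foreign.C.String (CString)
import Foreign.Marshal.Array (peekArray0, withArray0)
import Foreign.Ptr (castPtr)

-- Encode one Char (a Unicode code point) as 1 to 4 UTF-8 bytes.
-- Assumption: the input contains no lone surrogates (U+D800..U+DFFF);
-- this sketch would encode them as-is, which is not valid UTF-8.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    hi s   = fromIntegral (n `shiftR` s)
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeChar

-- Lenient decoder: malformed input becomes U+FFFD.
decodeUtf8 :: [Word8] -> String
decodeUtf8 [] = []
decodeUtf8 (b : bs)
  | b < 0x80  = chr (fromIntegral b) : decodeUtf8 bs
  | b < 0xC0  = '\xFFFD' : decodeUtf8 bs   -- stray continuation byte
  | b < 0xE0  = multi 1 (b .&. 0x1F)
  | b < 0xF0  = multi 2 (b .&. 0x0F)
  | otherwise = multi 3 (b .&. 0x07)
  where
    isCont x = x .&. 0xC0 == 0x80
    multi k lead =
      let (cs, rest) = splitAt k bs
          n = foldl (\acc x -> (acc `shiftL` 6) .|. fromIntegral (x .&. 0x3F))
                    (fromIntegral lead) cs
      in if length cs == k && all isCont cs && n <= 0x10FFFF
           then chr n : decodeUtf8 rest
           else '\xFFFD' : decodeUtf8 bs

-- Pass a String to C as NUL-terminated UTF-8. The buffer is only valid
-- inside the callback. A '\0' inside the String would truncate it on the
-- C side.
withUtf8CString :: String -> (CString -> IO a) -> IO a
withUtf8CString s act = withArray0 0 (encodeUtf8 s) (act . castPtr)

-- Read a NUL-terminated UTF-8 buffer returned from C.
peekUtf8CString :: CString -> IO String
peekUtf8CString p = fmap decodeUtf8 (peekArray0 0 (castPtr p))
```

A foreign import whose argument is declared as CString could then be called
as withUtf8CString str c_function, independent of the current locale.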

wchar_t varies from platform to platform: on Windows it is 16 bits, on
Linux with glibc it is 32 bits, for example.  CWString is only useful
for talking to C interfaces that are expressed in terms of wchar_t.
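
For the UTF-16 variant of such a function, a similar hand-rolled approach
avoids depending on the width of wchar_t. The sketch below is only an
illustration: encodeUtf16, withUtf16 and wcharBits are hypothetical names,
and it assumes the C function wants NUL-terminated 16-bit code units in
native byte order (an API that expects a specific endianness or a byte-order
mark would need extra handling).

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (ord)
import Data.Word (Word16)
import Foreign.C.Types (CWchar)
import Foreign.Marshal.Array (withArray0)
import Foreign.Ptr (Ptr)
import Foreign.Storable (sizeOf)

-- Width of wchar_t on the current platform, in bits
-- (for example 16 on Windows, 32 on Linux with glibc).
wcharBits :: Int
wcharBits = 8 * sizeOf (undefined :: CWchar)

-- Encode a String as UTF-16 code units; code points above U+FFFF
-- become a surrogate pair.
encodeUtf16 :: String -> [Word16]
encodeUtf16 = concatMap unit
  where
    unit c
      | n < 0x10000 = [fromIntegral n]
      | otherwise   = [ fromIntegral (0xD800 + (m `shiftR` 10))
                      , fromIntegral (0xDC00 + (m .&. 0x3FF)) ]
      where
        n = ord c
        m = n - 0x10000

-- Pass a String to C as a NUL-terminated array of 16-bit units.
-- The pointer is only valid inside the callback.
withUtf16 :: String -> (Ptr Word16 -> IO a) -> IO a
withUtf16 s = withArray0 0 (encodeUtf16 s)
```

Using Word16 rather than CWchar keeps the buffer layout the same on every
platform, whatever wcharBits happens to be.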

Cheers,
	Simon


More information about the Glasgow-haskell-users mailing list