Why are strings linked lists?

Simon Marlow simonmar at microsoft.com
Tue Dec 9 09:34:07 EST 2003


 
> > GHC 6.2 (shortly to be released) also supports toUpper, toLower, and
> > the character predicates isUpper, isLower etc. on the full Unicode
> > character set.
> >
> > There is one caveat: the implementation is based on the C library's
> > towupper() and so on, so the support is only as good as the C library
> > provides, and it relies on wchar_t being equivalent to Unicode (the
> > sensible choice, but not all libcs do this).
> 
> Now, why would one want to base this on C's wchar_t and its
> "w" routines?

Because it's easy, and it's still an improvement over what we had
before.

> wchar_t is sometimes (isolated) UTF-32 code units,
> including in Linux, sometimes it is (isolated) UTF-16 code units,
> including in Windows, and sometimes something utterly useless.

Yes, I mentioned this above.
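
For anyone who wants to see what their own libc does, here is a rough
sketch in C of the two checks the caveat boils down to. It is only an
illustration, not the code GHC uses: the function names are made up for
this example, and the only things taken from the C standard are the
__STDC_ISO_10646__ macro, WCHAR_MAX and towupper(). Note also that
towupper() depends on the current LC_CTYPE locale, so results outside
ASCII can vary.

```c
#include <wchar.h>   /* WCHAR_MAX */
#include <wctype.h>  /* towupper, wint_t */

/* Hypothetical helper: 1 if the C library promises that wchar_t values
   are ISO 10646 (Unicode) code points, 0 if it makes no such promise. */
int wchar_is_unicode(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if wchar_t is wide enough to hold every code
   point up to U+10FFFF (false for a 16-bit wchar_t). */
int wchar_holds_all_code_points(void)
{
    return (unsigned long)WCHAR_MAX >= 0x10FFFFUL;
}

/* Hypothetical helper: upper-case one code point via the C library.
   Falls back to returning the input unchanged when wchar_t cannot be
   trusted to represent it. */
unsigned long upper_code_point(unsigned long cp)
{
    if (!wchar_is_unicode() || cp > (unsigned long)WCHAR_MAX)
        return cp;
    return (unsigned long)towupper((wint_t)cp);
}
```

I'd expect both checks to pass with glibc on Linux, and the second one
to fail on Windows, where wchar_t is 16 bits wide.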

> Please instead use ICU's UChar32,

Thanks, I didn't know about ICU - it looks nice.  I did briefly
investigate libunicode, which turned out to be worse than the C library.

Cheers,
	Simon


