[Haskell-cafe] Core packages and locale support

Jason Dagit dagit at codersbase.com
Fri Jun 25 21:36:30 EDT 2010


On Fri, Jun 25, 2010 at 3:15 PM, Brandon S Allbery KF8NH <allbery at ece.cmu.edu> wrote:

> On 6/25/10 17:56 , Roman Cheplyaka wrote:
> > * Brandon S Allbery KF8NH <allbery at ece.cmu.edu> [2010-06-25 05:00:08-0400]
> >> You might want to look at how Python is dealing with this (including the
> >> pain involved; best to learn from example).
> >
> > Do you mean the pain when filenames can not be decoded using current
> > locale settings and thus the files are not accessible? (The same about
> > environment variables.)
>
> Yes, this.
>
> > Agreed, it's unpleasant. The other way would be changing [Char] to [Word8]
> > or ByteString. But this would a) break all existing programs and b) be
> > an OS-specific hack. Crap.
>
> But it *is* OS-specific, just as Windows' UTF-16 is an OS-specific
> mechanism.  Unfortunately, there's no good solution in the Unix case aside
> from assuming a specific encoding, and the locale is as good as any; but I
> think LC_CTYPE is probably the most applicable.  This will, however, confuse
> everyone else.
>
> Perhaps best is to look at whether there is any consensus building as to how
> to resolve it, and if not use locale but document it as an unstable
> interface.  Or possibly just leave things as is until consensus develops.
> It would be Bad to choose one (say, locale) only to have everyone else go in
> a different direction (say, UTF-8 with the application libraries potentially
> re-encoding filenames).
>

In the case of IO you can disable the locale-specific encoding/decoding by
switching the handle to binary mode.  Would a similar API be available when
working with file paths?  Darcs, for instance, deals with lots of file paths
and has very specific requirements.  Losing access to files due to bad or
mistaken encodings is the sort of thing that would break some people's
repositories.  So tools like Darcs would probably need a way to disable this
sort of automatic encoding/decoding.
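
For concreteness, here is roughly what I mean by the handle-level switch.
This is only a sketch: readBytesAsChars and writeCharsAsBytes are names I
made up for illustration, and the only library call that matters is
hSetBinaryMode from System.IO.  With binary mode on, each Char read from
the handle stands for exactly one byte, so nothing is lost to a bad or
mistaken locale guess.

```haskell
import System.IO

-- Hypothetical helper: read a file with locale decoding switched off.
-- In binary mode every Char returned corresponds to one byte (0-255).
readBytesAsChars :: FilePath -> IO String
readBytesAsChars path =
  withFile path ReadMode $ \h -> do
    hSetBinaryMode h True
    s <- hGetContents h
    -- Force the lazy read to finish before withFile closes the handle.
    length s `seq` return s

-- Hypothetical helper: write a String with locale encoding switched off.
-- Only the low 8 bits of each Char are written, so callers should pass
-- Chars in the range 0-255.
writeCharsAsBytes :: FilePath -> String -> IO ()
writeCharsAsBytes path s =
  withFile path WriteMode $ \h -> do
    hSetBinaryMode h True
    hPutStr h s
```

Note that the path argument itself is still an ordinary FilePath here; the
sketch only turns off translation of the file's contents, which is exactly
why I am asking about an equivalent for the paths.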

Jason