[Haskell-cafe] Encoding-aware System.Directory functions

Max Bolingbroke batterseapower at hotmail.com
Wed Mar 30 09:53:59 CEST 2011


On 30 March 2011 07:52, Michael Snoyman <michael at snoyman.com> wrote:
> I could
> manually do something like (utf8Decode . S8.pack), but that presumes
> that the character encoding on the system in question is UTF8. So two
> questions:

Funnily enough, I have been thinking about this quite hard recently.
The situation is kind of a mess, and short of implementing PEP 383
(http://www.python.org/dev/peps/pep-0383/) in GHC I can't see how to
make it easier on the programmer. As Jason points out, the best you can
really do is probably:

 1. Treat Strings that represent filenames as raw byte sequences, even
though they claim to be strings

 2. When presenting such Strings to the user, re-decode them by using
the current locale encoding (which will typically be UTF-8). You
probably want to have some means of avoiding decoding errors here too
-- ignoring or replacing undecodable bytes -- but presently this is
not so straightforward. If you happen to be on a system with GNU iconv,
however, you can use its "C//TRANSLIT//IGNORE" encoding to achieve
this.
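
The two steps above can be sketched in plain Haskell, without iconv.
This is only an illustration under two assumptions: that the filename
String carries one raw byte per Char, and that the bytes are meant to
be UTF-8 rather than whatever the locale actually says. The names
toBytes and decodeUtf8Lenient are made up for this example and are not
library functions. Undecodable bytes are replaced with U+FFFD instead
of raising an error.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Step 1: recover the raw bytes from a filename String.
-- Assumption: every Char in the String holds exactly one byte (0..255).
toBytes :: String -> [Word8]
toBytes = map (fromIntegral . ord)

-- Step 2: decode the bytes as UTF-8 for display only.
-- Any byte that does not start a valid sequence becomes U+FFFD, and
-- decoding resumes at the very next byte.
decodeUtf8Lenient :: [Word8] -> String
decodeUtf8Lenient [] = []
decodeUtf8Lenient (b:bs)
  | b < 0x80               = chr (fromIntegral b) : decodeUtf8Lenient bs
  | b >= 0xC2 && b <= 0xDF = multi 1 (fromIntegral b .&. 0x1F) 0x80
  | b >= 0xE0 && b <= 0xEF = multi 2 (fromIntegral b .&. 0x0F) 0x800
  | b >= 0xF0 && b <= 0xF4 = multi 3 (fromIntegral b .&. 0x07) 0x10000
  | otherwise              = '\xFFFD' : decodeUtf8Lenient bs
  where
    -- n continuation bytes expected; lead holds the payload bits of the
    -- first byte; minCp rejects overlong encodings.
    multi n lead minCp =
      let (conts, rest) = splitAt n bs
          cp = foldl (\acc c -> (acc `shiftL` 6) .|. (fromIntegral c .&. 0x3F))
                     lead conts
          valid = length conts == n
               && all isCont conts
               && cp >= minCp && cp <= 0x10FFFF
               && not (cp >= 0xD800 && cp <= 0xDFFF)  -- no surrogates
      in if valid
           then chr cp : decodeUtf8Lenient rest
           else '\xFFFD' : decodeUtf8Lenient bs
    isCont c = c .&. 0xC0 == 0x80
```

You would keep passing the original String to the System.Directory
functions and only call (decodeUtf8Lenient . toBytes) at the point
where the name is shown to the user, since the lenient decoding is
lossy and cannot be reversed.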

Cheers,
Max


