[Haskell-cafe] Encoding-aware System.Directory functions

Alistair Bayley alistair at abayley.org
Wed Mar 30 11:01:53 CEST 2011


On 30 March 2011 20:53, Max Bolingbroke <batterseapower at hotmail.com> wrote:

> On 30 March 2011 07:52, Michael Snoyman <michael at snoyman.com> wrote:
> > I could
> > manually do something like (utf8Decode . S8.pack), but that presumes
> > that the character encoding on the system in question is UTF8. So two
> > questions:
>
> Funnily enough I have been thinking about this quite hard recently,
> and the situation is kind of a mess and short of implementing PEP383
> (http://www.python.org/dev/peps/pep-0383/) in GHC I can't see how to
> make it easier on the programmer. As Jason points out the best you can
> really do is probably:
>
>  1. Treat Strings that represent filenames as raw byte sequences, even
> though they claim to be strings
>
>  2. When presenting such Strings to the user, re-decode them by using
> the current locale encoding (which will typically be UTF-8). You
> probably want to have some means of avoiding decoding errors here too
> -- ignoring or replacing undecodable bytes -- but presently this is
> not so straightforward. If you happen to be on a system with GNU Iconv
> you can use its "C//TRANSLIT//IGNORE" encoding to achieve this,
> however.
>


http://www.haskell.org/pipermail/libraries/2009-August/012493.html

I took from this discussion that FilePath really should be a pair of the
actual filename ByteString, and the printable String (decoded from the
ByteString, with encoding specified by the user's locale). The conversion
from ByteString to String (and vice versa) is not guaranteed to be lossless,
so you need to remember both.
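
A minimal sketch of that pairing, not a proposal for the real
System.Directory API. The names EncodedPath, rawName, displayName and
fromRaw are hypothetical, and for brevity it assumes a UTF-8 locale and
the text package instead of looking up the user's actual locale encoding.
Undecodable bytes are replaced with U+FFFD in the printable form only.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)

-- | Hypothetical type: the bytes the OS handed us, plus a printable form.
data EncodedPath = EncodedPath
  { rawName     :: B.ByteString  -- pass this back to the OS unchanged
  , displayName :: String        -- for showing to the user only
  } deriving (Eq, Show)

-- | Build the pair from raw bytes.
-- ASSUMPTION: the locale is UTF-8. A real version would decode with the
-- user's locale encoding. Invalid bytes become U+FFFD in displayName,
-- so displayName alone cannot be turned back into rawName.
fromRaw :: B.ByteString -> EncodedPath
fromRaw bs = EncodedPath
  { rawName     = bs
  , displayName = T.unpack (decodeUtf8With lenientDecode bs)
  }
```

Two names that differ only in an undecodable byte get the same
displayName but keep distinct rawNames, which is why the String on its
own is not enough to find the file again.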

Alistair