[Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?

Yitzchak Gale gale at sefer.org
Wed Jun 17 08:21:07 EDT 2009


I wrote:
>> I think the most important use cases that should not break are:
>>
>> o open/read/write a FilePath from getArgs
>> o open/read/write a FilePath from getDirectoryContents

Simon Marlow wrote:
> The following cases are currently broken:
>
>  * Calling openFile on a literal Unicode FilePath (note, not
>   ACP-encoded, just Unicode).
>
>  * Reading a Unicode FilePath from a text file and then calling
>   openFile on it
>
> I propose to fix these (on Windows).  It will mean that your second case
> above will be broken, until someone fixes getDirectoryContents.

Why only on Windows?

> I don't know how getArgs fits in here - should we be decoding argv using the
> ACP?

And why not also on Unix? On any platform, the expected behavior should
be that you type a file path at the command line, read it using getArgs,
and open the file using that.

For comparison, Python works that way, even though the variable
is called "argv" there.

The current behavior on Unix of returning, say, the bytes of a UTF-8
encoding in a String, as if each byte were an individual Unicode character,
is queer.
Given your fantastic work so far to rid System.IO of those kinds of oddities,
perhaps now is the time to finish the job.
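
To make the oddity concrete, here is a rough sketch, assuming getArgs hands
back one Char per UTF-8 byte as described above. The name decodeUtf8Bytes is
made up for illustration and is not part of any library. It uses only base,
and it is deliberately lenient: it does not reject overlong encodings or
surrogate code points.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr, ord)

-- | Hypothetical helper (not a library function): treat each Char of the
-- input as one byte (0..255), interpret the bytes as UTF-8, and return the
-- decoded Unicode characters. Returns Nothing on malformed input, including
-- any Char above 255.
decodeUtf8Bytes :: String -> Maybe String
decodeUtf8Bytes = go . map ord
  where
    go [] = Just []
    go (b : bs)
      | b < 0x80              = (chr b :) <$> go bs      -- plain ASCII
      | b >= 0xC0 && b < 0xE0 = multi 1 (b .&. 0x1F) bs  -- 2-byte sequence
      | b >= 0xE0 && b < 0xF0 = multi 2 (b .&. 0x0F) bs  -- 3-byte sequence
      | b >= 0xF0 && b < 0xF8 = multi 3 (b .&. 0x07) bs  -- 4-byte sequence
      | otherwise             = Nothing                  -- stray or invalid byte

    -- Consume n continuation bytes and fold their low 6 bits into acc.
    multi n acc bs =
      let (cont, rest) = splitAt n bs
      in if length cont == n && all isCont cont
           then
             let cp = foldl (\a c -> (a `shiftL` 6) .|. (c .&. 0x3F)) acc cont
             in if cp <= 0x10FFFF then (chr cp :) <$> go rest else Nothing
           else Nothing

    -- Continuation bytes have the bit pattern 10xxxxxx.
    isCont c = c .&. 0xC0 == 0x80
```

For example, decodeUtf8Bytes (map chr [0xCE, 0xBB]) gives Just "\955", a
single lambda character, whereas the undecoded String has length 2.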

If you think we really need to provide access to the raw argv bytes,
we could add another platform-independent function that does that.

Thanks,
Yitz

