FilePath as ADT

Axel Simon A.Simon at kent.ac.uk
Fri Feb 3 07:06:28 EST 2006


I think this is not yet discussed on the wiki:

From the recent post to the Haskell list:

-------- Forwarded Message --------
From: Krasimir Angelov <kr.angelov at gmail.com>
To: haskell <haskell at haskell.org>
Subject: [Haskell] System.FilePath survey

[..]
    * Will you be happy with a library that represents the file path
as String? The opposite is to use ADT for it. The disadvantage is that
with the current IO library we should convert from ADT to String and
back again each time when we have to do any IO. The ADT may have
advantages for the internal library implementation.
[..]
Cheers,
  Krasimir
_______________________________________________

The chance to change the libraries is a chance to get the FilePath type
right. We already had a thorough discussion on this, and I think there
was a silent consensus that FilePath must be an abstract data type for
Unicode reasons. The discussion back then revolved around the following:

The task: Remove all files in a directory recursively.

The problem: If the current encoding is UTF-8, file names created under
a different locale can contain illegal UTF-8 sequences and are therefore
not representable as a FilePath, which is a Unicode string. Even if the
resulting Unicode string is not 'error ".."', it is impossible to call
'delete' on that file name, since fromUTF8 . toUTF8 cannot be the
identity function if the UTF-8 byte sequence is illegal.
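To make the round-trip problem concrete, here is a minimal sketch using
only base. The names decodeLenient and encode are hypothetical stand-ins
for the fromUTF8/toUTF8 conversions mentioned above, and the choice to
substitute U+FFFD for every malformed byte is an assumption of this
sketch, not something the existing libraries are known to do. It decodes
a Latin-1 file name as if it were UTF-8 and re-encodes it:

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Assumption: malformed input is replaced by U+FFFD instead of raising an error.
replacement :: Char
replacement = '\xFFFD'

-- Hypothetical lenient UTF-8 decoder: every byte that does not start a
-- well-formed sequence becomes one replacement character.
decodeLenient :: [Word8] -> String
decodeLenient [] = []
decodeLenient (b:bs)
  | b < 0x80               = chr (fromIntegral b) : decodeLenient bs
  | b >= 0xC2 && b <= 0xDF = multi 1 (fromIntegral b .&. 0x1F) 0x80
  | b >= 0xE0 && b <= 0xEF = multi 2 (fromIntegral b .&. 0x0F) 0x800
  | b >= 0xF0 && b <= 0xF4 = multi 3 (fromIntegral b .&. 0x07) 0x10000
  | otherwise              = replacement : decodeLenient bs
  where
    -- n continuation bytes are expected; minCp rejects overlong forms.
    multi n lead minCp =
      let (cont, rest) = splitAt n bs
          isCont c = c .&. 0xC0 == 0x80
          cp = foldl (\acc c -> (acc `shiftL` 6) .|. (fromIntegral c .&. 0x3F))
                     lead cont :: Int
          ok = length cont == n && all isCont cont
               && cp >= minCp && cp <= 0x10FFFF
               && not (cp >= 0xD800 && cp <= 0xDFFF)
      in if ok then chr cp : decodeLenient rest
               else replacement : decodeLenient bs

-- Plain UTF-8 encoder for Haskell Strings.
encode :: String -> [Word8]
encode = concatMap (enc . ord)
  where
    enc c
      | c < 0x80    = [fromIntegral c]
      | c < 0x800   = [0xC0 .|. hi 6, cont 0]
      | c < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
      | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
      where
        hi s   = fromIntegral (c `shiftR` s)
        cont s = 0x80 .|. (fromIntegral (c `shiftR` s) .&. 0x3F)

main :: IO ()
main = do
  -- "café" as written by a Latin-1 locale: 0xE9 alone is not valid UTF-8.
  let latin1Name   = [0x63, 0x61, 0x66, 0xE9] :: [Word8]
      roundTripped = encode (decodeLenient latin1Name)
  print latin1Name
  print roundTripped
  print (roundTripped == latin1Name)  -- False: the original bytes are lost
```

The re-encoded bytes name a different file (or none at all), so a
'delete' issued with the String cannot reach the original entry.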

The solution: FilePath must be an abstract data type that is a sequence
of bytes. Programmers should convert these to Unicode only for
displaying them and otherwise treat them as opaque entities. For
invalid UTF-8 sequences, the corresponding String will have an "invalid
Unicode character" marker substituted.
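A rough sketch of what such an abstract type could look like, again
using only base. Everything here is hypothetical: RawPath, fromBytes,
toBytes, displayPath, the FileOps record and removeRecursive are names
invented for illustration, and the IO primitives are passed in as
parameters because no byte-based IO library is assumed to exist. To
keep the example short, displayPath is deliberately simplified: it
shows ASCII bytes as-is and substitutes U+FFFD for everything else,
where a real library would decode valid UTF-8 properly.

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical abstract path; a real module would not export the constructor.
newtype RawPath = RawPath [Word8] deriving (Eq, Ord)

fromBytes :: [Word8] -> RawPath
fromBytes = RawPath

toBytes :: RawPath -> [Word8]
toBytes (RawPath bs) = bs

-- Join a directory and an entry name with '/' (0x2F), never leaving bytes.
(</>) :: RawPath -> RawPath -> RawPath
RawPath d </> RawPath n = RawPath (d ++ [0x2F] ++ n)

-- For display only, and lossy. Simplification: ASCII is shown as-is,
-- every other byte becomes U+FFFD.
displayPath :: RawPath -> String
displayPath (RawPath bs) = map toChar bs
  where
    toChar b
      | b < 0x80  = chr (fromIntegral b)
      | otherwise = '\xFFFD'

-- Hypothetical primitives that a byte-based IO library would have to offer.
data FileOps m = FileOps
  { listDir     :: RawPath -> m [RawPath]  -- entry names, without the directory
  , isDir       :: RawPath -> m Bool
  , removeFile' :: RawPath -> m ()
  , removeDir   :: RawPath -> m ()
  }

-- The task from above: remove a directory recursively. Paths are only
-- passed around and joined, never converted to String.
removeRecursive :: Monad m => FileOps m -> RawPath -> m ()
removeRecursive ops dir = do
  names <- listDir ops dir
  mapM_ removeEntry names
  removeDir ops dir
  where
    removeEntry name = do
      let path = dir </> name
      d <- isDir ops path
      if d then removeRecursive ops path else removeFile' ops path

main :: IO ()
main = do
  let name = fromBytes [0x63, 0x61, 0x66, 0xE9]      -- "café" in Latin-1
      path = fromBytes [0x74, 0x6D, 0x70] </> name   -- "tmp" joined with name
  print (displayPath path)  -- lossy, for humans only
  print (toBytes path)      -- exact bytes, still usable for IO
```

Since removeRecursive never inspects the bytes, a file name that is not
valid UTF-8 is handled like any other.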

The solution of representing a file name abstractly is also used by the
Java libraries. Are there any objections to changing this?

Axel.





More information about the Haskell-prime mailing list