Filename handling

Graham Klyne GK at ninebynine.org
Tue Aug 17 10:49:02 EDT 2004


At 14:04 17/08/04 +0100, Simon Marlow wrote:
>On 17 August 2004 12:44, Graham Klyne wrote:
>
> > Anyway, I'd like to see the common library functions provide at least
> > minimal capabilities to allow multi-platform applications to do the
> > right thing when handling filenames.  I'm pretty agnostic about what they
> > actually look like, but as an example I've found the Python and/or Java
> > libraries to be pretty usable in this respect.
>
>I think there's general agreement that this would be a good thing, but
>discussion never seems to reach a conclusion.  Anyone like to whip up a
>concrete proposal?

This may be a bit radical, but I'll float it anyway:

     pathToUri :: String -> String
     -- convert filename to a file: URI according to local system conventions

     uriToPath :: String -> String
     -- convert a file: URI to filename according to local system conventions

Hmmm... to preserve referential transparency, I suppose that should be:

     pathToUri :: String -> IO String
     uriToPath :: String -> IO String

The rationale here is that these two functions can be used to get any 
filename on any system into a form with well-defined syntax and properties 
and back again, allowing the other filename processing requirements 
(splitting apart, putting together, relative path evaluations, etc.) to be 
performed with the common form.
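
To make the idea concrete, here is a minimal sketch of the two conversions. It is not a definitive implementation: it assumes POSIX-style absolute paths with '/' separators and ASCII-only names, and it signals anything else with an IO error. The helper names encodePath and decodePath are hypothetical, introduced only for this illustration.

```haskell
import Data.Char (chr, digitToInt, intToDigit, isAlphaNum, isAscii,
                  isHexDigit, ord, toUpper)
import Data.List (stripPrefix)

-- Percent-encode a path (hypothetical helper).  Assumption: ASCII letters,
-- digits, "-._~" and the '/' separator are kept as-is; any other ASCII
-- character is escaped; non-ASCII input is rejected with Nothing.
encodePath :: String -> Maybe String
encodePath = fmap concat . mapM enc
  where
    enc c
      | isAscii c && (isAlphaNum c || c `elem` "-._~/") = Just [c]
      | isAscii c = Just ['%', hex (ord c `div` 16), hex (ord c `mod` 16)]
      | otherwise = Nothing
    hex = toUpper . intToDigit

-- Undo percent-encoding (hypothetical helper); a malformed escape
-- gives Nothing.
decodePath :: String -> Maybe String
decodePath ('%' : a : b : rest)
  | isHexDigit a && isHexDigit b =
      (chr (16 * digitToInt a + digitToInt b) :) <$> decodePath rest
decodePath ('%' : _)  = Nothing
decodePath (c : rest) = (c :) <$> decodePath rest
decodePath []         = Just []

-- Convert an absolute POSIX-style filename to a file: URI.
pathToUri :: String -> IO String
pathToUri path =
  case encodePath path of
    Just enc | take 1 path == "/" -> return ("file://" ++ enc)
    _ -> ioError (userError ("pathToUri: unsupported path " ++ show path))

-- Convert a file: URI with an empty host part back to a filename.
uriToPath :: String -> IO String
uriToPath uri =
  case stripPrefix "file://" uri >>= decodePath of
    Just path@('/' : _) -> return path
    _ -> ioError (userError ("uriToPath: unsupported URI " ++ show uri))
```

Under these assumptions, pathToUri "/home/user/my file.txt" yields "file:///home/user/my%20file.txt", and uriToPath maps that URI back to the original name. A real version would have to pick the conversion rules according to the host system, which is the reason for the IO type.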

Of course, this doesn't deal with operations that need to actually access 
the file system (directory scanning, etc.), but many of these seem pretty 
well catered for in any case (cf. Directory library functions).

...

Failing this, I'd say that Isaac's module [1] has some pretty reasonable 
functions.  I'd pick out:
   splitLastComp :: FilePath -> (FilePath,FilePath)
   isAbsolute :: FilePath -> Bool
   splitExt :: FilePath -> (FilePath, String)
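
For illustration only, the three functions might behave roughly as sketched below. The signatures are the ones listed above; the bodies are guesses at plausible behaviour and are not taken from Isaac's module. The sketch assumes '/' as the only separator, returns the extension without its leading dot, and does not treat a leading dot (as in ".bashrc") as an extension separator.

```haskell
-- Split a path at its last '/': (directory part, last component).
splitLastComp :: FilePath -> (FilePath, FilePath)
splitLastComp path =
  case break (== '/') (reverse path) of
    (revLast, [])         -> ("", reverse revLast)   -- no separator at all
    (revLast, [_])        -> ("/", reverse revLast)  -- directly under the root
    (revLast, _ : revDir) -> (reverse revDir, reverse revLast)

-- A path is absolute if it starts at the root (POSIX-style assumption).
isAbsolute :: FilePath -> Bool
isAbsolute ('/' : _) = True
isAbsolute _         = False

-- Split off the extension of the last component: (path without extension,
-- extension without the dot).  Dots in directory names are ignored.
splitExt :: FilePath -> (FilePath, String)
splitExt path =
  case break (== '.') (reverse name) of
    (revExt, _ : revBase@(_ : _)) -> (prefix ++ reverse revBase, reverse revExt)
    _                             -> (path, "")
  where
    (revName, revDir) = break (== '/') (reverse path)
    name   = reverse revName
    prefix = reverse revDir   -- directory part, including its trailing '/'
```

With these definitions, splitLastComp "/usr/lib/libc.so" gives ("/usr/lib", "libc.so") and splitExt "/a/b/c.txt" gives ("/a/b/c", "txt").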

The next function would be useful, but I'd be reluctant to include it until 
we're confident of having consistent regex support on all platforms:
   matchPath :: String        -- ^RegExp
             -> IO [FilePath] -- ^IO because it must look to see what exists

An alternative, avoiding regex dependence, might be:

   matchPath :: (FilePath -> Bool) -> IO [FilePath]
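
A minimal sketch of the predicate-based variant follows. Since the signature carries no directory argument, the sketch assumes that only the current directory is scanned (non-recursively) and that "." and ".." are never reported; both choices are assumptions, as is the example helper haskellSources. It uses getDirectoryContents from System.Directory.

```haskell
import Data.List (isSuffixOf)
import System.Directory (getDirectoryContents)

-- Return the entries of the current directory accepted by the predicate.
-- Assumption: no recursion into subdirectories; "." and ".." are skipped.
matchPath :: (FilePath -> Bool) -> IO [FilePath]
matchPath accept = do
  names <- getDirectoryContents "."
  return [n | n <- names, n `notElem` [".", ".."], accept n]

-- Example use (hypothetical helper): all Haskell source files found here.
haskellSources :: IO [FilePath]
haskellSources = matchPath (".hs" `isSuffixOf`)
```

The predicate form keeps regex support out of the library; a caller who has a regex package available can still pass its match function as the predicate.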

And a very important (IMO) function that I don't see in Isaac's module 
would be something like:

   relativeTo :: FilePath -> FilePath -> FilePath

In my URI processing code, I've also added a complementary function:

   relativeFrom :: FilePath -> FilePath -> FilePath

which returns a relative path such that:

   (path `relativeFrom` base) `relativeTo` base == path

noting that the result of relativeFrom is not always uniquely 
determined.  Maybe it's better to leave this out.
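
As a rough sketch of how the pair could fit together, the code below assumes absolute POSIX-style paths with '/' separators, treats base as a directory, and assumes the paths given to relativeFrom contain no "." or ".." segments, doubled slashes, or trailing slashes. The helpers splitSegs, joinAbs and normalise are hypothetical names for this illustration, not part of any existing module.

```haskell
import Data.List (intercalate)

-- Split a path into its non-empty segments (hypothetical helper).
splitSegs :: FilePath -> [String]
splitSegs = filter (not . null) . go
  where
    go s = case break (== '/') s of
      (seg, [])       -> [seg]
      (seg, _ : rest) -> seg : go rest

-- Build an absolute path from segments (hypothetical helper).
joinAbs :: [String] -> FilePath
joinAbs segs = '/' : intercalate "/" segs

-- Resolve "." and ".." segments; ".." at the root stays at the root.
normalise :: [String] -> [String]
normalise = reverse . foldl step []
  where
    step acc "."        = acc
    step (_ : acc) ".." = acc
    step [] ".."        = []
    step acc seg        = seg : acc

-- Resolve rel against the directory base; an absolute rel wins outright.
relativeTo :: FilePath -> FilePath -> FilePath
relativeTo rel base
  | take 1 rel == "/" = joinAbs (normalise (splitSegs rel))
  | otherwise         = joinAbs (normalise (splitSegs base ++ splitSegs rel))

-- One possible relative form of path as seen from base: climb out of the
-- part of base that is not shared, then descend into the rest of path.
-- Other answers (for example path itself) would satisfy the law as well.
relativeFrom :: FilePath -> FilePath -> FilePath
relativeFrom path base
  | null segs = "."
  | otherwise = intercalate "/" segs
  where
    (bs, ps) = dropCommon (splitSegs base) (splitSegs path)
    segs     = map (const "..") bs ++ ps
    dropCommon (x : xs) (y : ys) | x == y = dropCommon xs ys
    dropCommon xs ys = (xs, ys)
```

For example, "/a/b/c/d.txt" `relativeFrom` "/a/x/y" is "../../b/c/d.txt", and resolving that result with relativeTo against "/a/x/y" gives back "/a/b/c/d.txt".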

I think that a function like:

   isDirectory :: FilePath -> IO Bool

may also be needed when performing directory scanning operations.
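
One way this might look, assuming the existing doesDirectoryExist from System.Directory is an acceptable basis: isDirectory becomes a thin wrapper, and a scan can use it to pick out subdirectories. The helper subdirectories is hypothetical and assumes '/' as the separator.

```haskell
import Control.Monad (filterM)
import System.Directory (doesDirectoryExist, getDirectoryContents)

-- True if the path names an existing directory.
isDirectory :: FilePath -> IO Bool
isDirectory = doesDirectoryExist

-- Immediate subdirectories of dir (hypothetical helper), skipping the
-- "." and ".." entries.
subdirectories :: FilePath -> IO [FilePath]
subdirectories dir = do
  names <- getDirectoryContents dir
  let paths = [dir ++ "/" ++ n | n <- names, n `notElem` [".", ".."]]
  filterM isDirectory paths
```

The result is in IO because the answer depends on the state of the file system at the time of the call.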

[1] http://www.syntaxpolice.org/darcs_repos/OS.Path/Path.hs

...

Some related questions to consider:

- should we take seriously the point I make above about using IO so that 
referential transparency is rigorously preserved?  If so, all of the above 
functions should return IO values, as the result may vary depending on the 
environment in which the program runs.

- do we care about legacy operating systems like VAX/VMS?  (that would 
require version number support, and doesn't work well with interfaces that 
assume a single path separator character).

- how does the interface work with forthcoming systems like Microsoft's 
Longhorn?  I hear that the directory tree concept is being replaced by file 
"attributes".  Which leads me to think of...

- how does the interface work with WebDAV, which builds a file-system-like 
interface over HTTP, and adds property lists to the resources identified?

#g


------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact


