[Haskell-cafe] RE: ANN: System.FilePath 0.9

Tue Jul 18 15:51:48 EDT 2006

On Sun, Jul 16, 2006 at 08:43:31PM -0500, Brian Smith wrote:
> I kind of expect that a Haskell library for file paths will use the
> type system to ensure some useful properties about the paths.

It's a nice idea, but I claim that it's a rat's nest.  Path semantics,
when you look hard at them, are too vague, confusing, and subtle to
encode usefully in types.  And I think there's a better way to do what
you're asking for.

> For example, when writing security-conscious software I often want to be
> able to distinguish between absolute, ascending (paths with leading
> "../" and similar items), and descending paths (paths that contain no
> "../").

My suggestion is to specify your own syntax and semantics for the input
to your software, which I assume is coming in over the network or some
other trust boundary.  By resisting the temptation to piggy-back on
native paths, you control what your paths mean, instead of leaving
it to the system.  Further, for many applications, users don't really
care that their paths map to filesystem paths.  If you keep them
separate, you can even change your storage from the filesystem to
something else.

In your case, either define your path syntax not to allow "..", or
define your own simple normalization rules, and apply them before you
combine a user-supplied path with a native system path.  E.g., the
user gives "../a/../b/c/d/..", and you either reject it or turn it into
"/b/c", and then append it to your root, e.g. "/root/a/b", to get
"/root/a/b/b/c".  Of course, you might place further restrictions on
your paths, like only allowing letters, etc.
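
A minimal sketch of that normalization idea, under some assumptions
that are not in the original proposal: the application-level syntax
uses "/" as its separator, the names splitSegments, validSegment,
normalize, and underRoot are made up for illustration, and the
character whitelist is just one possible policy.  The strict flag
selects between the two behaviours mentioned above (reject a ".." that
climbs above the root, or silently drop it).

```haskell
import Data.Char (isAlphaNum)
import Data.List (intercalate)

-- | Split an application-level path on '/'. Empty segments (from leading,
-- trailing, or doubled slashes) are kept here and dropped by 'normalize'.
splitSegments :: String -> [String]
splitSegments s = case break (== '/') s of
  (seg, [])       -> [seg]
  (seg, _ : rest) -> seg : splitSegments rest

-- | Hypothetical policy: letters, digits, '-', '_' and '.' only.
validSegment :: String -> Bool
validSegment = all (\c -> isAlphaNum c || c `elem` "-_.")

-- | Normalize a user-supplied path into a list of segments.
-- ".." removes the previous segment. A ".." with nothing left to remove
-- is rejected when 'strict' is True and ignored otherwise.
-- Any segment that fails 'validSegment' rejects the whole path.
normalize :: Bool -> String -> Maybe [String]
normalize strict = go [] . splitSegments
  where
    go acc [] = Just (reverse acc)
    go acc (seg : rest)
      | seg == "" || seg == "." = go acc rest
      | seg == ".." = case acc of
          (_ : up) -> go up rest
          []
            | strict    -> Nothing
            | otherwise -> go [] rest
      | validSegment seg = go (seg : acc) rest
      | otherwise = Nothing

-- | Only after normalization is the result joined to a native root.
underRoot :: FilePath -> [String] -> FilePath
underRoot root segs = intercalate "/" (root : segs)
```

With these definitions, normalize False "../a/../b/c/d/.." gives
Just ["b","c"], normalize True on the same input gives Nothing, and
underRoot "/root/a/b" ["b","c"] gives "/root/a/b/b/c".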

> I want to make sure a filename is valid. For example, "prn" and "con"
> are not valid path elements for normal files on Windows, certain
> characters are not allowed in filenames (depending on platform), some
> platforms may require paths to be escaped in different ways. I see
> there is an "isValid" function and even a (magical) "makeValid"
> function, but they do not report what was wrong with the filename in
> the first place. Furthermore, it isn't clear from the documentation how
> these functions determine whether a filename is valid.

This is another rat's nest, so I suggest that it be dealt with
separately from the basic filepath module.  The notion of "valid" is
squishy:  It depends entirely on what you intend to do with the path.
There are many cases to consider: on Linux, which characters are allowed
depends on the filesystem type, and "special" files may appear anywhere
and have any name--the only way to test for them is by doing IO.  Oh,
and who knows if the situation might change between when you call
isValid and when you actually perform the operation?
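
One way to sidestep both problems is to skip the up-front check and
simply attempt the operation, treating failure as the answer.  A small
sketch, assuming the operation is reading a file; tryReadFile is a
name invented here, while try, IOException, and readFile are the
standard ones from base:

```haskell
import Control.Exception (IOException, try)

-- | Attempt the read and report failure as a value, instead of asking
-- "is this path valid?" beforehand. Errors from opening the file are
-- caught here; because readFile is lazy, errors that occur later while
-- the contents are being consumed are not.
tryReadFile :: FilePath -> IO (Either IOException String)
tryReadFile path = try (readFile path)
```

The caller then pattern-matches on Left or Right, so there is no
window between a validity check and the use of the path.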

> IMO, safety is the most important issue regarding file paths and it is
> not addressed in this library as far as I can see. Writing code to
> handle these issues is tedious, error-prone, and boring
> despite being critical. It isn't the kind of code that you want to just
> download off of some guy's webpage. Basically, it is exactly the type
> of thing that belongs in a standard library.

My approach is not to take a filepath and say, "is it safe?" (which
can't be meaningfully answered in general anyway), but to construct
paths in a careful manner that is safe for your application.

> In this library proposal, there are a bunch of "xxxDrive" functions
> that many Unix-oriented programmers are sure to ignore because they are
> no-ops on Unixy systems. Even on Windows, they are not very useful:

I strongly agree with this.  The temptation in path modules seems to be
to throw in everything you can think of (without specifying any of it
precisely), just in case someone finds it useful.  I posted a more
minimalist module a while back:

http://haskell.org/pipermail/libraries/2006-February/004890.html

I tried to export a minimal set of operations that seem to me sufficient
for everything not very platform-specific (though I am interested in
counterexamples):

    currentPath :: p
    prefixes :: p -> [(p, ChildName)]
    addChild :: Monad m => p -> ChildName -> m p
    append :: Monad m => p -> p -> m p
    getChildren :: p -> IO [p]
    canonicalize :: p -> IO p

See the referenced message for explanation.
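
To give a feel for the pure part of that list, here is a toy model.
It is not the module from the referenced message: RelPath, the smart
constructor mkChildName, and the use of Maybe in place of an arbitrary
monad are all assumptions made for this sketch, and the behaviour shown
for prefixes (each parent paired with the child name that extends it)
is only a guess at the intended meaning.

```haskell
-- Hypothetical toy model: a relative path is a list of child names.
newtype ChildName = ChildName String deriving (Eq, Show)
newtype RelPath = RelPath [ChildName] deriving (Eq, Show)

-- | Smart constructor: a child name is never empty, ".", "..", or
-- anything containing a separator.
mkChildName :: String -> Maybe ChildName
mkChildName s
  | null s || s `elem` [".", ".."] || '/' `elem` s = Nothing
  | otherwise = Just (ChildName s)

-- | The path that names the current location.
currentPath :: RelPath
currentPath = RelPath []

-- | Extend a path by one child. It cannot fail in this toy model;
-- Maybe is kept only to mirror the monadic shape of the signature.
addChild :: RelPath -> ChildName -> Maybe RelPath
addChild (RelPath cs) c = Just (RelPath (cs ++ [c]))

-- | Join two relative paths (again, total in this toy model).
append :: RelPath -> RelPath -> Maybe RelPath
append (RelPath a) (RelPath b) = Just (RelPath (a ++ b))

-- | Guessed meaning: every (parent, child) step along the path,
-- shortest parent first.
prefixes :: RelPath -> [(RelPath, ChildName)]
prefixes (RelPath cs) =
  [ (RelPath (take i cs), c) | (i, c) <- zip [0 ..] cs ]
```

Because ".." can never become a ChildName, a path built only through
mkChildName and addChild stays below its starting point by
construction, which is the "construct paths carefully" approach rather
than the "is it safe?" one.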

Andrew