Preventing/handling space leaks

Fergus Henderson fjh at cs.mu.OZ.AU
Mon Dec 8 15:14:44 EST 2003


On 06-Dec-2003, Sven Panne <Sven.Panne at aedion.de> wrote:
> Henk-Jan.van.Tuyl wrote:
> >[...] it looks to me, that the problem of space leaks is a very good reason
> >to not use Haskell for commercial applications. Java, for example, does 
> >not have this problem.
> 
> I just can't resist when I read PR statements like this (SUN's marketing 
> department has *really* done a good job):

Yes, it is just plain wrong to say that Java never has space leaks.

> Granted, Haskell has problems with space leaks
> from time to time, and it is especially easy for
> beginners to stumble over them,

The problem with Haskell is not so much that beginners
sometimes stumble on space leaks -- the problem is that
even seasoned experts have great difficulty analyzing
the space usage of very simple Haskell programs.
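
As an illustration, here is a rough sketch of the kind of "very simple"
program I have in mind.  The function names are made up for this example,
and the actual space behaviour depends on the compiler and on optimisation
settings.  The first definition looks harmless, but because xs is traversed
twice, the whole list typically has to stay in memory until the second
traversal finishes.  The second definition makes a single pass with a
strict accumulator.

```haskell
import Data.List (foldl')

-- Hypothetical example: traverses xs twice (once for sum, once for
-- length), so the list cannot be collected as it is consumed.
meanTwoPass :: [Double] -> Double
meanTwoPass xs = sum xs / fromIntegral (length xs)

-- Hypothetical example: one pass, with both accumulator components
-- forced at every step so that no chain of thunks builds up.
meanOnePass :: [Double] -> Double
meanOnePass xs = s / fromIntegral n
  where
    (s, n) = foldl' step (0, 0 :: Int) xs
    step (a, c) x =
      let a' = a + x
          c' = c + 1
      in a' `seq` c' `seq` (a', c')
```

Both functions compute the same result for a non-empty list; nothing in
the source text of the first one hints that its space usage is linear in
the length of the input.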

> But for large realistic programs most programming languages converge
> and you basically have the choice of what kind of space leak you want:

If you are suggesting that space leaks are equally frequent and
equally easy to diagnose and avoid in these different languages,
then I would disagree.

>    * C: Missing calls to free(), etc.

For C, leaks are common because it is easy to forget to insert
calls to free(), and avoiding or fixing them can be difficult
because figuring out when it is safe to call free() requires
a non-local analysis to figure out when data is no longer used.

>    * C++: All of C's leaks + lots of hard to find space leaks due to 
>    incorrectly handled exceptions + ...

C++ does suffer from many of the same problems as C.  But in C++, it is
much easier to automate techniques like reference counting, which can
also be used in C but are much more cumbersome and error-prone when
done by hand.

>    * Haskell: Functions which are not strict enough, thunks which are never 
>    evaluated but hold large data structures, etc.

Yes.  The difficulty with Haskell is that everything is lazy by default.
There is no explicit syntax required to define a lazy function, so
if you want to figure out which functions are too lazy, you may need
to examine *every* function.  There is often no explicit syntax for
creating a thunk, so again it is difficult to spot which parts of the
program are doing this.
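
A minimal sketch of how invisible this can be (the names sumLazy and
sumStrict are made up for this example, and the exact behaviour depends on
the compiler and optimisation level): the two definitions below differ by
a single character.  Without optimisation, the first typically builds a
long chain of unevaluated (+) thunks before reducing any of them, while
the second forces the accumulator at each step.

```haskell
import Data.List (foldl')

-- Lazy left fold: the accumulator is not forced, so a chain of
-- suspended additions can grow with the length of the list.
sumLazy :: [Integer] -> Integer
sumLazy = foldl (+) 0

-- Strict left fold: foldl' evaluates the accumulator to weak head
-- normal form at each step, so no thunk chain accumulates.
sumStrict :: [Integer] -> Integer
sumStrict = foldl' (+) 0
```

Both return the same value, e.g. 5000050000 for [1..100000]; only the
space usage differs, and nothing in the first definition marks the place
where the thunks are created.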

>    * Java: Listeners which are not de-registered, containers which are not 
>    "nulled" after removal of an element,

In general, Java can suffer from space leaks if variables that will not be
used again are not "nulled" out.  However, avoiding and/or fixing such problems
is a lot easier in Java than in C, since determining whether a variable
can safely be nulled out only requires analysing uses of that variable,
rather than analysing uses of the object to which that variable points.
This is a _much_ easier kind of analysis.  IMHO it is probably also
much easier than trying to analyze space usage of Haskell programs.

>    badly written cache-like data structures,  etc.

Those can be a problem in any language.

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.

