[Haskell] Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

Wed May 6 23:20:01 EDT 2009

Jason Dagit wrote:
> On Wed, May 6, 2009 at 3:54 PM, Anton van Straaten
> <anton at appsolutions.com> wrote:
>> FWIW, I have an internal HAppS application that's been running continuously
>> since November last year, used daily, with stable memory usage.
> 
> Do you have advice about the way you wrote your app?  Things you
> knowingly did to avoid space leaks?  Maybe a blog about your HAppS
> app?

The app is written for a client under NDA, so a blog about it would have 
to be annoyingly vague.  But I don't think there's much mystery about 
why it doesn't leak:

The app does simulations.  Each simulation uses at least about 10MB of 
memory, more depending on parameters.  Typically a few thousand 
simulations are run successively, and the results are aggregated and 
analyzed.  The computation itself is purely functional - it takes some 
input parameters and produces results.  The results are written to a 
file.  Since each run of a set of simulations is essentially 
independent, there's not much risk of space leaks persisting across runs.
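
As a rough illustration of that structure (not the actual app, which is under NDA), here is a minimal sketch.  The names `Params`, `Summary`, `simulate`, `step`, `runAll` and the file name `results.txt` are all made up for the example, and the "simulation" is a trivial stand-in.  The point is only that each run is a pure function of its parameters, and that the aggregate carried between runs is kept strict, so nothing from an earlier run stays reachable.

```haskell
module Main where

import Data.List (foldl')

-- Hypothetical input type; the real app's parameters are not described.
newtype Params = Params Int

-- Running summary carried between simulations. The strict fields mean no
-- unevaluated thunks referring to earlier runs can accumulate inside it.
data Summary = Summary { runs :: !Int, total :: !Double }
  deriving (Show, Eq)

-- Stand-in for a pure simulation: input parameters in, a result out.
simulate :: Params -> Double
simulate (Params n) = sum [fromIntegral k | k <- [1 .. n]]

-- Fold one result into the summary; the result itself is then garbage.
step :: Summary -> Params -> Summary
step (Summary r t) p = Summary (r + 1) (t + simulate p)

-- Run the simulations successively with a strict left fold.
runAll :: [Params] -> Summary
runAll = foldl' step (Summary 0 0)

main :: IO ()
main = do
  let summary = runAll (map Params [1 .. 1000])
  -- Write the aggregated result to a (hypothetical) output file.
  writeFile "results.txt" (show summary ++ "\n")
  print summary
```

With this shape, the only state that survives from one simulation to the next is a `Summary` holding two evaluated numbers.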

No doubt the potential for encountering space leaks goes up as one 
writes less pure code, persists more things in memory, and depends on more 
libraries.  My main point in mentioning my app is that "long-running" 
isn't really the issue - that's just a way of saying that an app has 
space leaks that are small enough not to be noticed until it's stressed.

>> In my experience, it's not hard to write stable long-running code
>> in good implementations of languages like Haskell, Scheme, Common Lisp, or
>> Java.
> 
> There are certainly cases where no automatic garbage collector could
> know when it is safe to collect certain things.  

If there are bugs in the user's program, sure - but that still doesn't 
make it "hard" to write applications that don't leak, given a decent GC.
On the contrary, I'd say it's very easy, in the great majority of cases.

> A quick google search
> for java space leaks turned up this article:
> http://www.ibm.com/developerworks/java/library/j-leaks/
> 
> I think wikipedia uses the term "logical leak" for the type of space
> leak I'm thinking of.  The garbage collector thinks you care about an
> object but in fact, you want it to be freed.  Yes, it's because of a
> bug, but these are bugs that tend to be subtle and tedious.

The example given in the IBM article is quite typical, but isn't subtle 
at all - it was simply an object being added to a table and never being 
removed.  You can often find such bugs quite easily by searching the 
source tree, without touching a debugging tool.  It's also possible to 
prevent them quite easily, with good coding practices (e.g. centralize 
uses of long-lived tables) and some simple code auditing practices.
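
One way to picture "centralize uses of long-lived tables", sketched in Haskell rather than Java: route every insertion through a single function that also guarantees the removal.  This is an assumption-laden example, not code from the IBM article; `Table`, `withEntry` and `tableSize` are hypothetical names, and the table is just an `IORef` holding a `Data.Map`.

```haskell
module Main where

import Control.Exception (bracket_)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as Map

-- Hypothetical long-lived table, keyed by an Int id.
type Table = IORef (Map.Map Int String)

-- The only function that inserts into the table. Insertion and removal
-- are paired by bracket_, so an entry cannot outlive the action that
-- uses it, even if that action throws an exception.
withEntry :: Table -> Int -> String -> IO a -> IO a
withEntry table key val =
  bracket_
    (atomicModifyIORef' table (\m -> (Map.insert key val m, ())))
    (atomicModifyIORef' table (\m -> (Map.delete key m, ())))

-- Number of entries currently held in the table.
tableSize :: Table -> IO Int
tableSize table = Map.size <$> readIORef table

main :: IO ()
main = do
  table <- newIORef Map.empty
  inside <- withEntry table 1 "session" (tableSize table)
  after <- tableSize table
  print (inside, after)  -- prints (1,0)
```

Auditing then reduces to checking that nothing outside `withEntry` writes to the table.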

If you're dealing with code that's complex enough to involve the kinds 
of non-trivial mutually dependent references that you need in order to 
encounter truly subtle instances of these bugs, the increased difficulty 
of memory management comes with the territory, i.e. it's harder because 
the application is harder.

> The ambiguity is me thinking of relative cost of finding/fixing these
> bugs.  

To put this back into context, I was objecting to your having extended 
the space leak worrying to all GC'd languages.  I'm saying that it isn't 
hard, using most decent language implementations, to avoid space leaks.
For trivial cases such as the IBM example, it should be no harder in 
Haskell, either - possibly easier, since use of things like mutable 
tables is more controlled, and may be rarer.

However, I'm not denying that Haskell does, in theory, introduce a new 
class of space leak dangers: being pure and lazy brings its own set of 
risks.  But on that front, I was disturbed by the 
vagueness of the claims about long-running apps.  I haven't seen any 
solid justification for scaring people off about writing long-running 
apps in Haskell.  If there is such a justification, it needs to be more 
clearly identified.
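
For concreteness, the best-known instance of a laziness-related leak is a lazy accumulator.  The sketch below is a generic textbook illustration, not something taken from any of the apps discussed in this thread; `sumLazy` and `sumStrict` are made-up names.  Note that GHC's optimizer can often rescue the lazy version when compiling with -O, so the difference shows up most reliably in unoptimized builds or in GHCi.

```haskell
module Main where

import Data.List (foldl')

-- Lazy left fold: the accumulator is not forced along the way, so a
-- chain of unevaluated (+) thunks, one per element, can build up before
-- any addition happens (unless strictness analysis removes it).
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- Strict left fold: the accumulator is forced at each step, so the
-- running total stays a single evaluated Int.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

main :: IO ()
main = do
  -- Both compute the same value; they differ only in space behaviour.
  print (sumLazy [1 .. 1000000])
  print (sumStrict [1 .. 1000000])
```

The fix here is a one-word change, which is part of why this class of leak, once identified, tends to be cheap to repair.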

> Testing for correctness is something we tend to automate very
> well.

How do you automate testing for performance under load?  Space usage is 
a similar kind of dynamic issue, in general.

> So then, at some point we must have a bag of tricks for dealing with
> these space leaks.  I want to talk about those tricks.  I'm not
> talking about bugs in a specific program, but instead about techniques
> and styles that are known to work well in practice.

OK.  That's a bit different from FFT's original contention, "hard to 
contain a long-running Haskell application in a finite amount of 
memory."  For my own part, I'm at least as non-strict as Haskell, and 
that bag of tricks, for me, is a thunk that hasn't yet been forced.

Anton