[Haskell-cafe] space leak hints?

Sat Jul 4 01:37:08 EDT 2009

Good evening, all, I wonder if I could tap your collective wisdom
regarding space leaks? I've been messing about with haskeem, my little
scheme interpreter, and I decided to see if I could make it run
reasonably space-efficiently. So far... no.

Here's what I tried: I wrote a tiny scheme program to compute Collatz
sequences for successive numbers, starting from 1 and incrementing
forever (well, in principle). Because real scheme implementations are
fully tail-call-optimized, this'll run in constant memory; I checked
that with mzscheme, and it does indeed work. With my little
interpreter, that's not the case: memory usage grows continually,
although apparently less-than-linearly. I've built the interpreter
with the profiling stuff described on the wiki and in RWH Ch 25 turned
on and have made a few runs with that; I stuck the postscript plot
that's the result of one of those runs onto my web site at
http://www.korgwal.com/haskeem/run2.ps.

The full source to the interpreter is a little large to paste into
this message; it's available on my web site, and also on hackage. But
according to the plot, there appear to be three main memory allocation
issues, and they seem to all be related, if I'm reading stuff
correctly. The core of the interpreter is a function, evalLisp, which
evaluates scheme forms. There are of course a fair number of different
forms, but the largest generic usage is "evaluate each of a list of
forms, returning the value of the last of them as the overall result".
In order to express that in a couple of different places, and to
accomodate the possibility of an empty list, I have a really tiny
function lastOrNil which just calls "last" on a (haskell) list,
checking for the possibility of an empty list, and returning a haskeem
LispVal object:

> lastOrNil = liftM lON
>   where lON [] = List []
>        lON l = last l

(sorry, proportional fonting may be throwing this off).

It's this function which is directly implicated in two of the top
three memory pigs, and nearly directly in the third one as well. If I
could eliminate the memory growth in these three cost centers, I would
already capture over 90% of the growth in this benchmark, which would
make me very happy indeed. But I don't really understand what is going
on here. It seems entirely plausible, indeed likely, that the list
which I'm traversing there is not fully evaluated. So I've tried
adding 'seq' to this function. Uhhh... from memory, I dumped it after
it didn't work, I had

> lastOrNil = liftM lON
>   where lON [] = List []
>               lON (l:[]) = l
>               lON (l:ls) = seq l (lON ls)

(again, proportional fonting might mess me up here.)

As I said, I dumped this after it made no difference whatsoever. I
also tried bang-patterns in a couple of places, strictness annotation
of my basic LispVal types... nothing. It all made exactly no
difference, as far as I could tell. I've tried a couple of google
searches, but haven't come up with anything better than what I've
already described. So I'm a stumped chump!

I'd be grateful for any suggestions... thanks in advance!

Uwe