Persistent data

Simon Peyton-Jones simonpj@microsoft.com
Tue, 4 Mar 2003 09:26:08 -0000


GHC has a multi-generational garbage collector.  If you have enough
physical memory on your machine so that the GC isn't thrashing trying to
find the 100 free bytes that remain, then you should find the database
migrates to the oldest generation and stays there.  If you use +RTS
-Sstderr you'll see info about when GC happens, and which generation.
There should be lots of young-gen collections for each old-gen one. You
can increase the number of generations with a command-line flag to the
runtime system (see the user manual).
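
As a rough sketch of how the ratio of young-generation to old-generation
collections could be checked from inside a program, the example below
builds a map that only ever grows and then prints the runtime's own GC
counters. It assumes a reasonably recent GHC whose base library provides
GHC.Stats (getRTSStats, gcs, major_gcs). The file name GcStats.hs and
the helpers buildDb and majorFraction are hypothetical names chosen for
illustration, and the map is only a stand-in for a real database.

```haskell
-- GcStats.hs: hypothetical stand-in for a "database" that only grows.
-- Build: ghc -O2 -rtsopts GcStats.hs
-- Run:   ./GcStats +RTS -T -Sstderr -G3 -RTS
module Main (main) where

import Data.List (foldl')
import qualified Data.Map.Strict as Map
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Insert n records into a map that is never shrunk.
buildDb :: Int -> Map.Map Int Int
buildDb n = foldl' (\m k -> Map.insert k (k * 2) m) Map.empty [1 .. n]

-- Fraction of all collections that were major (oldest-generation) ones.
majorFraction :: RTSStats -> Double
majorFraction s
  | gcs s == 0 = 0
  | otherwise  = fromIntegral (major_gcs s) / fromIntegral (gcs s)

main :: IO ()
main = do
  let db = buildDb 1000000
  putStrLn ("records: " ++ show (Map.size db))
  -- The counters are only collected when the program runs with +RTS -T
  -- (or another statistics flag such as -S).
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "run with +RTS -T to collect GC statistics"
    else do
      s <- getRTSStats
      putStrLn ("total GCs:      " ++ show (gcs s))
      putStrLn ("major GCs:      " ++ show (major_gcs s))
      putStrLn ("major fraction: " ++ show (majorFraction s))
      putStrLn ("GC cpu ns:      " ++ show (gc_cpu_ns s))
      putStrLn ("mutator cpu ns: " ++ show (mutator_cpu_ns s))
```

Here -Sstderr prints one line per collection, including its generation,
and -G3 asks for three generations instead of the default two. If the
long-lived data is settling into the oldest generation, the major
fraction should be small. If it stays large, the machine is probably
short of memory for the live data, which is the thrashing case described
above.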

Simon

| -----Original Message-----
| From: Sengan.Baring-Gould@nsc.com [mailto:Sengan.Baring-Gould@nsc.com]
| Sent: 03 March 2003 18:38
| To: haskell@haskell.org
| Subject: Persistent data
|
| Is there some way to reduce the cost of garbage collection over large
| persistent datastructures without resorting to escaping to C to malloc
| memory outside the heap?
|
| The program I'm working on is part database, which cannot discard
| information. The net result is that I see figures like 82.9% of the
| time taken by garbage collection. The heap profile looks like a
| charging capacitor: a linear increase (as expected) which is slowly
| dilated as time increases by the garbage collector thrashing memory.
|
| When I worked on the LOLITA natural language processor we solved the
| problem by moving a lot of the data out to C++, so that the heap only
| contains things soon to be freed. I know generational garbage
| collection is supposed to help, but it doesn't seem to. Is there a
| pure Haskell solution to this problem?
|
| Sengan
|
| _______________________________________________
| Haskell mailing list
| Haskell@haskell.org
| http://www.haskell.org/mailman/listinfo/haskell