[Haskell-cafe] XML (HXML) parsing :: GHC 6.8.3 space leak from 2000

Ketil Malde ketil at malde.org
Thu Sep 18 04:15:18 EDT 2008


Lev Walkin <vlm at lionet.info> writes:

> Recently I had to process some multi-megabyte XML files.

Join the club!  FWIW, I ended up using tagsoup.

> -- %%% There is apparently a space leak here, but I can't find it.
> -- %%% Update 28 Feb 2000: There is a leak, but it's fixed
> -- %%% by a well-known GC implementation technique.

I couldn't get this to work either.  In particular, I think the GC
trick should let this definition run without leaking:

   breaks p = groupBy (const (not.p))

But instead I implemented it as:

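   import Control.Parallel (par)  -- from the parallel package

   -- Split a list into chunks; every element after the first that
   -- satisfies p starts a new chunk.  `par` sparks evaluation of rest.
   breaks :: (a -> Bool) -> [a] -> [[a]]
   breaks p (x:xs) = let first = x : takeWhile (not.p) xs
                         rest  = dropWhile (not.p) xs
                     in  rest `par` first : if null rest then [] else breaks p rest
   breaks _ []     = []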

With -smp (the older spelling of -threaded), this doesn't leak.  It's
kind of annoying to have to rely on -smp in a library, since a library
cannot control how applications get linked, but I've found no other
solution.

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants

