Make lines stricter to fix a space leak

Daniel Fischer daniel.is.fischer at web.de
Mon Sep 27 16:18:40 EDT 2010


On Monday 27 September 2010 20:36:36, Daniel Fischer wrote:
> Pity. Would've been nice to fix the leak without changing the semantics

I think we can. :D

lines :: String -> [String]
lines "" = []
lines s  = uncurry (:) $ case break (== '\n') s of
                           (l,s') -> (l, case s' of
                                           [] -> []
                                           _:s'' -> lines s'')

seems to do it. Although the Core looks as though it might leak,

Rec {
NewLines.lines :: GHC.Base.String -> [GHC.Base.String]
GblId
[Arity 1
 NoCafRefs
 Str: DmdType S]
NewLines.lines =
  \ (ds_diq :: [GHC.Types.Char]) ->
    case ds_diq of wild_B1 {
      [] -> GHC.Types.[] @ GHC.Base.String;
      : ipv_siy ipv1_siz ->
        let {
          p_sjz :: ([GHC.Types.Char], [[GHC.Types.Char]])
          LclId
          [Str: DmdType]
          p_sjz =
            case GHC.List.$wbreak @ GHC.Types.Char lvl_rjC wild_B1
            of _ { (# ww1_aj2, ww2_aj3 #) ->
            (ww1_aj2,
             case ww2_aj3 of _ {
               [] -> GHC.Types.[] @ [GHC.Types.Char];
               : _ s''_adi -> NewLines.lines s''_adi
             })
            } } in
        GHC.Types.:
          @ [GHC.Types.Char]
          (case p_sjz of _ { (x_aiM, _) -> x_aiM })
          (case p_sjz of _ { (_, y_aiS) -> y_aiS })  
         -- this looks like the leak Wadler described,
         -- but it behaves differently
    }
end Rec }

, it ran in constant memory in my tests. Allocation behaviour is very close 
to that of the originally proposed version (as far as I can tell identical 
to the current implementation of lines), reported maximum residency usually 
identical, sometimes a few dozen bytes difference in either direction.
Speed seems also indistinguishable (both are faster than the current on 
long lines due to less GC, no measurable difference for short lines).

The strictness properties are identical to those of the current 
implementation.

Would anybody care to check that the above runs in constant memory not only 
on my computer?

If that's verified, I would propose the above strictness-preserving rather 
than my original proposal.

Cheers,
Daniel


More information about the Libraries mailing list