From: "Edward Z. Yang" <<a href="mailto:ezyang@MIT.EDU">ezyang@MIT.EDU</a>><br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<br>
Hello Aleksandar,<br>
<br>
It is possible that the iteratees library is space leaking; I recall some<br>
recent discussion to this effect. Your example seems simple enough that<br>
you might recompile with a version of iteratees that has -auto-all enabled.<br>
Unfortunately, it's not really a safe bet to assume your libraries are<br>
leak free, and if you've pinpointed it down to a single line, and there<br>
doesn't seem a way to squash the leak, I'd bet it's the library's fault.<br>
<br>
Edward<br></blockquote><div><br>I can't reproduce the space leak here. I tried Aleksandar's original code, my iteratee version, the Ngrams version posted by Johan Tibell, and a lazy bytestring version.<br><br>my iteratee version (only f' has changed from Aleksandar's code):<br>
<br>f' :: Monad m => I.Iteratee S.ByteString m Wordcounts<br>f' = I.joinI $ (enumLinesBS I.><> I.filter (not . S.null)) $ I.foldl' (\t s -> T.insertWith (+) s 1 t) T.empty<br><br>my lazy bytestring version:<br>
<br>> import Data.Iteratee.Char<br>> import Data.List (foldl')<br>> import Data.Char (toLower)<br>> <br>> import Data.Ord (comparing)<br>> import Data.List (sortBy)<br>> import System.Environment (getArgs)<br>
> import qualified Data.ByteString.Lazy.Char8 as L<br>
> import qualified Data.HashMap.Strict as T<br>><br>> f'2 = foldl' (\t s -> T.insertWith (+) s 1 t) T.empty . filter (not . L.null) . L.lines<br>><br>> main2 :: IO ()<br>> main2 = getArgs >>= L.readFile . head >>= print . T.keys . f'2<br>
<br></div></div>None of these leak space for me (all compiled with ghc-7.0.3 -O2). Performance was pretty comparable for every version, although Aleksandar's original did seem to have a very small edge.<br><br>As someone already pointed out, keep in mind that this will use a lot of memory anyway unless the words repeat a lot, since the map keeps one entry per distinct word.<br>
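<br>To make the memory point concrete, here is a minimal sketch (not code from this thread). It assumes Data.Map.Strict from containers instead of the HashMap used above, so that it builds with only the packages that ship with GHC; countWords is a name made up for the illustration. Two inputs of the same length end up with very different map sizes:<br>

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as M

-- Hypothetical helper: count occurrences of each word with a strict left fold.
-- The resulting map holds exactly one entry per distinct word.
countWords :: [String] -> M.Map String Int
countWords = foldl' (\m w -> M.insertWith (+) w 1 m) M.empty

main :: IO ()
main = do
  -- 3000 words, but only three distinct ones.
  let repetitive = countWords (concat (replicate 1000 ["a", "b", "c"]))
  -- 3000 words, all distinct.
      distinct   = countWords (map show [1 .. 3000 :: Int])
  print (M.size repetitive)  -- 3
  print (M.size distinct)    -- 3000
```

The size of the map follows the number of distinct keys, not the length of the input, so a corpus with few repeated words stays large in memory even when nothing is leaking.<br>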
<br>I'd be happy to help track down a space leak in iteratee, but for now I'm not seeing one.<br><br>Best,<br>John Lato<br>