Daniel,<br><br>Thank you so much for helping me out with this issue!<br><br>Thanks to all the other haskell-cafe members for their answers, too!<br><br>As a newbie, I am not able to understand why zip and map would cause a problem...<br>
<br>Is there a link I could read that would help me understand why zip and map<br>created a leak in this case? What are some function compositions that should be <br>avoided when doing lazy I/O?<br><br>Regards,<br><br>
Arnoldo <br><br><br><div class="gmail_quote">On Thu, Mar 11, 2010 at 11:46 PM, Daniel Fischer <span dir="ltr"><<a href="mailto:daniel.is.fischer@web.de">daniel.is.fischer@web.de</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
On Thursday, 11 March 2010, 00:24:28, Daniel Fischer wrote:<br>
<div class="im">> Hmm, offhand, I don't see why that isn't strict enough.<br>
<br>
</div>Turns out, mapM_ was a red herring. The villain was (zip and map).<br>
I must confess, I don't know why it sort-of worked without the mapM_,<br>
though. "Sort-of", because that version also hung on to more memory than necessary;<br>
the space leak was just smaller than with the mapM_.<br>
<br>
A very small change that eliminates the space leak is:<br>
<div class="im"><br>
readFasta :: Int -> [Char] -> [Window]<br>
readFasta windowSize sequence =<br>
    -- get the header<br>
</div>    let (header,rest) = span (/= '\n') sequence<br>
        chr = parseChromosome header<br>
        go i (w:ws) = Window w chr i : go (i+1) ws<br>
        go _ [] = []<br>
    in go 0 $ slideWindow windowSize $ filter (/= '\n') rest<br>
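For experimenting with this version on its own, here is a minimal self-contained sketch. Window, parseChromosome and slideWindow are not defined in this message, so the definitions below are stand-in assumptions (a three-field record, a parser that picks the first run of digits out of the header, and a sliding window built from tails); only readFasta itself follows the code above, with the input parameter renamed to fasta.<br>

```haskell
import Data.Char (isDigit)
import Data.List (tails)

-- Hypothetical stand-in for the Window type used in the thread.
data Window = Window
  { windowSeq :: String  -- the bases in this window
  , windowChr :: Int     -- chromosome number taken from the header
  , windowPos :: Int     -- zero-based start position of the window
  } deriving (Eq, Show)

-- Hypothetical stand-in: take the first run of digits in the header,
-- or 0 if the header contains no digits.
parseChromosome :: String -> Int
parseChromosome header =
  case takeWhile isDigit (dropWhile (not . isDigit) header) of
    [] -> 0
    ds -> read ds

-- Hypothetical stand-in: every full-length window of size n, in order.
slideWindow :: Int -> String -> [String]
slideWindow n = takeWhile ((== n) . length) . map (take n) . tails

-- Same structure as the version above: the position counter is threaded
-- through a local worker instead of being attached with zip and map.
readFasta :: Int -> String -> [Window]
readFasta windowSize fasta =
  let (header, rest) = span (/= '\n') fasta
      chr = parseChromosome header
      go i (w:ws) = Window w chr i : go (i + 1) ws
      go _ []     = []
  in go 0 $ slideWindow windowSize $ filter (/= '\n') rest
```

With these stand-ins, readFasta 3 ">chr1\nACG\nTAC\n" yields four windows (ACG, CGT, GTA, TAC) at positions 0 to 3, all tagged with chromosome 1.<br>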
<br>
You can improve performance by eliminating slideWindow and the intermediate<br>
Window list (merging fastaExtractor and readFasta):<br>
<br>
{-# LANGUAGE BangPatterns #-}<br>
<br>
readFasta2 :: (String -> Bool) -> Int -> String -> String<br>
readFasta2 test windowSize sequence =<br>
    let (header,rest) = span (/= '\n') sequence<br>
        chr = parseChromosome header<br>
        schr = show chr<br>
        go !i st@(_:tl)<br>
            | test w = w ++ '\t' : schr ++ '\t' : show i ++ '\n' : go (i+1) tl<br>
            | otherwise = go (i+1) tl<br>
            where<br>
                w = take windowSize st<br>
        go _ [] = []<br>
    in go 0 (filter (/= '\n') rest)<br>
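A self-contained variant for trying the merged function, as a sketch only. It assumes that readFasta2 takes the input string as a third argument and returns the output text, and that the newline filter is applied to rest before go runs. parseChromosome is not shown in this message, so the version here is a hypothetical stand-in that reads the first run of digits from the header.<br>

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.Char (isDigit)

-- Hypothetical stand-in: take the first run of digits in the header,
-- or 0 if the header contains no digits.
parseChromosome :: String -> Int
parseChromosome header =
  case takeWhile isDigit (dropWhile (not . isDigit) header) of
    [] -> 0
    ds -> read ds

-- Emits one tab-separated line (window, chromosome, position) for every
-- window accepted by the predicate. No intermediate list of windows is built.
readFasta2 :: (String -> Bool) -> Int -> String -> String
readFasta2 test windowSize fasta =
  let (header, rest) = span (/= '\n') fasta
      schr = show (parseChromosome header)
      go :: Int -> String -> String
      go !i st@(_:tl)
        | test w    = w ++ '\t' : schr ++ '\t' : show i ++ '\n' : go (i + 1) tl
        | otherwise = go (i + 1) tl
        where
          w = take windowSize st  -- may be shorter than windowSize near the end
      go _ [] = []
  in go 0 (filter (/= '\n') rest)
```

Since go visits every suffix of the sequence, the last few windows are shorter than windowSize, so a predicate may want to check the length as well. One possible use, assuming the input arrives on stdin, is interact (readFasta2 (== "CGT") 3).<br>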
<br>
</blockquote></div><br>