<div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">From: Brandon Moore &lt;<a href="mailto:brandon_m_moore@yahoo.com">brandon_m_moore@yahoo.com</a>&gt;<br>
<br>
<br>
I was worried data sharing might mean your keys<br>
retain entire 64K chunks of the input. However, it<br>
seems enumLines depends on the StringLike ByteString<br>
instance, which just converts to and from String.<br>
That can't be efficient, but I suppose it avoids excessive sharing.</blockquote><div><br></div><div>That's true for 'enumLines'; however, the OP is using 'enumLinesBS', which operates on bytestrings directly.</div>
<div><br></div><div>Data sharing certainly could be an issue here. I tried applying Data.ByteString.copy to each key before inserting it into the map, but that used more memory, not less. I don't have an explanation for this; it's not what I would expect.</div>
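<div><br></div><div>For reference, a minimal sketch of the copy-before-insert idea. This is not the OP's code and does not use iteratee: it assumes the whole input is already in one strict ByteString, and countLines is a hypothetical name. B.lines returns slices that share the input buffer, while B.copy gives each key a freshly allocated buffer of its own.</div>

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as M
import Data.List (foldl')

-- Count how often each line occurs (hypothetical helper, for illustration).
-- Each line from B.lines is a slice of the input buffer; B.copy detaches it
-- so the map key no longer references that buffer.
countLines :: B.ByteString -> M.Map B.ByteString Int
countLines = foldl' bump M.empty . B.lines
  where
    bump m l = M.insertWith (+) (B.copy l) 1 m

main :: IO ()
main = print (M.toList (countLines (B.pack "a\nb\na\n")))
-- prints [("a",2),("b",1)]
```

<div>Note that this sketch copies every line, including lines whose key is already in the map.</div>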
<div><br></div><div>The other parameter that affects sharing is the chunk size. I got a much better memory profile when using a chunk size of 1024 instead of 65536.</div><div><br></div><div>Oddly enough, with the large chunk size I saw lower memory usage from Data.Map, but with the small chunk size Data.HashMap has a significant advantage.</div>
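<div><br></div><div>On the chunk size point, a back-of-envelope sketch of why it matters for sharing, assuming every key is a slice of a single chunk and is never copied. Each live key can then keep its whole chunk alive, so the retained input is bounded by the number of live keys times the chunk size. The function name and the key count of 1000 are hypothetical, chosen only for illustration; they are not measurements.</div>

```haskell
-- Upper bound on input bytes pinned by uncopied keys: each live key that is
-- a slice of a chunk can keep that entire chunk reachable.
pinnedUpperBound :: Int -> Int -> Int
pinnedUpperBound chunkSize liveKeys = chunkSize * liveKeys

main :: IO ()
main = do
  let keys = 1000  -- hypothetical number of live keys, illustration only
  print (pinnedUpperBound 65536 keys)  -- prints 65536000
  print (pinnedUpperBound 1024 keys)   -- prints 1024000
```

<div>This is only a worst-case bound; it says nothing about the difference between Data.Map and Data.HashMap.</div>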
<div><br></div><div>John Lato</div></div>