[Haskell-beginners] Diagnosing : Large memory usage + low CPU

Hugo Ferreira hmf at inescporto.pt
Wed Nov 30 15:23:53 CET 2011


Hello,

On 11/29/2011 10:57 PM, Stephen Tetley wrote:
> Hi Hugo
>
> What is a POSTags and how big do you expect it to be?
>
>

type Token = String
type Tag = String

type NGramTag = (Token, Tag, Tag)

type POSTags = Z.Zipper NGramTag


> Generally I'd recommend you first try to calculate the size of your
> data rather than try to strictify things, see Johan Tibell's very
> useful posts:
>
>
> http://blog.johantibell.com/2011/06/memory-footprints-of-some-common-data.html
> http://blog.johantibell.com/2011/06/computing-size-of-hashmap.html
>

Going by the size of the data as Strings, I expect a maximum of about 50 MB.
Profiling (after a painful 80 minutes) shows:

total alloc = 20,350,382,592 bytes

Way too much.
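
For reference, a rough sketch of how the live size of one entry could be
estimated by hand. It assumes a 64-bit GHC, the rule of thumb of 5 machine
words per character of a String (as I understand the posts linked above),
and 4 words for a 3-tuple. The helper names stringBytes and nGramTagBytes
are made up for this illustration, and any sharing done by the runtime is
ignored, so the result is only a rough upper bound for a single entry.

```haskell
type Token = String
type Tag   = String
type NGramTag = (Token, Tag, Tag)

-- Assumption: 64-bit machine, so one word is 8 bytes.
wordBytes :: Int
wordBytes = 8

-- Assumption: each character of a String costs 5 words
-- (3 for the cons cell, 2 for the boxed Char).
wordsPerChar :: Int
wordsPerChar = 5

-- Estimated heap bytes for one fully evaluated String.
stringBytes :: String -> Int
stringBytes s = wordsPerChar * wordBytes * length s

-- Estimated heap bytes for one NGramTag: the tuple itself
-- (header plus three pointers = 4 words) plus its three Strings.
nGramTagBytes :: NGramTag -> Int
nGramTagBytes (tok, t1, t2) =
  4 * wordBytes + stringBytes tok + stringBytes t1 + stringBytes t2
```

Under these assumptions nGramTagBytes ("the", "DT", "NN") comes to 312
bytes. Note also that "total alloc" counts every byte allocated over the
whole run, so it is not directly comparable to the expected live size; the
figure to compare against is heap residency, as reported for example by
+RTS -s or by a heap profile (+RTS -hc).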

> Once you know the size of your data - you can decide if it is too big
> to comfortably work with in memory. If it is too big you need to make
> sure you are streaming[*] it rather than forcing it into memory.
>
> If POSTags is large, I'd be very concerned about the top line of
> updateState - reversing lists (or sorting them) simply doesn't play
> well with streaming.
>

The zipper does quite a bit of reversing and appending.
I also need to reverse lists to retain the order of the
characters (text). I also do sorting but I have eliminated this
in the tests.
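
To make clear where the reversing comes from, here is a minimal sketch of
a two-list zipper. This is an assumption about how Z.Zipper is laid out,
not the library's actual source; the names Zipper, fromList, left, right
and toList are chosen for illustration only.

```haskell
-- Hypothetical two-list zipper. The first list holds the elements to
-- the left of the cursor in reverse order (nearest first); the second
-- list holds the element under the cursor and everything to its right.
data Zipper a = Zipper [a] [a]

-- Place the cursor at the start of the list.
fromList :: [a] -> Zipper a
fromList = Zipper []

-- Move the cursor one step right; a no-op at the end.
right :: Zipper a -> Zipper a
right (Zipper ls (x:rs)) = Zipper (x:ls) rs
right z                  = z

-- Move the cursor one step left; a no-op at the start.
left :: Zipper a -> Zipper a
left (Zipper (x:ls) rs) = Zipper ls (x:rs)
left z                  = z

-- Rebuild the plain list: the left part has to be reversed and then
-- appended to the right part.
toList :: Zipper a -> [a]
toList (Zipper ls rs) = reverse ls ++ rs
```

In a layout like this, moving the cursor is cheap, but every conversion
back to a list pays for a reverse and an append over the left part.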

So my question: how can one "force" the reversing and appending?
Anyone?
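
One possible approach, sketched with base only: build the reversed list
with a strict left fold, and walk the result of an append before handing
it on. The names reverse', forceList and append' are made up for this
sketch. Elements are only evaluated to weak head normal form, so for a
tuple such as NGramTag the Strings inside are not forced; a deep version
could use force from Control.DeepSeq (deepseq package) instead.

```haskell
import Data.List (foldl')

-- Reverse a list with a strict left fold. Each element is evaluated
-- to weak head normal form before it is consed onto the accumulator,
-- so the result does not hold on to unevaluated elements.
reverse' :: [a] -> [a]
reverse' = foldl' (\acc x -> x `seq` (x : acc)) []

-- Walk the whole list, evaluating every cons cell and every element
-- to weak head normal form, then return the same list. The walk only
-- happens when the result of forceList is itself demanded.
forceList :: [a] -> [a]
forceList xs = go xs `seq` xs
  where
    go []     = ()
    go (y:ys) = y `seq` go ys

-- Append two lists and force the spine of the result, so that no
-- suspended (++) is left behind once the result has been demanded.
append' :: [a] -> [a] -> [a]
append' xs ys = forceList (xs ++ ys)
```

Whether this helps depends on where the unevaluated computations actually
pile up; if they sit in the state carried between steps, that state needs
to be forced at each step as well (for instance with seq or a bang
pattern).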

TIA,
Hugo F.


>
> [*] Even in a lazy language like Haskell, streaming data isn't
> necessarily automatic.
>
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://www.haskell.org/mailman/listinfo/beginners
>



