<br><br><div><span class="gmail_quote">On 9/15/05, <b class="gmail_sendername">Simon Marlow</b> &lt;<a href="mailto:simonmar@microsoft.com">simonmar@microsoft.com</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

On 15 September 2005 01:04, Karl Grapone wrote:<br><br>&gt; I'm considering using haskell for a system that could, potentially,<br>&gt; need 5GB-10GB of live data.<br>&gt; My intention is to use GHC on Opteron boxes which will give me a max

<br>&gt; of 16GB-32GB of real ram.&nbsp;&nbsp;I gather that GHC is close to being ported<br>&gt; to amd64.<br>&gt;<br>&gt; Is it a realistic goal to operate with a heap size this large in GHC?<br>&gt; The great majority of this data will be very long tenured, so I'm

<br>&gt; hoping that it'll be possible to configure the GC to not need too much<br>&gt; peak memory during the collection phase.<br><br>It'll be a good stress test for the GC, at least.&nbsp;&nbsp;</blockquote><div><br>

Ouch!&nbsp; It scares me when people say that something will be a good stress test! :) <br>

</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">There are no reasons<br>in principle why you can't have a heap this big, but major collections

<br>are going to take a long time.&nbsp;&nbsp;It sounds like in your case most of this<br>data is effectively static, so in fact a major collection will be of<br>little use.</blockquote><div><br>

You're correct, the system will gradually accrue permanent data.&nbsp;

I foresee there being two distinct generations: a fairly constant-sized

short-lived one, and a gradually increasing set of immortal allocations.<br>

Response times will be critical, but hopefully the GC can be tweaked to a sweet spot.<br>

&nbsp;</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Generational collection tries to deal with this in an adaptive way:<br>long-lived data gets traversed less and less often as the program runs,

<br>as long as you have enough generations.&nbsp;&nbsp;But if the programmer really<br>knows that a large chunk of data is going to be live for a long time, it<br>would be interesting to see whether this information could be fed back

<br>in a way that the GC can take advantage of it.&nbsp;&nbsp;I'm sure there must be<br>existing techniques for this sort of thing.</blockquote><div>&nbsp;</div>Well,

I would naively say I only need two, maybe three, generations, as any

memory that has been around for more than a couple of hours

is definitely going to be around until system shutdown.&nbsp; But I'm

completely new to Haskell and I don't know if that holds for a lazy

language.&nbsp; My hope was that laziness would allow for better

response times but it certainly seems to muddy the GC waters.<br><div><br>

</div></div>I'd like to recommend Haskell, but I just don't know enough to be comfortable yet... more research, methinks.<br>

<br>

Thanks for your responses.<br>

Karl<br>