[Haskell-cafe] Estimating the time to garbage collect

Neil Davies semanticphilosopher at googlemail.com
Mon May 4 10:05:05 EDT 2009


Duncan

That was my first thought - but what I'm looking for is some
confirmation from those who know better that treating the GC as a
'statistical source' is a valid hypothesis. If the thing is 'random'
that's fine - if its timing is non-deterministic, that's not fine.

So, GC experts, are there any hints you can give me? Are there any
papers that cover this timing aspect? And are there any corner cases
that might make the statistical approach risky (or, at worst, invalid)?

I don't want to have to build a stochastic model of the GC, if I can  
help it!

Neil



On 4 May 2009, at 12:51, Duncan Coutts wrote:

> On Fri, 2009-05-01 at 09:14 +0100, Neil Davies wrote:
>> Hi
>>
>> With the discussion on threads and priority, and given that (in
>> Stats.c) there are lots of useful pieces of information that the run
>> time system is collecting, some of which is already visible (like the
>> total amount of memory mutated) and it is easy to make other measures
>> available - it has raised this question in my mind:
>>
>> Given that you have access to that information (the stuff that comes
>> out at the end of a run if you use +RTS -S) is it possible to estimate
>> the time a GC will take before asking for one?
>>
>> Ignoring, at least for the moment, all the issues of paging, processor
>> cache occupancy etc, what are the complexity drivers for the time to GC?
>>
>> I realise that it is going to depend on things like volume of data
>> mutated, count of objects mutated, what fraction of them are live etc
>> - and even if it turns out that these things are very program specific
>> then I have a follow-on question - what properties do you need from
>> your program to be able to construct a viable estimate of GC time from
>> a past history of such garbage collections?
>
> Would looking at statistics suffice? Treat it mostly as a black box.
> Measure all the info you can before and after each GC and then use
> statistical methods to look for correlations to see if any set of
> variables predicts GC time.
>
> Duncan
>
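
To make the black-box suggestion above concrete, here is a minimal sketch
of the kind of check it implies: take one candidate variable recorded
before each GC, pair it with the GC time observed afterwards, and see
whether the two are linearly related. Everything here is an assumption
for illustration: the names (Sample, pearson, fitLine, predict) are
hypothetical, the choice of a single linear predictor is mine, and the
sample data is synthetic, not measured from any real program.

```haskell
module Main where

-- One observation per collection: a candidate predictor recorded before
-- the GC (e.g. bytes allocated since the last GC) paired with the GC time
-- observed afterwards. Units are whatever the measurements use.
type Sample = (Double, Double)

mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- Sum of products of deviations from the means.
covSum :: [Double] -> [Double] -> Double
covSum xs ys = sum (zipWith (\x y -> (x - mx) * (y - my)) xs ys)
  where
    mx = mean xs
    my = mean ys

-- Pearson correlation coefficient between predictor and GC time.
-- Undefined (NaN) if either variable is constant across the samples.
pearson :: [Sample] -> Double
pearson samples = covSum xs ys / sqrt (covSum xs xs * covSum ys ys)
  where
    (xs, ys) = unzip samples

-- Ordinary least-squares fit: time ~ intercept + slope * predictor.
fitLine :: [Sample] -> (Double, Double)
fitLine samples = (intercept, slope)
  where
    (xs, ys)  = unzip samples
    slope     = covSum xs ys / covSum xs xs
    intercept = mean ys - slope * mean xs

-- Estimate the next GC time from a fitted line and a new predictor value.
predict :: (Double, Double) -> Double -> Double
predict (intercept, slope) x = intercept + slope * x

-- Synthetic, hand-made data lying exactly on a straight line; real
-- measurements would be noisy and might not be linear at all.
synthetic :: [Sample]
synthetic = [(1, 3), (2, 5), (3, 7), (4, 9)]

main :: IO ()
main = do
  putStrLn ("correlation: " ++ show (pearson synthetic))
  putStrLn ("fit:         " ++ show (fitLine synthetic))
  putStrLn ("prediction:  " ++ show (predict (fitLine synthetic) 5))
```

A correlation near 1 or -1 would suggest the variable is worth keeping
as a predictor; a value near 0 only rules out a linear relationship, so
it says nothing about the corner cases asked about above.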