Reducing the cost of garbage collecting small arrays

Simon Marlow marlowsd at gmail.com
Fri Jun 24 10:41:33 CEST 2011


On 23/06/2011 09:54, Johan Tibell wrote:
> On Thu, Jun 23, 2011 at 8:27 AM, Johan Tibell<johan.tibell at gmail.com>  wrote:
>>> Is 5 the optimal number of bits to slice off at a time (ie the best
>>> fanout)?  It sounds like node copy cost on insert argues for a
>>> slightly narrower fanout.  You'll be evacuating / scanning more words
>>> total, but new nodes may equate to less scanning overall (especially
>>> if this is running long enough to have some nodes get tenure).
>>
>> I'm experimenting with this. 6 is far too much, making inserts 4-5x
>> slower. 4 doesn't seem to improve things much (which is a bit
>> counter-intuitive given that 6 made things so much work), but I need
>> to experiment some more.
>
> After some more testing I can improve the performance of insert by 30%
> by reducing the array size from 32 to 16. GC still dominates though.

I will try to look at this when I have some time.  It could just be the 
cost of copying the extra data, though.

This is probably another one of those cases where generational GC isn't 
helping, long-lived data is getting copied multiple times as it 
progresses from the young generation to the old.  I want to look into 
some heuristics to try to detect when this is happening and vary the 
nursery size automatically.

Cheers,
	Simon



More information about the Glasgow-haskell-users mailing list