Data.* collections maintenance

Adrian Hey ahey at iee.org
Sat Oct 22 13:36:19 EDT 2005


On Saturday 22 Oct 2005 11:57 am, Adrian Hey wrote:
> 2- The Hedge algorithm seems to be a bit of a problem IMO. The actual
>    benchmark times for AVL union and Data.Set union aren't
>    significantly different for sets of Ints (AVL seemed marginally
>    faster, but nothing to get excited about).
>    But if you count the number of comparisons actually performed
>    Data.Set (Hedge algorithm) is doing about 3..5 times as many
>    comparisons. Presumably this would be reflected in measured
>    execution times in cases where set element comparison was less
>    trivial than comparing Ints.

I've just re-run some old benchmarking code to see what the latest
situation is re. this. There are 3 tests, all with random Ints:
 Test1 : union of two different sets, both of size 200000
 Test2 : union of two different sets, of size 200000 and 200
 Test3 : union of two different sets, of size 200 and 200000.

Relative execution times (in no particular units) are:
        Data.Tree.AVL   Data.Set
Test1:  0.51            0.66
Test2:  1.07e-3         2.35e-3
Test3:  1.07e-3         2.32e-3

So it seems if there's a big difference in set size the difference
in performance isn't so marginal after all. The above tests do not
include comparison counting overhead. If I count the number of
comparisons the same tests give..

        Data.Tree.AVL   Data.Set
Test1:  470668          1562230
Test2:    2449            11336
Test3:    2445            11238

Regards
--
Adrian Hey



More information about the Libraries mailing list