[Haskell-cafe] lookup tables & style guidelines

Adrian Hey ahey at iee.org
Sat Apr 26 07:41:40 EDT 2008


Jan-Willem Maessen wrote:
> 
> On Apr 24, 2008, at 11:33 AM, Adrian Hey wrote:
> 
>> Also, if you're likely to be using union/intersection a lot you should
>> know that Data.Map/Set are very slow for this because they use the
>> not efficient hedge algorithm :-)
> 
> OK, I'm going to bite here: What's the efficient algorithm for union on 
> balanced trees, given that hedge union was chosen as being more 
> efficient than naive alternatives (split and merge, repeated 
> insertion)?  My going hypothesis has been "hedge union is an inefficient 
> algorithm, except that it's better than all those other inefficient 
> algorithms".

Divide and conquer seems to be the most efficient, though not the
algorithm presented in the Adams paper. Hedge algorithm performs many
more comparisons than are needed, which is obviously bad if you don't
know how expensive those comparisons are going to be. IIRC it was
something like 4..5 times as many of 2 sets of a million or so random
Ints.

But even in favourable circumstances (tree elements are boxed Ints)
divide and conquer on AVL trees seemed much faster than Hedge on
Data.Set. Of course ideally we would want implementations of Hedge
for AVL and divide and conquer for Data.Set too, but I didn't feel
inclined to write them.

Regards
--
Adrian Hey



More information about the Haskell-Cafe mailing list