[Haskell-cafe] lookup tables & style guidelines

Sat Apr 26 17:03:37 EDT 2008

On Apr 26, 2008, at 7:41 AM, Adrian Hey wrote:

> Jan-Willem Maessen wrote:
>> On Apr 24, 2008, at 11:33 AM, Adrian Hey wrote:
>>> Also, if you're likely to be using union/intersection a lot you  
>>> should
>>> know that Data.Map/Set are very slow for this because they use the
>>> not efficient hedge algorithm :-)
>> OK, I'm going to bite here: What's the efficient algorithm for  
>> union on balanced trees, given that hedge union was chosen as being  
>> more efficient than naive alternatives (split and merge, repeated  
>> insertion)?  My going hypothesis has been "hedge union is an  
>> inefficient algorithm, except that it's better than all those other  
>> inefficient algorithms".
>
> Divide and conquer seems to be the most efficient, though not the
> algorithm presented in the Adams paper.

Just to clarify: divide and conquer splits one tree on the root value  
of the other (possibly avoiding enforcing the balance metric until  
after joining trees, though not obvious how / if that's useful)?  The  
definition of "divide and conquer" on trees without a fixed structure  
is rather unclear, which is why the question comes up in the first  
place.

> Hedge algorithm performs many
> more comparisons than are needed, which is obviously bad if you don't
> know how expensive those comparisons are going to be.

That makes sense.  I found myself having the same kinds of thoughts  
when reading Knuth's analyses of (eg) different binary search  
algorithms in TAOCP v.3; if comparison was the dominant cost, which  
algorithm looked best suddenly changed.

-Jan