[Haskell-cafe] Edison StandardSet has inefficient function implementation

Robert Dockins robdockins at fastmail.fm
Thu Aug 3 08:43:39 EDT 2006


On Aug 3, 2006, at 7:14 AM, Ahn, Ki Yung wrote:

> Edision does not yet have all the asymtotic description of its  
> functions.

Indeed.  This is a big job which requires a lot of time, attention to  
detail, and a pretty good working understanding of lazy amortized  
analysis.  Unfortunately I'm currently lacking in categories 1 and 3...

Many of the bounds can be obtained from the referenced papers.  I'll  
work on adding those as I am able.

> I got the Edision 1.2 source and looked into the code whether the
> container implementations meet the expected asymtotic bounds.

> In the module Data.Coll.StandardSet which packages Data.Set,
> some functions which can be O(log n) is implemented as O(n).
>
> Data.Set has a split and splitMember running in O(log n).
> With those functions we can implement OrdCollX operations like
> filterLT, filterLE, filterGT, filterGE,
> partitionLT_GE, partitionLE_GT, partitionLT_GT all in O(log n).
> However, only partitionLT_GT was O(log n) implemended using split.
> All other function implmentation just used its axiomaic description
> using CollX operations like filter and partition, which is O(n).

Thanks for pointing this out.  filterLT and filterGT can indeed be  
written in terms of "split"; I just missed that somehow.

For the others, however, splitMember won't suffice.  The problem here  
is that splitMember doesn't return the "equal" member from the  
original set, it just returns a Bool indicating whether the set  
contained and "equal" element.  As of now, Edison is supposed to  
guarantee that, for observable collections, you will always get back  
the identical object(s) that you put in.  This accounts for the fact  
that you may supply an Eq instance which is only a weak equivalance,  
that is, even if x == y returns true, x and y may be distinguishable  
in some way.

I am considering dropping this guarantee in a future version of  
Edison, because I think its value is highly dubious.  (Using a set or  
bag with a weak element equivalence is really just creating a finite  
map/relation.  If you need a finite map/relation you should just  
_use_ a finite map/relation!)  If I dropped the guarantee, I could  
indeed implement the other operations as you suggest.



> It needs to be fixed.
>
> P.S. I haven't checked the darcs version yet.
>
> -- 
> Ahn, Ki Yung


Thanks for your comments!



Rob Dockins

Speak softly and drive a Sherman tank.
Laugh hard; it's a long way to the bank.
           -- TMBG





More information about the Haskell-Cafe mailing list