select and selectSplit

Fri Feb 15 07:07:21 EST 2008

Johannes Waldmann wrote:
> Chris Kuklewicz wrote:
> 
>> As mentioned in another reply, the select treats the source list like an
>> unordered (perhaps multi) set.
> 
> See - the proposed function is in the "wrong" module;
> it uses a concrete data type (List)
> which normally represents an abstract Sequence type
> but here some abstract data type (Bag) is intended.
> (my "favourite Haskell code smell": list obsession :-)
> 
> Why then, are there so many of these functions?

Lists are the right thing for any small collection (scrabble hands only have 7 
tiles at most).  Sorted lists make great implementations of sets if you do not 
need the O(log n) member testing: union, intersection, difference are all 
trivial in O(n+m) time (and even findMin/deleteMin,minView; use Data.Seq instead 
of Data.List if you also need *max*).

A list of [(item,count)] is then great model of multisets; this is what my 
scrabble program uses.  I never need member testing -- I do need the 'select' 
operation, which got called pullTile in my code.  Data.Set|Map would have been 
O(log 7) instead of O(7) in building the "remaining hand", but this savings is 
swamped by the constant factor when n=7.

> My theory is - because "lists are there" (since LISP)
> and there is (in Haskell) no agreement on those abstract types,
> and in fact implementations (Data.Set) change rather frequently,
> so programmers are hesitant to commit to them.

> 
> But of course we'd only need an agreement on interfaces,
> then you'd not commit to one implementation. Again, Java did this right,
> they have interfaces for Set, Map. They even treat List as interface.

There are several third party libraries (e.g. Edison) that provide such for Haskell.

> Sure, they had fewer design choices as their type system has less freedom,
> just some special form of one-parameter type classes.

The original collections were one-parameter type classes that stored "Object".

Java then had to reinvent Collections using generics. And these are NOT simply 
one-parameter type classes.  The first type parameter is always the first 
argument "this" in Java, which is the collection type (e.g. HashSet).  The 
second type parameter is the type of the item being stored and retrieved (which 
acts sort-of-like a newtype--the storage is the same as "Object" to be 
compatible with old libraries).

> But then, their syntax encourages the use of interface types -
> they are as easy to write as a class type.

Haskell needs MultiParamterTypeClasses to do a great job with a collections 
interface (type of collection and type of item).  This is not Haskell98.  These 
need FunctionalDependencies to be convenient -- which were never fully and 
happily embraced, even as an extensions.  The Associated* extensions which are 
not yet stable will hopefully solve this.

> 
> I'm not actually advocating to introduce more syntactic sugar
> to Haskell (for existential types), I think proper (refactoring)
> tool support would help a great deal already.
> 
> Best regards, Johannes.