[Haskell-cafe] Re: (flawed?) benchmark : sort

Dan Doel dan.doel at gmail.com
Sun Mar 9 23:04:01 EDT 2008


On Sunday 09 March 2008, Krzysztof Skrzętnicki wrote:
> Ok, I did some search and found Data.Map, which can be used to implement
> pretty fast sorting:
>
> import qualified Data.Map as Map
>
> treeSort :: Ord a => [a] -> [a]
> treeSort = map (\(x,_) -> x ) . Map.toAscList . Map.fromList . map
> (\x->(x,()))
>
> In fact It is likely to behave like sort, with the exception that it is 23%
> faster. I did not hovever check the memory consumption. It works well on
> random, sorted and reverse-sorted inputs, and the speed difference is
> always about the same. I belive I could take Data.Map and get datatype
> isomorphic to specialized *Data.Map a ()* of it, so that treeSort will
> became Map.toAscList . Map.fromList. This may also bring some speedup.
>
> What do you think about this particular function?

Some thoughts:

1) To get your function specifically, you could just use Data.Set.Set a 
instead of Map a ().

2) What does it do with duplicate elements in the list? I expect it deletes 
them. To avoid this, you'd need to use something like fromListWith, keeping 
track of how many duplicates there are, and expanding at the end.

3) I imagine the time taken to get any output is always O(n*log n). Various 
lazy sorts can be written (and I'd guess the standard library sort is written 
this way, although I don't know for sure) such that 'head (sort l)' is O(n), 
or O(n + k*log n) for getting the first k elements. However, Map, being a 
balanced binary tree, doesn't (I think) have the right characteristics for 
this.

At the very least, you'll probably want to test with a function that doesn't 
delete duplicate elements. Something like this:

  treeSort = concatMap (\(x,n) -> replicate n x)
              . Map.toAscList . Map.fromListWith (+) . map (\x -> (x,1))

-- Dan


More information about the Haskell-Cafe mailing list