[Haskell] Re: Quicksearch vs. lazyness

Sat Apr 14 06:05:52 EDT 2007

apfelmus wrote:
> Steffen Mazanek wrote:
>> From my understanding for small k's lazy
>> evaluation already does the trick for the naive quicksort
>> algorithm (quicksort smaller ++ [x] ++ quicksort greater),
>> doesn't it? Is there a search algorithm that makes better
>> use of lazy evaluation out of the box?
> 
> No, as far as I thought about quicksort, it needs O(n log n) to produce
> the first element. But even the mergesort implementation has to be
> chosen carefully as I've seen many that still need O(n log n) just to
> return the smallest element.

Apparently, I did not think enough about quicksort :) as it is well
capable of delivering the minimum element in O(n) time. In fact, given
proper pivot choice,

  take k . qsort

is the optimal O(n + k log k) algorithm for returning the first k
smallest elements in sorted order! See also

  http://en.wikipedia.org/wiki/Selection_algorithm

Let's assume that the quicksort in

  qsort []     = []
  qsort (x:xs) = qsort ls ++ [x] ++ qsort rs
    where
    ls = filter (<  x) xs
    rs = filter (>= x) xs

always uses a good pivot x, i.e. that ls and rs have around n/2 elements
each. As long as this n/2 is greater than k, taking the first k elements
does not evaluate the second recursive call to quicksort. In other
words, it takes

   O(n)   -- for filtering xs during the initial call to qsort
 + O(n/2) -- same as above but with the first half of the list
 + O(n/4) -- filtering the half of the half
 + ...
 + O(k)
 ________
 < O(n + n/2 + n/4 + n/8 + ...) = O(n)

time until ls has fewer than k elements. When this becomes the case, the
argument list to the recursive call to qsort has a size of at most 2*k
and it takes at most O(k log k) time finish sorting it completely and
taking the first k elements of this. This gives a total of O(n + k log k).

Regards,
apfelmus