[Haskell-cafe] k-minima in Haskell

Nicolas Frisby nicolas.frisby at gmail.com
Thu Apr 12 03:14:49 EDT 2007


[sorry for the double, ajb]

Since there seemed to be a disconnect between the expectation and
the previous answers, I thought an alternative suggestion might help
out. This sort of thing (haha) usually isn't my cup o' tea, so please
point out any blunders.

RM, is this more like what you had in mind? It leans more towards an
iterative approach. If so, I encourage you to compare this to the
other solutions and try understand how those solutions rely upon
laziness. Stefan and Andrew, feel free to point out the drawbacks to
this approach that I'm probably overlooking.


kminima n l = map fst (foldr insert' (List.sort pre) suf)
    where (pre, suf) = (splitAt n . zip [0..]) l

-- I think this insertion sort could be
-- O(log k) with a better data structure.
insert' x [] = []
insert' x (y : ys) | snd x < snd y = x : y : dropLast ys'
                           | otherwise = y : insert' x ys
    where dropLast ys = take (length ys - 1) ys

We grab the first k elements and sort them, this list is our first
approximation to the k-minima. We then process the rest of the list,
checking if the current element is smaller than any of the minima of
the current approximation. If the current element is smaller, we
improve the current approximation by inserting the new element and
dropping the biggest (i.e. last) minimum from the minima accumulator.

The worst behavior is: sort(k) + (n-k) * k comparisons. This could be
improved (to: sort(k) + (n-k) * log k comparisons, I think) with a
better data structure for the minima approximation.

On 4/12/07, ajb at spamcop.net <ajb at spamcop.net> wrote:
> G'day all.
>
> Quoting raghu vardhan <mrvr84 at yahoo.co.in>:
>
> > So, any algorithm that sorts is no good (think of n as huge, and k small).
>
> The essence of all the answers that you've received is that it doesn't
> matter if k is small, sorting is still the most elegant answer in Haskell.
>
> The reason is that in Haskell, a function which sort function will produce the
> sorted
> list lazily.  If you don't demand the while list, you don't perform
> the whole sort.
>
> Some algorithms are better than others for minimising the amount of
> work if not all of the list is demanded, but ideally, producing the
> first k elements will take O(n log k + k) time.
>
> Cheers,
> Andrew Bromage
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>


More information about the Haskell-Cafe mailing list