[Haskell-cafe] Re: can there be (hash-table using) O(n) version of this (I think currently) n log n algo?

Heinrich Apfelmus apfelmus at quantentunnel.de
Sun Jul 19 04:26:14 EDT 2009

Thomas Hartman wrote:
> The code below is, I think, n log n, a few seconds on a million + element list.
> I wonder if it's possible to get this down to O(N) by using a
> hashtable implemementation, or other better data structure.
> -- findsums locates pairs of integers in a list
> that add up to a wanted sum.
> findsums :: [Int] -> Int -> S.Set (Int,Int)
> findsums xs wanted = snd . foldl' f (S.empty,S.empty) $ xs
>   where f (candidates,successes) next =
>      if  S.member (wanted-next) candidates
>        then (candidates, S.insert (next,wanted-next) successes)
>        else (S.insert next candidates,successes)

Remember that hash tables are actually O(key length) instead of O(1), so
I don't think you can break the  log n  for really large lists this
way since the key length increases as well (unless most elements are
equal anyway).

In any case,

I have this conjecture
On which I will lecture:
A Set of some things
Is sorting for kings.


    findsums goal = uncurry sweep . (id &&& reverse) . sort
        sweep []     _      = []
        sweep _      []     = []
        sweep (x:xs) (y:ys) = if x > y then [] else
            case compare (x+y) goal of
                LT -> sweep xs (y:ys)
                EQ -> (x,y) : sweep xs ys
                GT -> sweep (x:xs) ys

This algorithm needs a proof of correctness, though. And it's longer
that the Data.Set version, too.



More information about the Haskell-Cafe mailing list