Proposal #2629: Data.List: Replace nub; add nubOrd, nubInt, nubWith

Bart Massey bart at cs.pdx.edu
Sun Sep 28 03:49:07 EDT 2008


http://hackage.haskell.org/trac/ghc/ticket/2629

Everyone always complains about nub, but nobody ever does
anything about it except (map head . group . sort), which
isn't lazy and isn't always faster. :-)

I've implemented a new function nubWith that takes a
"stop list" as an argument and filters its target list
against the stop list.  I've then re-implemented nub and
implemented nubOrd and nubInt in terms of nubWith: the stop
list is a typeclass, so these implementations are trivial
and new implementations are easily added.  nubBy is left
alone, since there's nothing obvious to be done about it.
All of the nubs are still fully lazy.

Basic QuickCheck tests are provided, and pass.

Performance benchmarking code is provided.  The performance
of my nub implementation is quite comparable to that of
the standard one.  My nubOrd and nubInt implementations
are dramatically faster, since they use a Set and IntSet
respectively for the stop list.  In particular, they
are performant on long lists with long nubs, unlike the
basic nub.

My implementation is available via git at
  git://svcs.cs.pdx.edu/git/nub.git
or can be browsed at
  http://svcs.cs.pdx.edu/gitweb?p=nub.git;a=tree
and has a maybe-outdated tarball at
  http://svcs.cs.pdx.edu/haskell/nub.tar.gz
The Nub.hs file itself is attached to the proposal.
If the proposal is accepted, I will prepare a patch
against current GHC library top-of-tree, but for now it
seems easier for everyone to just look at the bits in
their current natural habitat.




More information about the Libraries mailing list