[Haskell-cafe] Why isn't there a cheaper "split-in-two" operation for Data.Set?

Ryan Newton rrnewton at gmail.com
Fri Sep 24 11:48:05 EDT 2010


Would there be anything wrong with a Data.Set simply chopping off half its
(balanced) tree and returning two approximately balanced partitions -- i.e.
Data.Set.split without the pivot argument?  Even though it can't quite be
constant-time (still need rebalancing) it could still be cheaper
than Data.Set.split, right?  In the context of Set.hs, I'm thinking along
the lines of something like the following:

cleave :: Set a -> (Set a, Set a)
cleave Tip = (Tip, Tip)
cleave (Bin _ x l r)
  | size l > size r = (l, insertMin x r)
  | otherwise       = (insertMax x l, r)

For one thing, this seems like the proposed function would be useful for
consuming a set in parallel, similar to Guy Steele's arguments in recent
talks (see Organizing Functional Code for Parallel Execution; or, foldl and
foldr Considered Slightly
Harmful<http://research.sun.com/projects/plrg/Publications/ICFPAugust2009Steele.pdf>).
 That is, one splits a tree and uses "par" to process the left and right
half in parallel, enabling

Alternatively, it would be simple to provide a parallel version of
mapMonotonic that does this internally, i.e.:

mapMonotonic _ Tip = Tip
mapMonotonic f (Bin sz x l r) =
    Bin sz (f x) (mapMonotonic f l) (mapMonotonic f r)

Becomes something like:

parMapMonotonic f (Bin sz x l r) =
    par left $ seq right $
    Bin sz (f x) left right
 where
  left  = mapMonotonic f l
  right = mapMonotonic f r

Cheers,
 -Ryan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20100924/d9043743/attachment.html


More information about the Haskell-Cafe mailing list