<div dir="ltr">I&#39;ve got a Partitionable class that I&#39;ve been using for this purpose:<br><div><br><a href="https://github.com/mikeizbicki/ConstraintKinds/blob/master/src/Control/ConstraintKinds/Partitionable.hs">https://github.com/mikeizbicki/ConstraintKinds/blob/master/src/Control/ConstraintKinds/Partitionable.hs</a><br>

<br></div><div>The function called &quot;parallel&quot; in the HLearn library will automatically parallelize any homomorphism from a Partitionable to a Monoid.  I specifically use that to parallelize machine learning algorithms.<br>
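<br>To make the idea concrete, here is a minimal sketch of parallelizing a monoid homomorphism. It is not the HLearn code: the class below is a simplified, hypothetical stand-in for Partitionable (the real class may have a different signature), and parallelSketch is an illustrative name. It assumes f is a homomorphism, i.e. f (a ++ b) == f a `mappend` f b, and uses only par and pseq from base.<br>

```haskell
import GHC.Conc (par, pseq)

-- Hypothetical, simplified stand-in for the Partitionable class.
class Partitionable t where
  -- Split a container into at most n pieces, preserving order.
  partition :: Int -> t a -> [t a]

instance Partitionable [] where
  partition n xs = go xs
    where
      k    = max 1 n
      -- Chunk size rounded up, so we never produce more than k chunks.
      size = max 1 ((length xs + k - 1) `div` k)
      go [] = []
      go ys = let (a, b) = splitAt size ys in a : go b

-- Apply f to each piece, sparking the pieces in parallel, and combine
-- the results in their original order. This equals f applied to the
-- whole container only when f is a monoid homomorphism.
parallelSketch :: (Partitionable t, Monoid m) => Int -> (t a -> m) -> t a -> m
parallelSketch n f = parFold . map f . partition n
  where
    parFold []       = mempty
    parFold (m : ms) =
      let rest = parFold ms
      in m `par` (rest `pseq` (m `mappend` rest))
```

<br>For example, parallelSketch 4 (map (*2)) [1..10] gives the same list as map (*2) [1..10]. The sparks only evaluate each result to weak head normal form, so a real implementation would want a deeper evaluation strategy, and the program needs the threaded runtime to run on several cores.<br>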

</div><div class="gmail_extra"><br></div><div class="gmail_extra">I have two thoughts for better abstractions:<br><br>1)  This Partitionable class is essentially a comonoid.  By reversing the arrows of mappend, we get:<br>

<br></div><div class="gmail_extra">comappend :: a -&gt; (a,a)<br></div><div class="gmail_extra"><br></div><div class="gmail_extra">By itself, this works well if the number of processors you have is a power of two, but it needs some more fanciness to get things balanced properly for other numbers of processors.  I bet there&#39;s another algebraic structure that would capture these other cases, but I&#39;m not sure what it is.<br>
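<br>A small sketch of why powers of two fall out naturally. The class name Comonoid, the list instance, and splitDepth are all hypothetical names used for illustration, and only the comappend operation is modelled here.<br>

```haskell
-- Hypothetical class: just the "reversed mappend" described above.
class Comonoid a where
  comappend :: a -> (a, a)

-- Illustrative instance: split a list in the middle.
instance Comonoid [x] where
  comappend xs = splitAt (length xs `div` 2) xs

-- Apply comappend recursively to depth d, yielding exactly 2^d pieces.
splitDepth :: Comonoid a => Int -> a -> [a]
splitDepth d x
  | d <= 0    = [x]
  | otherwise =
      let (l, r) = comappend x
      in splitDepth (d - 1) l ++ splitDepth (d - 1) r
```

<br>Here splitDepth 2 [1..12] produces four pieces of three elements each. With three processors there is no depth that gives three equal pieces: depth 1 leaves one processor idle, and depth 2 gives one processor twice the data of the others.<br>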

</div><div class="gmail_extra"><br></div><div class="gmail_extra">2) I&#39;m working on parallelizing tree structures right now (kd-trees, cover trees, octrees, etc.).  The real problem is not splitting the number of data points equally (this is easy), but splitting the amount of work equally.  Some points take longer to process than others, and this cannot be determined in advance.  Therefore, an equal split of the data points can result in one processor getting 25% of the workload and the second processor getting 75%.  Some sort of lazy Partitionable class that is aware of processor loads and doesn&#39;t split data points until they are needed would be ideal for this scenario.<br>
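<br>One rough approximation of that idea, short of a truly lazy class, is to over-partition the data into many small chunks and let idle workers pull the next chunk from a shared queue, so a worker stuck on expensive points simply takes fewer chunks. The sketch below is an assumption about how this could look, not an existing API: dynamicFold and its arguments are hypothetical names, and it uses only MVar and forkIO from base.<br>

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (modifyMVar, newEmptyMVar, newMVar, putMVar, takeMVar)
import Control.Monad (replicateM)
import Data.List (sortOn)

-- Fold over pre-split chunks with a fixed number of worker threads.
-- Each worker takes a new chunk only when it has finished its last one.
dynamicFold :: Monoid m => Int -> (a -> m) -> [[a]] -> IO m
dynamicFold workers f chunks = do
  -- Tag each chunk with its position so results can be reordered later.
  queue <- newMVar (zip [0 :: Int ..] chunks)
  dones <- replicateM (max 1 workers) $ do
    done <- newEmptyMVar
    _ <- forkIO (worker queue [] >>= putMVar done)
    return done
  parts <- mapM takeMVar dones
  -- Restore the original chunk order so non-commutative monoids work too.
  return (mconcat (map snd (sortOn fst (concat parts))))
  where
    worker queue acc = do
      next <- modifyMVar queue $ \q -> return $ case q of
        []       -> ([], Nothing)
        (c : cs) -> (cs, Just c)
      case next of
        Nothing     -> return acc
        Just (i, c) ->
          let m = foldMap f c
          in m `seq` worker queue ((i, m) : acc)  -- force the work in this thread
```

<br>For example, dynamicFold 3 (\x -> [x * 2]) [[1,2],[3],[],[4,5,6]] returns [2,4,6,8,10,12] regardless of which worker handled which chunk. The seq only forces each result to weak head normal form, and the threaded runtime is needed for the workers to actually run on separate cores.<br>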

</div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Sep 28, 2013 at 6:46 PM, adam vogt <span dir="ltr">&lt;<a href="mailto:vogt.adam@gmail.com" target="_blank">vogt.adam@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On Sat, Sep 28, 2013 at 1:09 PM, Ryan Newton &lt;<a href="mailto:rrnewton@gmail.com">rrnewton@gmail.com</a>&gt; wrote:<br>


&gt; Hi all,<br>

&gt;<br>

&gt; We all know and love Data.Foldable and are familiar with left folds and<br>

&gt; right folds.  But what you want in a parallel program is a balanced fold<br>

&gt; over a tree.  Fortunately, many of our datatypes (Sets, Maps) actually ARE<br>

&gt; balanced trees.  Hmm, but how do we expose that?<br>

<br>

</div>Hi Ryan,<br>

<br>

At least for Data.Map, the Foldable instance seems to have a<br>

reasonably balanced fold called fold (or foldMap):<br>

<br>

&gt;  fold t = go t<br>

&gt;    where   go (Bin _ _ v l r) = go l `mappend` (v `mappend` go r)<br>

<br>

This doesn&#39;t seem to be guaranteed though. For example ghc&#39;s derived<br>

instance writes the foldr only, so fold would be right-associated for<br>

a:<br>

<br>

&gt; data T a = B (T a) (T a) | L a deriving (Foldable)<br>

<br>

Regards,<br>

Adam<br>


</blockquote></div><br></div></div>