<div dir="ltr">Well for one thing, note that allPairs3 produces the result in reverse order:<div><br></div><div><div><font face="courier new, monospace">&gt;&gt;&gt; allPairs1 &quot;abc&quot;</font></div><div><font face="courier new, monospace">[(&#39;a&#39;,&#39;b&#39;),(&#39;a&#39;,&#39;c&#39;),(&#39;b&#39;,&#39;c&#39;)]</font></div>


<div><font face="courier new, monospace">&gt;&gt;&gt; allPairs2 &quot;abc&quot;</font></div><div><font face="courier new, monospace">[(&#39;a&#39;,&#39;b&#39;),(&#39;a&#39;,&#39;c&#39;),(&#39;b&#39;,&#39;c&#39;)]</font></div>


<div><font face="courier new, monospace">&gt;&gt;&gt; allPairs3 &quot;abc&quot;</font></div><div><font face="courier new, monospace">[(&#39;b&#39;,&#39;c&#39;),(&#39;a&#39;,&#39;c&#39;),(&#39;a&#39;,&#39;b&#39;)]</font></div>


</div><div><font face="courier new, monospace"><br></font></div><div><font face="arial, helvetica, sans-serif">allPairs2 uses &quot;guarded recursion&quot; which the optimizer probably likes, although I don&#39;t quite see why that version would use twice as much memory as allPairs3.</font></div>


<div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">See also:</font></div><div><font face="arial, helvetica, sans-serif"><a href="http://www.haskell.org/haskellwiki/Tail_recursion" target="_blank">http://www.haskell.org/haskellwiki/Tail_recursion</a><br>


</font></div><div><br></div><div>Here&#39;s a fun alternative for you to benchmark, using an old trick. I kind of doubt that this one will optimize as nicely as the others, but I am by no means an optimization guru:</div>


<div><br></div><div>allPairsS :: [a] -&gt; [(a, a)]</div><div>allPairsS xs = go xs [] where</div>

<div>  go [] = id</div><div>  go (y:ys) = (map (\a -&gt; (y, a)) ys ++) . go xs</div><div><br></div><div>Further reading:</div><div><a href="http://www.haskell.org/haskellwiki/Difference_list" target="_blank">http://www.haskell.org/haskellwiki/Difference_list</a><br>


</div><div><br></div></div><div class="gmail_extra"><br clear="all"><div>-- Dan Burton</div>

<br><br><div class="gmail_quote">On Tue, Sep 3, 2013 at 3:28 PM, Scott Pakin <span dir="ltr">&lt;<a href="mailto:pakin@lanl.gov" target="_blank">pakin@lanl.gov</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


I&#39;m a Haskell beginner, and I&#39;m baffled trying to reason about code<br>

performance, at least with GHC.  For a program I&#39;m writing I needed to<br>

find all pairs of elements of a list.  That is, given the list &quot;ABCD&quot;<br>

I wanted to wind up with the list<br>

[(&#39;A&#39;,&#39;B&#39;),(&#39;A&#39;,&#39;C&#39;),(&#39;A&#39;,&#39;D&#39;)<u></u>,(&#39;B&#39;,&#39;C&#39;),(&#39;B&#39;,&#39;D&#39;),(&#39;C&#39;,&#39;D&#39;)<u></u>], where<br>

the order of elements in the resulting list is immaterial, but I don&#39;t<br>

want both (&#39;A&#39;, &#39;B&#39;) and (&#39;B&#39;, &#39;A&#39;), for example.<br>

<br>

My first attempt looked like this:<br>

<br>

    allPairs1 :: [a] -&gt; [(a, a)]<br>

    allPairs1 []     = []<br>

    allPairs1 (x:xs) = map (\a  -&gt; (x, a)) xs ++ allPairs1 xs<br>

<br>

Benchmarking with ghci&#39;s &quot;:set +s&quot; reveals the following performance<br>

(median of three runs shown):<br>

<br>

    $ ghci<br>

    GHCi, version 7.6.3: <a href="http://www.haskell.org/ghc/" target="_blank">http://www.haskell.org/ghc/</a>  :? for help<br>

    Loading package ghc-prim ... linking ... done.<br>

    Loading package integer-gmp ... linking ... done.<br>

    Loading package base ... linking ... done.<br>

    Prelude&gt; :!ghc -c -O2 allpairs.hs<br>

    Prelude&gt; :load allpairs<br>

    Ok, modules loaded: AllPairs.<br>

    Prelude AllPairs&gt; :m +Control.DeepSeq<br>

    Prelude Control.DeepSeq AllPairs&gt; :show modules<br>

    AllPairs         ( allpairs.hs, allpairs.o )<br>

    Prelude Control.DeepSeq AllPairs&gt; :set +s<br>

    Prelude Control.DeepSeq AllPairs&gt; deepseq (allPairs1 [1..10000]) True<br>

    True<br>

    (4.85 secs, 4004173984 bytes)<br>

<br>

A colleague suggested getting rid of the list append as follows:<br>

<br>

    allPairs2 :: [a] -&gt; [(a, a)]<br>

    allPairs2 [] = []<br>

    allPairs2 (x:xs) = allPairs2&#39; x xs xs<br>

      where allPairs2&#39; x []     []     = []<br>

            allPairs2&#39; x (y:ys) zs     = (x,y) : allPairs2&#39; x ys zs<br>

            allPairs2&#39; x []     (z:zs) = allPairs2&#39; z zs zs<br>

<br>

Not surprisingly, that runs faster:<br>

<br>

    Prelude Control.DeepSeq AllPairs&gt; deepseq (allPairs2 [1..10000]) True<br>

    True<br>

    (2.14 secs, <a href="tel:4403686376" value="+14403686376" target="_blank">4403686376</a> bytes)<br>

<br>

I then figured I could do even better by rewriting the code<br>

tail-recursively:<br>

<br>

    allPairs3 :: [a] -&gt; [(a, a)]<br>

    allPairs3 [] = []<br>

    allPairs3 (x:xs) = allPairs3&#39; x xs xs []<br>

      where allPairs3&#39; :: a -&gt; [a] -&gt; [a] -&gt; [(a, a)] -&gt; [(a, a)]<br>

            allPairs3&#39; h (x:xs) ys     result = allPairs3&#39; h xs ys ((h, x):result)<br>

            allPairs3&#39; _ []     (y:ys) result = allPairs3&#39; y ys ys result<br>

            allPairs3&#39; _ []     []     result = result<br>

<br>

This takes half the memory as the above (yay!) but runs substantially<br>

*slower* (and appears to be thrashing my computer&#39;s memory system):<br>

<br>

    Prelude Control.DeepSeq AllPairs&gt; deepseq (allPairs3 [1..10000]) True<br>

    True<br>

    (10.58 secs, <a href="tel:2403820152" value="+12403820152" target="_blank">2403820152</a> bytes)<br>

<br>

What gives?  Why does my tail-recursive implementation run even slower<br>

and presumably produce more garbage than my initial, naive<br>

implementation?  Is there some optimization GHC is performing for<br>

allPairs1 and allPairs2 that it can&#39;t apply to allPairs3?<br>

<br>

Similarly, how should I, as a newcomer who wants to write efficient<br>

Haskell code, reason about whether a code change is likely to improve<br>

rather than degrade performance?  Guidelines like &quot;per-iteration<br>

appends = bad&quot; and &quot;tail recursion = good&quot; are great, but the<br>

preceding data indicate that something subtler is taking precedence<br>

over those guidelines with respect to code performance.<br>

<br>

Thanks in advance,<br>

-- Scott<br>

<br>

______________________________<u></u>_________________<br>

Haskell-Cafe mailing list<br>

<a href="mailto:Haskell-Cafe@haskell.org" target="_blank">Haskell-Cafe@haskell.org</a><br>

<a href="http://www.haskell.org/mailman/listinfo/haskell-cafe" target="_blank">http://www.haskell.org/<u></u>mailman/listinfo/haskell-cafe</a><br>

</blockquote></div><br></div>