<p dir="ltr">Note also that there are fairly clear reasons that fusion is flakier than many other optimizations. In particular, it requires the compiler to do things that seem *weird*, and that in most other cases are just *bad ideas*. At least, it requires:</p>

<p dir="ltr">1. Inlining things that look large and/or expensive. This is done in the hope that they will fuse with other things, producing a whole that is smaller and cheaper than the sum of its parts. It is done with the knowledge that these things will (in most cases) be written back to smaller cheaper forms if they don't fuse. That is, *our* knowledge of that—the compiler has absolutely no way to know what shenanigans we're up to.</p>

<p dir="ltr">2. Refraining from floating things that look like they should be floated. GHC likes to pull constants and such out because doing so improves sharing and also improves other analyses. But if it pulls our producer away from our consumer, they will not fuse. I think Simon's simplifier changes a few months ago helped with this issue, but I don't know that it is (or can ever be) resolved completely.</p>

<div class="gmail_quote">On Nov 11, 2014 11:54 AM, "David Feuer" <<a href="mailto:david.feuer@gmail.com">david.feuer@gmail.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><p dir="ltr"><br>

On Nov 11, 2014 6:04 AM, "Simon Peyton Jones" <<a href="mailto:simonpj@microsoft.com" target="_blank">simonpj@microsoft.com</a>> wrote:<br>

> It’s true that, particularly for fusion, inlining can make a huge difference.  And GHC really does need help… it’s extremely hard for it to make the “right” choice all the time.</p>

<p dir="ltr">The inliner does indeed do amazing things, and list fusion does indeed do lovely things for user code. It's just not the most *reliable* optimization in the compiler. I don't think there's anything wrong with admitting that and trying to avoid relying on it too heavily in *library* code. Kim-Ee was right that expanding out mapM by hand bloated the source. I've since defined `sequence=mapM id` to resolve that problem, and doing so does not hurt the benchmarks—it relies only on inlining id (which is quite reliable) and beta-reducing (which is also quite reliable).</p>

<p dir="ltr">> I strongly agree with Kim-Ee that we should play the game of “optimise by randomly mutating the program and pick the version that (today) happens to run faster”.  But I don’t think David is doing that.   There is, at least a Note: [List comprehensions and inlining].</p>

<p dir="ltr">I've been trying to leave a trail of comments and notes as I go. I may need to go further.</p>

</blockquote></div>