Is it possible that INLINE didn't inline the function because it's recursive? If it were my function, I'd probably try a manual worker/wrapper split.<br><br><div class="gmail_quote">On 07:59, Wed, Dec 17, 2014 Simon Peyton Jones <<a href="mailto:simonpj@microsoft.com">simonpj@microsoft.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I would still like to understand why INLINE does not make it inline. That's weird.<br>
<br>
E.g. is there a way to reproduce it?<br>
<br>
Simon<br>
<br>
| -----Original Message-----<br>
| From: Richard Eisenberg [mailto:<a href="mailto:eir@cis.upenn.edu" target="_blank">eir@cis.upenn.edu</a>]<br>
| Sent: 17 December 2014 15:56<br>
| To: Simon Peyton Jones<br>
| Cc: Joachim Breitner; <a href="mailto:ghc-devs@haskell.org" target="_blank">ghc-devs@haskell.org</a><br>
| Subject: Re: performance regressions<br>
|<br>
| My unsubstantiated guess is that INLINEABLE would have the same effect<br>
| as INLINE here, as GHC doesn't see fit to actually inline the<br>
| function, even with INLINE -- the big improvement seen between (1) and<br>
| (2) is actually specialization, not inlining. The jump from (2) to (3)<br>
| is actual inlining. Thus, it seems that GHC's heuristics for inlining<br>
| aren't working out for the best here.<br>
|<br>
| I've pushed my changes, though I agree with Simon that more research<br>
| may uncover even more improvements here. I didn't focus on the number<br>
| of calls because that number didn't regress. Will look into this soon.<br>
|<br>
| Richard<br>
|<br>
| On Dec 17, 2014, at 4:15 AM, Simon Peyton Jones<br>
| <<a href="mailto:simonpj@microsoft.com" target="_blank">simonpj@microsoft.com</a>> wrote:<br>
|<br>
| > If you use INLINEABLE, that should make the function specialisable to a particular monad, even if it's in a different module. You shouldn't need INLINE for that.<br>
| ><br>
| > I don't understand the difference between cases (2) and (3).<br>
| ><br>
| > I am still suspicious of why there are so many calls to this one function that it, alone, accounts for a significant proportion of the allocation in an entire run of GHC. Are you sure there isn't an algorithmic improvement to be had, to simply reduce the number of calls?<br>
| ><br>
| > Simon<br>
| ><br>
| > | -----Original Message-----<br>
| > | From: ghc-devs [mailto:<a href="mailto:ghc-devs-bounces@haskell.org" target="_blank">ghc-devs-bounces@haskell.org</a>] On Behalf Of Richard Eisenberg<br>
| > | Sent: 16 December 2014 21:46<br>
| > | To: Joachim Breitner<br>
| > | Cc: <a href="mailto:ghc-devs@haskell.org" target="_blank">ghc-devs@haskell.org</a><br>
| > | Subject: Re: performance regressions<br>
| > |<br>
| > | I've learned several very interesting things in this analysis.<br>
| > |<br>
| > | - Inlining polymorphic methods is very important. Here are some data points to back up that claim:<br>
| > |    * Original implementation using zipWithAndUnzipM: 8,472,613,440 bytes allocated in the heap<br>
| > |    * Adding {-# INLINE #-} to the definition thereof: 6,639,253,488 bytes allocated in the heap<br>
| > |    * Using `inline` at call site to force inlining: 6,281,539,792 bytes allocated in the heap<br>
| > |<br>
| > | The middle step above allowed GHC to specialize zipWithAndUnzipM to my particular monad, but GHC didn't see fit to actually inline the function. Using `inline` forced it, to good effect. (I did not collect data on code sizes, but it wouldn't be hard to.)<br>
| > |<br>
| > | By comparison:<br>
| > |    * Hand-written recursion: 6,587,809,112 bytes allocated in the heap<br>
| > | Interestingly, this is *not* the best result!<br>
| > |<br>
| > | Conclusion: We should probably add INLINE pragmas to Util and<br>
| > | MonadUtils.<br>
| > |<br>
| > |<br>
| > | - I then looked at rejiggering the algorithm to keep the common case fast. This had a side effect of changing the zipWithAndUnzipM to mapAndUnzipM, from Control.Monad. To my surprise, this brought disaster!<br>
| > |    * Using `inline` and mapAndUnzipM: 7,463,047,432 bytes allocated in the heap<br>
| > |    * Hand-written recursion: 5,848,602,848 bytes allocated in the heap<br>
| > |<br>
| > | That last number is better than the numbers above because of the algorithm streamlining. But, the inadequacy of mapAndUnzipM surprised me -- it already has an INLINE pragma in Control.Monad of course. Looking at -ddump-simpl, it seems that mapAndUnzipM was indeed getting inlined, but a call to `map` remained, perhaps causing extra allocation.<br>
| > |<br>
| > | Conclusion: We should examine the implementation of mapAndUnzipM (and similar functions) in Control.Monad. Is it as fast as possible?<br>
| > |<br>
| > |<br>
| > |<br>
| > | In the end, I was unable to bring the allocation numbers down to where they were before my work. This is because the flattener now deals in roles. Most of its behavior is the same between nominal and representational roles, so it seems silly (though very possible) to specialize the code to nominal to keep that path fast. Instead, I identified one key spot and made that go fast.<br>
| > |<br>
| > | Thus, there is a 7% bump to memory usage on very-type-family-heavy code, compared to before my commit on Friday. (On more ordinary code, there is no noticeable change.)<br>
| > |<br>
| > | Validating my patch locally now; will push when that's done.<br>
| > |<br>
| > | Thanks,<br>
| > | Richard<br>
| > |<br>
| > | On Dec 16, 2014, at 10:41 AM, Joachim Breitner <<a href="mailto:mail@joachim-breitner.de" target="_blank">mail@joachim-breitner.de</a>> wrote:<br>
| > |<br>
| > | > Hi,<br>
| > | ><br>
| > | ><br>
| > | > On Tuesday, 16.12.2014, at 09:59 -0500, Richard Eisenberg wrote:<br>
| > | >> On Dec 16, 2014, at 4:01 AM, Joachim Breitner <<a href="mailto:mail@joachim-breitner.de" target="_blank">mail@joachim-breitner.de</a>> wrote:<br>
| > | >><br>
| > | >>> another guess (without looking at the code, sorry): Are they in the same module? I.e., can GHC specialize the code to your particular Monad?<br>
| > | ><br>
| > | >> No, they're not in the same module. I could also try moving the zipWithAndUnzipM function to the same module, and even specializing it by hand to the right monad.<br>
| > | ><br>
| > | > I did mean zipWithAndUnzipM, so maybe yes: Try that.<br>
| > | ><br>
| > | > (I find it hard to believe that any polymorphic monadic code should perform well, with that many calls to an unknown (>>=) with a function parameter, but maybe I'm too pessimistic here.)<br>
| > | ><br>
| > | >> Could that be preventing the fusing?<br>
| > | ><br>
| > | > There is not going to be any fusing here, at least not list fusion; that would require your code to be written in terms of functions with fusion rules.<br>
| > | ><br>
| > | > Greetings,<br>
| > | > Joachim<br>
| > | ><br>
| > | > --<br>
| > | > Joachim "nomeata" Breitner<br>
| > | > <a href="mailto:mail@joachim-breitner.de" target="_blank">mail@joachim-breitner.de</a> * <a href="http://www.joachim-breitner.de/" target="_blank">http://www.joachim-breitner.de/</a><br>
| > | > Jabber: <a href="mailto:nomeata@joachim-breitner.de" target="_blank">nomeata@joachim-breitner.de</a> * GPG-Key: 0xF0FBF51F<br>
| > | > Debian Developer: <a href="mailto:nomeata@debian.org" target="_blank">nomeata@debian.org</a><br>
| > | ><br>
| > | > _______________________________________________<br>
| > | > ghc-devs mailing list<br>
| > | > <a href="mailto:ghc-devs@haskell.org" target="_blank">ghc-devs@haskell.org</a><br>
| > | > <a href="http://www.haskell.org/mailman/listinfo/ghc-devs" target="_blank">http://www.haskell.org/mailman/listinfo/ghc-devs</a><br>
| > |<br>
| ><br>
</blockquote></div>
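<br>
P.S. To make the worker/wrapper idea at the top of this reply concrete, here is a minimal sketch. It assumes zipWithAndUnzipM has the obvious type and is currently defined by direct recursion (I have not checked the real definition in MonadUtils), and the local worker name `go` is just my choice. The point is that the wrapper itself is no longer recursive, so INLINE has a chance to fire at call sites, after which the local loop can be specialised to the caller's monad and function argument. I have not measured whether this actually helps here.<br>
<br>
```haskell
-- Sketch only: the type and the recursion scheme are assumptions,
-- not copied from GHC's MonadUtils.
zipWithAndUnzipM :: Monad m
                 => (a -> b -> m (c, d)) -> [a] -> [b] -> m ([c], [d])
{-# INLINE zipWithAndUnzipM #-}
-- Wrapper: non-recursive, so GHC can inline it at a call site
-- once it is applied to the function argument f.
zipWithAndUnzipM f = go
  where
    -- Worker (hypothetical name): the recursion lives here, closed over f.
    go (x:xs) (y:ys) =
      f x y    >>= \(c, d)   ->
      go xs ys >>= \(cs, ds) ->
      return (c:cs, d:ds)
    -- Stop as soon as either list runs out.
    go _ _ = return ([], [])
```

With this shape, a call such as `zipWithAndUnzipM step xs ys` only needs the small wrapper to be inlined; the loop then sits in the caller's module, where the Monad dictionary and `step` are known.<br>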