[commit: ghc] ghc-7.6: Do calcUnfoldingGuidance on the *non* occ-analysed RHS (1363c32)

Paolo Capriotti p.capriotti at gmail.com
Fri Aug 10 14:32:29 CEST 2012


Repository : ssh://darcs.haskell.org//srv/darcs/ghc

On branch  : ghc-7.6

http://hackage.haskell.org/trac/ghc/changeset/1363c32a0d5b79c846530fa16d09076d02f29d1c

>---------------------------------------------------------------

commit 1363c32a0d5b79c846530fa16d09076d02f29d1c
Author: Simon Peyton Jones <simonpj at microsoft.com>
Date:   Fri Jul 20 20:07:51 2012 +0100

    Do calcUnfoldingGuidance on the *non* occ-analysed RHS
    
    See Note [Calculate unfolding guidance on the non-occ-anal'd expression]
    This makes a big difference to residency (530M vs over 800M when compiling
    Cabal).
    
    This fixes the majority of the regression in Trac #6104
    
    MERGED from commit cd627bcfda713efa63b7f5290c84a5077f4976f9

>---------------------------------------------------------------

 compiler/coreSyn/CoreUnfold.lhs |   79 +++++++++++++++++++++++++-------------
 1 files changed, 52 insertions(+), 27 deletions(-)

diff --git a/compiler/coreSyn/CoreUnfold.lhs b/compiler/coreSyn/CoreUnfold.lhs
index 816d34e..85fd9eb 100644
--- a/compiler/coreSyn/CoreUnfold.lhs
+++ b/compiler/coreSyn/CoreUnfold.lhs
@@ -163,7 +163,7 @@ mkUnfolding src top_lvl is_bottoming expr
   , not (exprIsTrivial expr)
   = NoUnfolding    -- See Note [Do not inline top-level bottoming functions]
   | otherwise
-  = CoreUnfolding { uf_tmpl   	    = occ_anald_expr,
+  = CoreUnfolding { uf_tmpl   	    = occurAnalyseExpr expr,
     		    uf_src          = src,
     		    uf_arity        = arity,
 		    uf_is_top 	    = top_lvl,
@@ -173,19 +173,35 @@ mkUnfolding src top_lvl is_bottoming expr
 		    uf_is_work_free = exprIsWorkFree   expr,
 		    uf_guidance     = guidance }
   where
-    occ_anald_expr    = occurAnalyseExpr expr
-    (arity, guidance) = calcUnfoldingGuidance occ_anald_expr
-	-- Sometimes during simplification, there's a large let-bound thing	
-	-- which has been substituted, and so is now dead; so 'expr' contains
-	-- two copies of the thing while the occurrence-analysed expression doesn't
-	-- Nevertheless, we *don't* occ-analyse before computing the size because the
-	-- size computation bales out after a while, whereas occurrence analysis does not.
-	--
-	-- This can occasionally mean that the guidance is very pessimistic;
-	-- it gets fixed up next round.  And it should be rare, because large
-	-- let-bound things that are dead are usually caught by preInlineUnconditionally
+    (arity, guidance) = calcUnfoldingGuidance expr
+        -- NB: *not* (calcUnfoldingGuidance (occurAnalyseExpr expr))!
+	-- See Note [Calculate unfolding guidance on the non-occ-anal'd expression]
 \end{code}
 
+Note [Calculate unfolding guidance on the non-occ-anal'd expression]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Notice that we give the non-occur-analysed expression to
+calcUnfoldingGuidance.  In some ways it'd be better to occur-analyse
+first; for example, sometimes during simplification, there's a large
+let-bound thing which has been substituted, and so is now dead; so
+'expr' contains two copies of the thing while the occurrence-analysed
+expression doesn't.
+
+Nevertheless, we *don't* and *must not* occ-analyse before computing
+the size because 
+
+a) The size computation bales out after a while, whereas occurrence
+   analysis does not.
+
+b) Residency increases sharply if you occ-anal first.  I'm not 
+   100% sure why, but it's a large effect.  Compiling Cabal went 
+   from residency of 534M to over 800M with this one change.
+
+This can occasionally mean that the guidance is very pessimistic;
+it gets fixed up next round.  And it should be rare, because large
+let-bound things that are dead are usually caught by preInlineUnconditionally
+
+
 %************************************************************************
 %*									*
 \subsection{The UnfoldingGuidance type}
@@ -237,9 +253,17 @@ calcUnfoldingGuidance expr
       	                         , ug_size  = iBox size
       	        	  	 , ug_res   = iBox scrut_discount }
 
-        discount cbs bndr
-           = foldlBag (\acc (b',n) -> if bndr==b' then acc+n else acc) 
-		      0 cbs
+        discount :: Bag (Id,Int) -> Id -> Int
+        discount cbs bndr = foldlBag combine 0 cbs
+           where
+             combine acc (bndr', disc) 
+               | bndr == bndr' = acc `plus_disc` disc
+               | otherwise     = acc
+   
+             plus_disc :: Int -> Int -> Int
+             plus_disc | isFunTy (idType bndr) = max
+                       | otherwise             = (+)
+             -- See Note [Function and non-function discounts]
     in
     (n_val_bndrs, guidance) }
 \end{code}
@@ -549,8 +573,8 @@ funSize top_args fun n_val_args
 	-- the allocation cost, as in let(rec)
   
         --                  DISCOUNTS
-        --  See Note [Function application discounts]
-    arg_discount | some_val_args && one_call fun top_args
+        --  See Note [Function and non-function discounts]
+    arg_discount | some_val_args && fun `elem` top_args
     		 = unitBag (fun, opt_UF_FunAppDiscount)
 		 | otherwise = emptyBag
 	-- If the function is an argument and is applied
@@ -560,12 +584,6 @@ funSize top_args fun n_val_args
     		 | otherwise   	 	    = 0
         -- If the function is partially applied, show a result discount
 
-    one_call _   []                     = False
-    one_call fun (arg:args) | fun==arg  = case idOccInfo arg of
-                                           OneOcc _ one_branch _ -> one_branch
-                                           _                     -> False
-                            | otherwise = one_call fun args
-
 conSize :: DataCon -> Int -> ExprSize
 conSize dc n_val_args
   | n_val_args == 0 = SizeIs (_ILIT(0)) emptyBag (_ILIT(10))    -- Like variables
@@ -615,8 +633,8 @@ shrank binary sizes by 0.5% it also made spectral/boyer allocate 5%
 more. All other changes were very small. So it's not a big deal but I
 didn't adopt the idea.
 
-Note [Function application discount]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Note [Function and non-function discounts]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 We want a discount if the function is applied. A good example is
 monadic combinators with continuation arguments, where inlining is
 quite important.
@@ -627,8 +645,15 @@ big it won't be inlined at its many call sites and no benefit results.
 Indeed, we can get exponentially big inlinings this way; that is what
 Trac #6048 is about.
 
-So, we only give a function-application discount when the function appears
-textually once, albeit possibly inside a lambda.
+On the other hand, for data-valued arguments, if there are lots of
+case expressions in the body, each one will get smaller if we apply
+the function to a constructor application, so we *want* a big discount
+if the argument is scrutinised by many case expressions.
+
+Conclusion:
+  - For functions, take the max of the discounts
+  - For data values, take the sum of the discounts
+
 
 Note [Literal integer size]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~





More information about the Cvs-ghc mailing list