[commit: ghc] master: Re-do the "function application discount" (fixes Trac #6048) (980372f)

Simon Peyton Jones simonpj at microsoft.com
Wed May 9 18:56:23 CEST 2012


Repository : ssh://darcs.haskell.org//srv/darcs/ghc

On branch  : master

http://hackage.haskell.org/trac/ghc/changeset/980372f357667c1ba63b28acbf5798826890b7a5

>---------------------------------------------------------------

commit 980372f357667c1ba63b28acbf5798826890b7a5
Author: Simon Peyton Jones <simonpj at microsoft.com>
Date:   Wed May 9 16:22:49 2012 +0100

    Re-do the "function application discount" (fixes Trac #6048)
    
    * Undoes Max's very aggressive function-inlining change
      (see comments with Trac #6048)
    
    * Resticts function application discount to functions
      that occur just once in the body. It was the multiple
      occurrences that led to the exponential behavour in
      Trac #6048.
    
    See Note [Function application discount] in CoreUnfold.
    
    Module binary sizes are down 2% on average, which is good.
    Allocations wobble about a bit, but only on a few benchmarks
    and not by much, so it seems a price worth paying to avoid
    exponential behaviour!
    
                             Allocs
                Min           -1.2%
                Max           +2.8%
     Geometric Mean           +0.0%

>---------------------------------------------------------------

 compiler/coreSyn/CoreUnfold.lhs |  120 +++++++++++----------------------------
 1 files changed, 34 insertions(+), 86 deletions(-)

diff --git a/compiler/coreSyn/CoreUnfold.lhs b/compiler/coreSyn/CoreUnfold.lhs
index ce729b4..4ab1bec 100644
--- a/compiler/coreSyn/CoreUnfold.lhs
+++ b/compiler/coreSyn/CoreUnfold.lhs
@@ -163,7 +163,7 @@ mkUnfolding src top_lvl is_bottoming expr
   , not (exprIsTrivial expr)
   = NoUnfolding    -- See Note [Do not inline top-level bottoming functions]
   | otherwise
-  = CoreUnfolding { uf_tmpl   	    = occurAnalyseExpr expr,
+  = CoreUnfolding { uf_tmpl   	    = occ_anald_expr,
     		    uf_src          = src,
     		    uf_arity        = arity,
 		    uf_is_top 	    = top_lvl,
@@ -173,7 +173,8 @@ mkUnfolding src top_lvl is_bottoming expr
 		    uf_is_work_free = exprIsWorkFree   expr,
 		    uf_guidance     = guidance }
   where
-    (arity, guidance) = calcUnfoldingGuidance expr
+    occ_anald_expr    = occurAnalyseExpr expr
+    (arity, guidance) = calcUnfoldingGuidance occ_anald_expr
 	-- Sometimes during simplification, there's a large let-bound thing	
 	-- which has been substituted, and so is now dead; so 'expr' contains
 	-- two copies of the thing while the occurrence-analysed expression doesn't
@@ -501,80 +502,6 @@ sizeExpr bOMB_OUT_SIZE top_args expr
                                  d2  -- Ignore d1
 \end{code}
 
-Note [Function application discount]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-I noticed that the output of the supercompiler generates a lot of code
-with this form:
-
-"""
-module Inlining where
-
-h1 k = k undefined undefined undefined
-        undefined undefined undefined
-        undefined undefined undefined
-        undefined undefined undefined
-        undefined undefined undefined
-        undefined undefined undefined
-
-a = h1 (\x _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
-b = h1 (\_ x _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
-c = h1 (\_ _ x _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
-d = h1 (\_ _ _ x _ _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
-e = h1 (\_ _ _ _ x _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
-f = h1 (\_ _ _ _ _ x _ _ _ _ _ _ _ _ _ _ _ _ -> x)
-g = h1 (\_ _ _ _ _ _ x _ _ _ _ _ _ _ _ _ _ _ -> x)
-h = h1 (\_ _ _ _ _ _ _ x _ _ _ _ _ _ _ _ _ _ -> x)
-i = h1 (\_ _ _ _ _ _ _ _ x _ _ _ _ _ _ _ _ _ -> x)
-j = h1 (\_ _ _ _ _ _ _ _ _ x _ _ _ _ _ _ _ _ -> x)
-k = h1 (\_ _ _ _ _ _ _ _ _ _ x _ _ _ _ _ _ _ -> x)
-l = h1 (\_ _ _ _ _ _ _ _ _ _ _ x _ _ _ _ _ _ -> x)
-m = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ x _ _ _ _ _ -> x)
-n = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ x _ _ _ _ -> x)
-o = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ _ x _ _ _ -> x)
-p = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ x _ _ -> x)
-q = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ x _ -> x)
-r = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ x -> x)
-"""
-
-With GHC head the applications of h1 are not inlined, which hurts the
-quality of the generated code a bit. I was wondering why h1 wasn't
-getting inlined into each of "a" to "i" - after all, it has a manifest
-lambda argument.
-
-It turns out that the code in CoreUnfold gives a fixed discount of
-opt_UF_FunAppDiscount to a function argument such as "k" if it applied
-to any arguments. This is enough to ensure that h1 is inlined if the number
-of arguments applied to k is below a certain limit, but if many arguments are
-applied to k then the fixed discount can't overcome the size of the
-chain of apps, and h1 is never inlined.
-
-My proposed solution is to change CoreUnfold.funSize so that longer
-chains of arguments being applied to a lambda-bound function give a
-bigger discount. The motivation for this is that we would *generally*
-expect that the lambda at the callsite has enough lambdas such that
-all of the applications within the body can be beta-reduced away. This
-change might lead to over eager inlining in cases like this, though:
-
-{{{
-h1 k = k x y z
-
-{-# NOINLINE g #-}
-g = ...
-
-main = ... h1 (\x -> g x) ...
-}}}
-
-In this case we aren't able to beta-reduce away all of the
-applications in the body of h1 because the lambda at the call site
-only binds 1 argument, not the 3 allowed by the type. I don't expect
-this case to be particularly common, however.
-
-I chose the bonus to be (size - 20) so that application to 1 arg got
-same bonus as the old fixed bonus (i.e. opt_UF_FunAppDiscount, which is 60).
-If you have the bonus being (size - 40) then $fMonad[]_$c>>= with interesting
-2nd arg doesn't inline in cryptarithm2 so we lose some deforestation, and
-overall binary size hardly falls.
 
 \begin{code}
 -- | Finds a nominal size of a string literal.
@@ -615,23 +542,29 @@ funSize top_args fun n_val_args
   where
     some_val_args = n_val_args > 0
 
-        -- See Note [Function application discount]
-    arg_discount | some_val_args && fun `elem` top_args
-    		 = unitBag (fun, opt_UF_FunAppDiscount + (size - 20))
+    size | some_val_args = 10 * (1 + n_val_args)
+         | otherwise     = 0
+	-- The 1+ is for the function itself
+	-- Add 1 for each non-trivial arg;
+	-- the allocation cost, as in let(rec)
+  
+        --                  DISCOUNTS
+        --  See Note [Function application discounts]
+    arg_discount | some_val_args && one_call fun top_args
+    		 = unitBag (fun, opt_UF_FunAppDiscount)
 		 | otherwise = emptyBag
 	-- If the function is an argument and is applied
 	-- to some values, give it an arg-discount
 
-        -- See Note [Function application discount]
-    res_discount | idArity fun > n_val_args = opt_UF_FunAppDiscount + (size - 20)
+    res_discount | idArity fun > n_val_args = opt_UF_FunAppDiscount
     		 | otherwise   	 	    = 0
         -- If the function is partially applied, show a result discount
-    size | some_val_args = 10 * (1 + n_val_args)
-         | otherwise     = 0
-	-- The 1+ is for the function itself
-	-- Add 1 for each non-trivial arg;
-	-- the allocation cost, as in let(rec)
   
+    one_call _   []                     = False
+    one_call fun (arg:args) | fun==arg  = case idOccInfo arg of
+                                           OneOcc _ one_branch _ -> one_branch
+                                           _                     -> False
+                            | otherwise = one_call fun args
 
 conSize :: DataCon -> Int -> ExprSize
 conSize dc n_val_args
@@ -648,6 +581,21 @@ conSize dc n_val_args
      -- [SDM, 25/5/11]
 \end{code}
 
+Note [Function application discount]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We want a discount if the function is applied. A good example is
+monadic combinators with continuation arguments, where inlining is
+quite important.
+
+But we don't want a big discount when a function is called many times
+(see the detailed comments with Trac #6048) because if the function is 
+big it won't be inlined at its many call sites and no benefit results.
+Indeed, we can get exponentially big inlinings this way; that is what
+Trac #6048 is about.
+
+So, we only give a function-application discount when the function appears
+textually once, albeit possibly inside a lambda.
+
 Note [Literal integer size]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Literal integers *can* be big (mkInteger [...coefficients...]), but





More information about the Cvs-ghc mailing list