[commit: ghc] master: Re-do the "function application discount" (fixes Trac #6048) (980372f)
Simon Peyton Jones
simonpj at microsoft.com
Wed May 9 18:56:23 CEST 2012
Repository : ssh://darcs.haskell.org//srv/darcs/ghc
On branch : master
http://hackage.haskell.org/trac/ghc/changeset/980372f357667c1ba63b28acbf5798826890b7a5
>---------------------------------------------------------------
commit 980372f357667c1ba63b28acbf5798826890b7a5
Author: Simon Peyton Jones <simonpj at microsoft.com>
Date: Wed May 9 16:22:49 2012 +0100
Re-do the "function application discount" (fixes Trac #6048)
* Undoes Max's very aggressive function-inlining change
(see comments with Trac #6048)
* Resticts function application discount to functions
that occur just once in the body. It was the multiple
occurrences that led to the exponential behavour in
Trac #6048.
See Note [Function application discount] in CoreUnfold.
Module binary sizes are down 2% on average, which is good.
Allocations wobble about a bit, but only on a few benchmarks
and not by much, so it seems a price worth paying to avoid
exponential behaviour!
Allocs
Min -1.2%
Max +2.8%
Geometric Mean +0.0%
>---------------------------------------------------------------
compiler/coreSyn/CoreUnfold.lhs | 120 +++++++++++----------------------------
1 files changed, 34 insertions(+), 86 deletions(-)
diff --git a/compiler/coreSyn/CoreUnfold.lhs b/compiler/coreSyn/CoreUnfold.lhs
index ce729b4..4ab1bec 100644
--- a/compiler/coreSyn/CoreUnfold.lhs
+++ b/compiler/coreSyn/CoreUnfold.lhs
@@ -163,7 +163,7 @@ mkUnfolding src top_lvl is_bottoming expr
, not (exprIsTrivial expr)
= NoUnfolding -- See Note [Do not inline top-level bottoming functions]
| otherwise
- = CoreUnfolding { uf_tmpl = occurAnalyseExpr expr,
+ = CoreUnfolding { uf_tmpl = occ_anald_expr,
uf_src = src,
uf_arity = arity,
uf_is_top = top_lvl,
@@ -173,7 +173,8 @@ mkUnfolding src top_lvl is_bottoming expr
uf_is_work_free = exprIsWorkFree expr,
uf_guidance = guidance }
where
- (arity, guidance) = calcUnfoldingGuidance expr
+ occ_anald_expr = occurAnalyseExpr expr
+ (arity, guidance) = calcUnfoldingGuidance occ_anald_expr
-- Sometimes during simplification, there's a large let-bound thing
-- which has been substituted, and so is now dead; so 'expr' contains
-- two copies of the thing while the occurrence-analysed expression doesn't
@@ -501,80 +502,6 @@ sizeExpr bOMB_OUT_SIZE top_args expr
d2 -- Ignore d1
\end{code}
-Note [Function application discount]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-I noticed that the output of the supercompiler generates a lot of code
-with this form:
-
-"""
-module Inlining where
-
-h1 k = k undefined undefined undefined
- undefined undefined undefined
- undefined undefined undefined
- undefined undefined undefined
- undefined undefined undefined
- undefined undefined undefined
-
-a = h1 (\x _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
-b = h1 (\_ x _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
-c = h1 (\_ _ x _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
-d = h1 (\_ _ _ x _ _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
-e = h1 (\_ _ _ _ x _ _ _ _ _ _ _ _ _ _ _ _ _ -> x)
-f = h1 (\_ _ _ _ _ x _ _ _ _ _ _ _ _ _ _ _ _ -> x)
-g = h1 (\_ _ _ _ _ _ x _ _ _ _ _ _ _ _ _ _ _ -> x)
-h = h1 (\_ _ _ _ _ _ _ x _ _ _ _ _ _ _ _ _ _ -> x)
-i = h1 (\_ _ _ _ _ _ _ _ x _ _ _ _ _ _ _ _ _ -> x)
-j = h1 (\_ _ _ _ _ _ _ _ _ x _ _ _ _ _ _ _ _ -> x)
-k = h1 (\_ _ _ _ _ _ _ _ _ _ x _ _ _ _ _ _ _ -> x)
-l = h1 (\_ _ _ _ _ _ _ _ _ _ _ x _ _ _ _ _ _ -> x)
-m = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ x _ _ _ _ _ -> x)
-n = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ x _ _ _ _ -> x)
-o = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ _ x _ _ _ -> x)
-p = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ x _ _ -> x)
-q = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ x _ -> x)
-r = h1 (\_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ x -> x)
-"""
-
-With GHC head the applications of h1 are not inlined, which hurts the
-quality of the generated code a bit. I was wondering why h1 wasn't
-getting inlined into each of "a" to "i" - after all, it has a manifest
-lambda argument.
-
-It turns out that the code in CoreUnfold gives a fixed discount of
-opt_UF_FunAppDiscount to a function argument such as "k" if it applied
-to any arguments. This is enough to ensure that h1 is inlined if the number
-of arguments applied to k is below a certain limit, but if many arguments are
-applied to k then the fixed discount can't overcome the size of the
-chain of apps, and h1 is never inlined.
-
-My proposed solution is to change CoreUnfold.funSize so that longer
-chains of arguments being applied to a lambda-bound function give a
-bigger discount. The motivation for this is that we would *generally*
-expect that the lambda at the callsite has enough lambdas such that
-all of the applications within the body can be beta-reduced away. This
-change might lead to over eager inlining in cases like this, though:
-
-{{{
-h1 k = k x y z
-
-{-# NOINLINE g #-}
-g = ...
-
-main = ... h1 (\x -> g x) ...
-}}}
-
-In this case we aren't able to beta-reduce away all of the
-applications in the body of h1 because the lambda at the call site
-only binds 1 argument, not the 3 allowed by the type. I don't expect
-this case to be particularly common, however.
-
-I chose the bonus to be (size - 20) so that application to 1 arg got
-same bonus as the old fixed bonus (i.e. opt_UF_FunAppDiscount, which is 60).
-If you have the bonus being (size - 40) then $fMonad[]_$c>>= with interesting
-2nd arg doesn't inline in cryptarithm2 so we lose some deforestation, and
-overall binary size hardly falls.
\begin{code}
-- | Finds a nominal size of a string literal.
@@ -615,23 +542,29 @@ funSize top_args fun n_val_args
where
some_val_args = n_val_args > 0
- -- See Note [Function application discount]
- arg_discount | some_val_args && fun `elem` top_args
- = unitBag (fun, opt_UF_FunAppDiscount + (size - 20))
+ size | some_val_args = 10 * (1 + n_val_args)
+ | otherwise = 0
+ -- The 1+ is for the function itself
+ -- Add 1 for each non-trivial arg;
+ -- the allocation cost, as in let(rec)
+
+ -- DISCOUNTS
+ -- See Note [Function application discounts]
+ arg_discount | some_val_args && one_call fun top_args
+ = unitBag (fun, opt_UF_FunAppDiscount)
| otherwise = emptyBag
-- If the function is an argument and is applied
-- to some values, give it an arg-discount
- -- See Note [Function application discount]
- res_discount | idArity fun > n_val_args = opt_UF_FunAppDiscount + (size - 20)
+ res_discount | idArity fun > n_val_args = opt_UF_FunAppDiscount
| otherwise = 0
-- If the function is partially applied, show a result discount
- size | some_val_args = 10 * (1 + n_val_args)
- | otherwise = 0
- -- The 1+ is for the function itself
- -- Add 1 for each non-trivial arg;
- -- the allocation cost, as in let(rec)
+ one_call _ [] = False
+ one_call fun (arg:args) | fun==arg = case idOccInfo arg of
+ OneOcc _ one_branch _ -> one_branch
+ _ -> False
+ | otherwise = one_call fun args
conSize :: DataCon -> Int -> ExprSize
conSize dc n_val_args
@@ -648,6 +581,21 @@ conSize dc n_val_args
-- [SDM, 25/5/11]
\end{code}
+Note [Function application discount]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We want a discount if the function is applied. A good example is
+monadic combinators with continuation arguments, where inlining is
+quite important.
+
+But we don't want a big discount when a function is called many times
+(see the detailed comments with Trac #6048) because if the function is
+big it won't be inlined at its many call sites and no benefit results.
+Indeed, we can get exponentially big inlinings this way; that is what
+Trac #6048 is about.
+
+So, we only give a function-application discount when the function appears
+textually once, albeit possibly inside a lambda.
+
Note [Literal integer size]
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Literal integers *can* be big (mkInteger [...coefficients...]), but
More information about the Cvs-ghc
mailing list