[commit: ghc] ghc-7.6: Do calcUnfoldingGuidance on the *non* occ-analysed RHS (1363c32)
Paolo Capriotti
p.capriotti at gmail.com
Fri Aug 10 14:32:29 CEST 2012
Repository : ssh://darcs.haskell.org//srv/darcs/ghc
On branch : ghc-7.6
http://hackage.haskell.org/trac/ghc/changeset/1363c32a0d5b79c846530fa16d09076d02f29d1c
>---------------------------------------------------------------
commit 1363c32a0d5b79c846530fa16d09076d02f29d1c
Author: Simon Peyton Jones <simonpj at microsoft.com>
Date: Fri Jul 20 20:07:51 2012 +0100
Do calcUnfoldingGuidance on the *non* occ-analysed RHS
See Note [Calculate unfolding guidance on the non-occ-anal'd expression]
This makes a big difference to residency (530M vs over 800M when compiling
Cabal).
This fixes the majority of the regression in Trac #6104
MERGED from commit cd627bcfda713efa63b7f5290c84a5077f4976f9
>---------------------------------------------------------------
compiler/coreSyn/CoreUnfold.lhs | 79 +++++++++++++++++++++++++-------------
1 files changed, 52 insertions(+), 27 deletions(-)
diff --git a/compiler/coreSyn/CoreUnfold.lhs b/compiler/coreSyn/CoreUnfold.lhs
index 816d34e..85fd9eb 100644
--- a/compiler/coreSyn/CoreUnfold.lhs
+++ b/compiler/coreSyn/CoreUnfold.lhs
@@ -163,7 +163,7 @@ mkUnfolding src top_lvl is_bottoming expr
, not (exprIsTrivial expr)
= NoUnfolding -- See Note [Do not inline top-level bottoming functions]
| otherwise
- = CoreUnfolding { uf_tmpl = occ_anald_expr,
+ = CoreUnfolding { uf_tmpl = occurAnalyseExpr expr,
uf_src = src,
uf_arity = arity,
uf_is_top = top_lvl,
@@ -173,19 +173,35 @@ mkUnfolding src top_lvl is_bottoming expr
uf_is_work_free = exprIsWorkFree expr,
uf_guidance = guidance }
where
- occ_anald_expr = occurAnalyseExpr expr
- (arity, guidance) = calcUnfoldingGuidance occ_anald_expr
- -- Sometimes during simplification, there's a large let-bound thing
- -- which has been substituted, and so is now dead; so 'expr' contains
- -- two copies of the thing while the occurrence-analysed expression doesn't
- -- Nevertheless, we *don't* occ-analyse before computing the size because the
- -- size computation bales out after a while, whereas occurrence analysis does not.
- --
- -- This can occasionally mean that the guidance is very pessimistic;
- -- it gets fixed up next round. And it should be rare, because large
- -- let-bound things that are dead are usually caught by preInlineUnconditionally
+ (arity, guidance) = calcUnfoldingGuidance expr
+ -- NB: *not* (calcUnfoldingGuidance (occurAnalyseExpr expr))!
+ -- See Note [Calculate unfolding guidance on the non-occ-anal'd expression]
\end{code}
+Note [Calculate unfolding guidance on the non-occ-anal'd expression]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Notice that we give the non-occur-analysed expression to
+calcUnfoldingGuidance. In some ways it'd be better to occur-analyse
+first; for example, sometimes during simplification, there's a large
+let-bound thing which has been substituted, and so is now dead; so
+'expr' contains two copies of the thing while the occurrence-analysed
+expression doesn't.
+
+Nevertheless, we *don't* and *must not* occ-analyse before computing
+the size because
+
+a) The size computation bales out after a while, whereas occurrence
+ analysis does not.
+
+b) Residency increases sharply if you occ-anal first. I'm not
+ 100% sure why, but it's a large effect. Compiling Cabal went
+ from residency of 534M to over 800M with this one change.
+
+This can occasionally mean that the guidance is very pessimistic;
+it gets fixed up next round. And it should be rare, because large
+let-bound things that are dead are usually caught by preInlineUnconditionally
+
+
%************************************************************************
%* *
\subsection{The UnfoldingGuidance type}
@@ -237,9 +253,17 @@ calcUnfoldingGuidance expr
, ug_size = iBox size
, ug_res = iBox scrut_discount }
- discount cbs bndr
- = foldlBag (\acc (b',n) -> if bndr==b' then acc+n else acc)
- 0 cbs
+ discount :: Bag (Id,Int) -> Id -> Int
+ discount cbs bndr = foldlBag combine 0 cbs
+ where
+ combine acc (bndr', disc)
+ | bndr == bndr' = acc `plus_disc` disc
+ | otherwise = acc
+
+ plus_disc :: Int -> Int -> Int
+ plus_disc | isFunTy (idType bndr) = max
+ | otherwise = (+)
+ -- See Note [Function and non-function discounts]
in
(n_val_bndrs, guidance) }
\end{code}
@@ -549,8 +573,8 @@ funSize top_args fun n_val_args
-- the allocation cost, as in let(rec)
-- DISCOUNTS
- -- See Note [Function application discounts]
- arg_discount | some_val_args && one_call fun top_args
+ -- See Note [Function and non-function discounts]
+ arg_discount | some_val_args && fun `elem` top_args
= unitBag (fun, opt_UF_FunAppDiscount)
| otherwise = emptyBag
-- If the function is an argument and is applied
@@ -560,12 +584,6 @@ funSize top_args fun n_val_args
| otherwise = 0
-- If the function is partially applied, show a result discount
- one_call _ [] = False
- one_call fun (arg:args) | fun==arg = case idOccInfo arg of
- OneOcc _ one_branch _ -> one_branch
- _ -> False
- | otherwise = one_call fun args
-
conSize :: DataCon -> Int -> ExprSize
conSize dc n_val_args
| n_val_args == 0 = SizeIs (_ILIT(0)) emptyBag (_ILIT(10)) -- Like variables
@@ -615,8 +633,8 @@ shrank binary sizes by 0.5% it also made spectral/boyer allocate 5%
more. All other changes were very small. So it's not a big deal but I
didn't adopt the idea.
-Note [Function application discount]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Note [Function and non-function discounts]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We want a discount if the function is applied. A good example is
monadic combinators with continuation arguments, where inlining is
quite important.
@@ -627,8 +645,15 @@ big it won't be inlined at its many call sites and no benefit results.
Indeed, we can get exponentially big inlinings this way; that is what
Trac #6048 is about.
-So, we only give a function-application discount when the function appears
-textually once, albeit possibly inside a lambda.
+On the other hand, for data-valued arguments, if there are lots of
+case expressions in the body, each one will get smaller if we apply
+the function to a constructor application, so we *want* a big discount
+if the argument is scrutinised by many case expressions.
+
+Conclusion:
+ - For functions, take the max of the discounts
+ - For data values, take the sum of the discounts
+
Note [Literal integer size]
~~~~~~~~~~~~~~~~~~~~~~~~~~~
More information about the Cvs-ghc
mailing list