[commit: ghc] newcg: fixes to the mini-inliner (fixes stage2 crashes) (9a32e71)
marlowsd at gmail.com
Tue Mar 6 17:54:11 CET 2012
On 06/03/2012 16:02, Johan Tibell wrote:
> On Tue, Mar 6, 2012 at 7:23 AM, Simon Marlow <marlowsd at gmail.com> wrote:
> > However it's possible that we'll replace the mini-inliner entirely.
> > It has always been a bit of a hack, and the new code generator is
> > quite effective at exposing its shortcomings. Edward's
> > CmmRewriteAssignments pass was supposed to be the glorious
> > replacement, but it is too inefficient to use (I have it turned off
> > on the newcg branch right now).
> There's a bunch of loop unrolling that depends on us being able to
> inline constants, so it'd be nice if we had some forward propagation in
> place before we switch to the new CG.
What loop unrolling? In LLVM, you mean?
Right now my focus is on getting the new CG to generate approximately as
good code as the old one without being too much slower. Once we've
achieved that, we can throw away the old CG, and clean up a *lot* of
code (which will improve performance a bit further). Then, with the new
CG in place we have a better basis for adding optimisations.
> In particular, it'd be nice if we
> have enough performance headroom to add the passes we need to the new CG
> without large slowdowns in compilation speed.
I agree it would be nice. But we can't expect the new CG to be as fast
as the old CG, because it uses more passes and is inherently more
flexible (that's the whole point). So we have to accept a little
slowdown in exchange for a better architecture.
> Aside: isn't the whole point of Hoopl that you're supposed to be able to
> have all these separate passes and have them run efficiently? If we can only
> use a few passes due to the overheads, what does Hoopl buy us?
Eventually Hoopl should buy us the ability to do some great
optimisations at the C-- level, before we've split the code into small
chunks and sent it over the wall to LLVM. But right now my goal is to
get code that is no worse than before, and hopefully a bit better (I'm
getting better code in some cases already, so there's hope!).
I'm not saying we can't have expensive optimisation passes. Only that
we cannot *rely* on expensive optimisation passes to get code that isn't
completely stupid. The expensive optimisation passes should be for
eking out that last 2%, and should be enabled by -O2.