Fun with GHC's optimiser

Simon Peyton-Jones
Thu, 2 Nov 2000 04:46:41 -0800

I can never resist messages like these, even when I'm meant
to be doing other things.  It's very helpful when people offer
fairly precise performance-bug reports.  Thanks!

| I am wondering whether there is a particular reason why the
| optimiser doesn't pull the
|   (1)  a = NO_CCS PArray! [wild1 mba#];

This one is a definite bug.  It turns out that the head of the
before-ghci-branch doesn't have this bug, so I'm disinclined
to investigate it further.  

|   (2)  case w of wild3 {
|          I# e# ->
| As for (2), the loop would be nice and straight if that
| unboxing were outside of the loop - as it is, we break the
| pipeline once per iteration, it seems.

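This one is a bit harder.  Basically we want to make a wrapper
for a recursive function if it's sure to evaluate its free variables,
so that the evaluation (here, the unboxing of w) happens once in
the wrapper instead of once per iteration.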

In fact the 'liberate-case' pass (which isn't switched on in 4.08)
is meant to do just this. It's in simplCore/LiberateCase.lhs,
and it's not very complicated.  I've just tried it and it doesn't seem
to have the desired effect, but I'm sure that's for a boring reason.
If anyone would like to fix it, go ahead!

(You can't just say '-fliberate-case' on the command line to make
it go; you have to add -fliberate-case at a sensible point to the
minusOflags in driver/Main.hs.)
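
As a hand-written illustration of the transformation wanted for (2),
here is a minimal sketch.  It is not GHC output and not the code from
the original report: the names sumBoxed, sumHoisted and go are made up
for the example, and it assumes a GHC that supports the MagicHash
extension.  The first version scrutinises the free variable w on every
iteration; the second does the unboxing once, outside the loop.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (Int (I#))

-- Hypothetical "before" version: the loop 'go' mentions the boxed
-- free variable w and scrutinises it on every iteration.
-- Assumes n >= 0.
sumBoxed :: Int -> Int -> Int
sumBoxed w n = go n 0
  where
    go :: Int -> Int -> Int
    go k acc = case w of
      I# e# -> if k == 0 then acc else go (k - 1) (acc + I# e#)

-- Hypothetical "after" version: w is unboxed once, in a wrapper
-- around the loop.  This is safe here because 'go' always evaluates
-- w as soon as it is called, so forcing w early changes nothing.
sumHoisted :: Int -> Int -> Int
sumHoisted w n = case w of
  I# e# ->
    let go :: Int -> Int -> Int
        go k acc = if k == 0 then acc else go (k - 1) (acc + I# e#)
    in go n 0

main :: IO ()
main = print (sumBoxed 3 4, sumHoisted 3 4)
```

The two functions agree on all non-negative n.  The point of the
"sure to evaluate" condition is visible in the sketch: if go could
return without ever looking at w, hoisting the case would make the
function stricter than the original.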

Incidentally, you'll find that -ddump-simpl gives you a dump that
is pretty close to STG and usually much more readable.  Most
performance bugs show up there.  -dverbose-simpl gives you more
clues about what is happening where.

| Also if somebody is looking at the attached source, I was
| wondering why, when I use the commented out code in
| `newPArray', I get a lot worse code (the STG code is in a
| comment at the end of the file).  In particular, the lambda
| abstraction is not inlined, whereas `fill' gets inlined into
| the code of which the dump is above.  Is it because the
| compiler has a lot harder time with explicit recursion than
| with fold/build?  If so, the right RULES magic should allow
| me to do the same for my own recursively defined
| combinators, shouldn't it?

I couldn't figure out exactly what you meant.  The only commented
out code is STG code.  Maybe send a module with the actual
source you are bothered about.