[Haskell-cafe] GHC optimisations

Tue Aug 21 22:13:46 EDT 2007

Isaac Dupree wrote:
> Simon Peyton-Jones wrote:
> 
>> ...
>> No, constant folding is part of the compiler, I'm afraid, in the 
>> module PrelRules.
>>
>> Simon
> 
> 
> _Constant_ folding is, but in GHC.Base there are rules like (unboxed) 
> multiplying by zero or one, or adding or subtracting zero, from an 
> unknown other (non-constant) value.  I think shifts might be doable via 
> RULES... if you were willing to make one rule for each denominator 2, 4, 
> 8 and so on, which rather depends on max. Int... (and that's not 
> Integers either, I guess)

Just to see what this would look like.

First of all, optimizing mod and div can not be done with PrelRules, 
because they are not primitives, quot and rem are. And most of the nice 
optimizations with shifts no longer work there. But using rules should 
work, assuming the inliner is not too fast.

Multiplication and division can become shifts:

 > {-# RULES
 >
 > -- x * 2^n  -->  x `shiftL` n
 > "x# *# 2#"  forall x#.  x# *# 2# = x# `iShiftL#` 1#
 > "2# *# x#"  forall x#.  2# *# x# = x# `iShiftL#` 1#
 >   -- etc.
 >
 > -- x `div` 2^n  -->  x `shiftR` n
 > "x# `divInt#` 2#"  forall x#.  divInt# x# 2# = x# `iShiftRA#` 1#
 > "x# `divInt#` 4#"  forall x#.  divInt# x# 4# = x# `iShiftRA#` 2#
 >   -- etc.

Mod can become and:

 > -- x `mod` 2^n  -->  x .&. (2^n - 1)
 > "x# `modInt#` 2#"  forall x#.  modInt# x# 2# = andInt# x# 1#
 > "x# `modInt#` 4#"  forall x#.  modInt# x# 4# = andInt# x# 3#
 >   -- etc.
 >
 >   #-}

Here I use a new function (see instance Bits Int),

 > andInt# :: Int# -> Int# -> Int#
 > andInt# x# y# = word2Int# (int2Word# x# `and#` int2Word# y#)

but you could write that inline as well.

A problem with these rules is that you need a whole lot of them. 32 per 
operation (on a 32 bit platform), * 4 operations, * 2 separate versions 
for words and ints = 256.

Other rules that could be interesting are:
 > forall a b. fromInteger a + fromInteger b = fromInteger (a + b)
 > forall a b. fromInteger a * fromInteger b = fromInteger (a * b)
 > -- etc.
To allow optimizations on generic Num code, although I am not sure what 
the Haskell spec has to say about this.

Now, if you want to get really creative you can use other semi-evil 
optimization tricks for quot and rem.
The following is based on code generated by Visual C++:
 > -- remPowInt x y == x `rem` (2^y)
 > remPowInt x y
 >     | r >= 0     =  r
 >     | otherwise  =  ((r - 1) .|. (complement yWithSign)) + 1
 >   where  r = x .&. yWithSign
 >          yWithSign = (1 `shiftL` (bitSize - 1)) .|.
 >                      ((1 `shiftL` y) - 1)
Or in assembly (for y == 2, so x `rem` 4)
 >  and         ecx,80000007h
 >  jns         main+60h (401060h)
 >  dec         ecx
 >  or          ecx,0FFFFFFF8h
 >  inc         ecx

The C++ compiler also performs other optimizations when multiplying with 
other constants, for example *3 becomes something like
 >  lea         eax, [eax+eax*2]
Divisions become horrendous constructs with magic numbers,
 >   -- eax := ecx / 5
 >  mov         eax,66666667h
 >  imul        ecx
 >  sar         edx,1
 >  mov         eax,edx
 >  shr         eax,1Fh
 >  add         eax,edx
But such things are probably best left to the code generator / a 
peephole optimizer, if they are done at all. I think the LEA trick 
should be feasible.

Twan