[Haskell-cafe] CPS versus Pattern Matching performance

Wed Jul 11 14:38:50 EDT 2007

On Wed, Jul 11, 2007 at 11:14:20AM +0100, Tony Finch wrote:
> registers or on the stack) and a case analysis branch, but a normal
> function return (predictable by the CPU) is replaced by a less-predictable
> indirect jump. Does anyone have references to a paper that discusses an

GHC does not use calls and returns.  To quote a recent paper on this
subject:

7.1   Using call/return instructions

As we mentioned in Section 2, GHC generates code that manages the
Haskell stack entirely separately from the system-supported C stack. As
a result, a case expression must explicitly push a return address, or
continuation, onto the Haskell stack; and the "return" takes the form of
an indirect jump to this address. There is a lost op- portunity here,
because every processor has built-in CALL and RET instructions that help
the branch-prediction hardware make good predictions: a RET instruction
conveys much more information than an arbitrary indirect jump.

Nevertheless, for several tiresome reasons, GHC cannot readily make use
of these instructions:

* The Haskell stack is allocated in the heap. GHC generates code
  to check for stack overflow, and relocates the stack if necessary.  In
  this way GHC can support zillions of little stacks (one per thread),
  each of which may be only a few hundred bytes long.  However,
  operating systems typically take signals on the user stack, and do no
  limit checking. It is often possible to arrange that signals are
  executed on a separate stack, however.

* The code for a case continuation is normally preceded by an
  info table that describes its stack frame layout. This arrangement
  is convenient because the stack frame looks just like a heap closure,
  which we described in Section 2. The garbage collector can now use the
  info table to distinguish the point- ers from non-pointers in the
  stack frame closure. This changes if the scrutinee is evaluated using
  a CALL instruction: when the called procedure is done, it RETurns to
  the instruction right after the call. This means that the info table
  can no longer be placed before a continuation. Thus the possible
  benefits of a CALL/RET scheme must outweigh the performance penalty of
  abandoning the current (efficient) info table layout.

Stefan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : http://www.haskell.org/pipermail/haskell-cafe/attachments/20070711/c75c3eca/attachment.bin