Ministg

From HaskellWiki
Revision as of 06:06, 20 August 2009 by Bjpop (talk | contribs)
Jump to navigation Jump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

Ministg

Ministg is an interpreter for a high-level, small-step, operational semantics for the STG machine. The STG machine is the abstract machine at the core of GHC. The operational semantics used in Ministg is taken from the paper Making a fast curry: push/enter vs. eval/apply for higher-order languages by Simon Marlow and Simon Peyton Jones (hereafter referred to as the "fast curry" paper). Please consult the paper for a detailed description of the semantic rules.

Ministg follows the rules in the paper very closely, to the point that each evaluation rule in the paper corresponds to an equation in the interpreter (specifically: equations of the function smallStep). This makes is easy to check that the interpreter follows the rules properly, and it also makes is easy to understand how new rules can be added, and what effect they might have.

Motivation

The main motivations for Ministg are twofold:

  1. To illustrate the dynamic behaviour of the STG machine.
  2. To provide a simple testbed for STG extensions.

It is expected that Ministg will be useful two main groups of people:

  1. Researchers who are working on the STG machine (and GHC's backend in general).
  2. Students who are learning about the implementation of programming languages (especially of the lazy functional kind).

Feature summary

The following is a summary of the main features of Ministg; they are discussed in more detail below:

  • Support for both the push/enter and eval/apply rules for partial function applications.
  • Optional tracing of program execution, rendered in HTML.
  • Optional garbage collection.
  • Optional standard Prelude of useful functions (much like Haskell).
  • Optional call-stack tracing and stack dumping on errors.
  • Program annotations for call-stack tracing, similar to GHC's scc annotation for cost centre stacks.
  • Optional automatic call-stack annoation of top-level declarations in the program, similar to GHC's --auto-all option for profiling.

Download

You can get the latest version of the code from the darcs repository hosted on patch-tag using this command:

 darcs get http://patch-tag.com/r/ministg/pullrepo ministg

Source language

Ministg accepts programs written in the core syntax from the "fast curry" paper. At the moment programs are not type checked. Here is a definition of a program which sums a list of three numbers:

nil = CON(Nil);
one = CON(I 1);
two = CON(I 2);
three = CON(I 3);

plusInt = FUN(x y ->
   case x of {
      I i -> case y of {
               I j -> case plus# i j of {
                         x -> let { result = CON (I x) } in result }}});

foldl = FUN(f acc list ->
   case list of {
      Nil -> acc;
      Cons h t -> let { newAcc = THUNK(f acc h) } in foldl f newAcc t
   });

# lazy sum with a well-known space leak
sum = FUN(list -> foldl plusInt zero list);

list1 = CON(Cons one nil);
list2 = CON(Cons two list1);
list3 = CON(Cons three list2);

main = THUNK(sum list3);

There are several things to note about the syntax:

  • Layout is explicit like C: semi-colons delimit statements, and braces group blocks (let and case expressions). Semi-colons are also allowed at the end of the last statement in a block, but are not required.
  • Line comments begin with a hash character and extend to the end of the line.
  • Whitespace is ignored except (of course) to delimit variable and keyword tokens.
  • A few primitive operators are provided, such as plus#. They all end in a hash character.
  • Integer literals represent unboxed integers. Boxed integers (like the Int type in Haskell) must use the I data constructor.

Modules are not supported, so all of the code for your program must be written in one file. However, for convenience, Ministg provides a Prelude of standard functions, which is automatically imported into your program (but it can be disabled if necessary). Variables defined at the top-level of your source program take precedence of variables defined in the Prelude, so you can re-define Prelude functions if you want to.

Command line arguments

Short form Long form Meaning
-a --annotate automatically annotate the program with stack markers
-s STYLE --style=STYLE evaluation STYLE to use (EA = eval apply, PE = push enter)
-t --trace record a trace of program evaluation
--tracedir=DIR directory (DIR) to store trace files
-m STEPS --maxsteps=STEPS maximum number of evaluation STEPS to record in trace
-c --callstack enable call stack tracing
--noprelude do not import the Prelude
--nogc disable garbage collector
-d DUMPED --dump=DUMPED output DUMPED for debugging purposes (ast, parsed, arity)
-v --version show version number
-h --help get help about using this program

Execution tracing

One of the most useful features of Ministg is its ability to trace the steps of program execution. If tracing is enabled, Ministg will save each step of execution to a HTML file. Each such file contains the entire state of the STG machine at that point in the computation: namely the code, stack and heap. Here is an example trace. The files are linked together allowing you to step forwards and backwards through the execution.

Evaluation style: push/enter versus eval/apply

The "fast curry" paper provides two different sets of rules for the operational semantics, which differ in the way that applications of higher-order functions are handled. Ministg supports both sets of rules. By default it will use the push/enter rules, but you can specify which rule set to use with the -s command line argument (see the documentation for command line arguments above).

For example, to use the eval/apply rules, invoke Ministg like so:

ministg -s ea

or equivalently:

ministg --style=ea

Assuming there are no bugs in the implementation of Ministg, both rule sets should produce the same final answers for all valid input programs, including non-termination. Obviously they can lead to different execution traces in programs which use higher-order function application.

Runtime errors

Call stack tracing

Stack annotations

Automatic program annotation