HaskellWiki

Haskell | Wiki community | Recent changes
Random page | Special pages

 

Not logged in
Log in | Help

Yhc/RTS/Heap

< Yhc | RTS

Part of Yhc

(Download)

The Yhc heap is a garbage collected area of memory in which the running program introduces heap nodes. In this document I will try to use 'heap' to exclusively mean this garbage collected part of the heap. Any other memory used by Yhc will be refered to as 'memory'.

Contents

1 Heap Node structure

The only kind of structure in the Yhc heap are heap nodes. The structure of heap nodes is defined in src/runtime/BCKernel/node.h

Every heap node is layed out in memory as

+----------+----------+ . . +----------+
|  header  |  arg 0   |     |  arg n   |
+----------+----------+ . . +----------+

Where each block is exactly one machine word. The header is a bit packed structure with layout

n                     2    0 
+---------------------+----+
|  info ptr           | fl | 
+---------------------+----+

The node flags (fl) occupy bits 0 and 1 of the header, with the rest taken by the info ptr.

1.1 Node flags

To understand the meaning of the node flags it's necessary to consider the life cycle of a heap node. There are two types of heap nodes 'constructors'(CON) and 'applications'. Applications are either partial applications (PAP) or thunks awaiting evaluation (THUNK). Constructors are simple data items created by the program.

Constructors have a very simple life cycle, they are introduced into the heap and eventually they are no longer needed and are garbage collected. Applications are more complicated:

The node flags of a node reflect this life cycle: there are four possible values for the 2 flag bits, indicating the different states of the node.

Ignoring the GC flag, a constructor always has 'NORMAL' flag and an application goes from NORMAL -> HOLE -> IND.

1.2 Node Info

Every heap node which is a 'NORMAL' or a 'HOLE' has an info pointer. This pointer points to information (stored in memory, not the heap) about the heap node (representing by the structure Info in node.h). Most importantly it is the Info structure which identifies the type of the heap node: CON, THUNK or PAP.

Concurrency slightly complicates the issue here because for nodes which are "blocked" the info pointer may point to a list of processes chained together (with the original info at the end of the chain). These are the processes which are waiting on the result of the node being evaluated.

The Info structure for a Node is defined as.

struct _Info {
  HUInt    tag;
};

Tag is a half-word indicating what type of info this is, there are four possible values:

If the tag of an 'info ptr' is a CINFO that the info ptr actually points to a CInfo structure, if the tag is PINFO it points to a PInfo structure, etc.

1.2.1 CInfo

The CInfo structure has the following layout:

struct CInfo {
  HUInt        tag;     /* 'inherited' from Info */
  HUInt        size;    /* number of arguments to the constructor */
  Char*        name;    /* name of the constructor */
  HUInt        number;  /* tag number: e.g. 0 for False, 1 for True, etc. */
  HUInt        flags;   /* flags controlling GC */
};

flags is a combination of properties about the constructor which aid garbage collection:

1.2.2 PInfo

Information about an application to some function

struct PInfo {
  HUInt           tag;   /* inherited from Info */
  QUInt           size;  /* the no. of arguments does this application has */
  QUInt           need;  /* the no. of arguments this application is missing */
};
  

Which function this is an application to is not encoded directly in the PInfo structure and is instead stored by laying out things careful in memory. The information about a function (both PInfo and FInfo structures) is always layed out as:

+-----------------+
| PInfo   0 / n   |            size 0, need n
+-----------------+
| PInfo   1 / n-1 |            size 1, need n-1
+-----------------+
        ...
+-----------------+
| PInfo   n / 0   |            size n-1, need 0
+-----------------+
| FInfo   ....    |            function info table
.                 .
.                 .

This way the corresponding FInfo for a given PInfo is given by

(FInfo*)(pinfo + pinfo->need + 1)

this is calculated by the PINFO_FINFO macro in node.h

1.2.3 FInfo

Information about a function: no application will have an info pointer directly to a FInfo, instead it'll point to a PInfo from which the FInfo can be found.

struct FInfo {
   HUInt        tag;        /* inherited from Info */
   PInfo*       papTable;   /* pointer back to the first item in the
                               paptable for this function */
   FInfo*       link;       /* used by the GC */
   HUInt        arity;      /* function arity */
   HUInt        stack;      /* unused - formely function stack usage */
   Char*        name;       /* function name */
   CodePtr      code;       /* pointer to the bytecode instructions for
                               this function */
   HUInt        numConsts;  /* number of constants in the constant table */
   UByte*       constTypes; /* pointer to a table of constant table
                               type information */
   ConstItem*   constTable; /* constant table constants */
};     

Every function has a constant table which holds references to global objects that can be referenced by a function. For example a function such as:

f = g (1 : [])

would need a reference to the FInfo for 'g', as well as the CInfo for ':' and the heap node that represents '[]'. These references are stored in the constant table.

There are two types of object in the constant table: references to nodes in the heap (C_NODE) and references to other Info structures (C_INFO). The types of each constant are given by the bytes of 'constTypes': each byte gives the type of consequetive constant table items.

The constTable itself is an array of ConstItems which are pointers to either heap nodes or Info structures (obviously depending on the constant type).

NOTE: it is by recursively scanning the constant table of a functions that the module loading system knows what entries to load from the bytecode file (see Yhc/RTS/Modules)

1.2.4 XInfo

XInfo is very similar to a FInfo except it refers to an external (i.e. primitive or FFI) function. The only difference is that instead of a code pointer and constant table it has a C function pointer to the external function. The function is assumed to be of the form:

Node* function(Node*);

Where the function takes the application node (including arguments) and returns the application node of the result (see Yhc/RTS/Primitive and Yhc/RTS/Foreign).

1.2.5 TInfo

TInfo is an info structure for a thread, this is not a real Info structure per-se but it is used to store blocked process lists see Yhc/RTS/Concurrency.

Retrieved from "http://www.haskell.org/haskellwiki/Yhc/RTS/Heap"

This page has been accessed 3,026 times. This page was last modified 22:22, 19 July 2006. Recent content is available under a simple permissive license.