[Yhc] Still shrinking YHI for ARM7...question

Tue Jul 24 09:21:15 EDT 2007

Alexis Morris wrote:
> Hi Tom,
> 
> Thanks alot for the information. Those are great places to start. As far 
> as loading the bytecode goes, for now I am still looking into that, but 
> will have to load directly from memory, as you mentioned.
> 
> Cheers,
> </Alexis>

Hmm yes the bytecode loading is going to be tricky I suspect ...

You could just convert the bytecode files into C source code 
representations and then load them using the module loader as usual. The 
disadvantage of this is it's a bit space inefficient. You use the static 
memory for the bytecode files and then the bytecode loader will load 
those into dynamic memory.

The method I would recommend would be to split yhi into two parts: the 
part that loads the bytecode from the file and the part that runs the 
bytecode. Essentially your 'loader' program would load the bytecode from 
the file into memory and then output that data as C source code. That C 
source code could then be compiled and linked in with the main 
interpreter (this is actually similar to how nhc98 does it).

This is rather involved and it depends on quite a bit of knowledge of 
how Yhc works internally, so I'll describe how I'd do it in detail. 
You'll have to bear with me if I go over stuff you're already familiar 
with. Also if you have your own ideas and don't want to do it this way 
that's great too :-)

I suggest you read http://haskell.org/haskellwiki/Yhc/RTS/Heap if you 
haven't already as that describes the heap layout of Yhc. Essentially 
the idea is to take the nodes loaded into the heap by the loader and 
convert them into C source code that does the same thing.

Ok lets start in main.c:init

/* initialize program */
void init(char* mainMod, Node** mainFunc, Node** _toplevel,
	  FInfo** _driver){
   /* inits */
   sanity_init();
   heap_init(G_options.heapSize);
#ifdef HAT
   hgm_init(mainMod);
#endif
   mod_init();

   /* load all globals */
   initGlobals(mainMod, mainFunc, _toplevel, _driver);

   /* initialize the threads system */
   yhi_thread_init();
   hsffi_init();

   /* finished with the module system now */
   /* mod_exit(); ... not any more, now we still need it! */
}

After calling 'initGlobals' all the modules are stored in the G_modules 
hashtable in modules.c. You can walk over this hashtable to get a list 
of all module objects, with something like:

   void process_modules(){
     for (int i = 0; i < G_modules->size; i++){
       for (HashLink* p = G_modules->table[i]; p; p = p->next){
         Module* mod = (Module*)p->value;
         process_module(mod);
       }
     }
   }

Then you need to process the Objects in each module. The code to do this 
might be something like:

   void process_module(Module* mod){
     for (int i = 0; i < mod->lookup->size; i++){
       for (HashLink* p = mod->lookup->table[i]; p; p = p->next){
          ObjectRef* ref = (ObjectRef*)p->value;
          Object* obj = ref->object;
          if (!obj){
	    // then this object was known about but never actually
             // used so skip it
             continue;
          }
          process_object(mod, p->key, obj);
       }
     }

Each Object has an Info, or a Node or both.

   void process_object(Module* mod, Char* name, Object* obj){
     if (obj->info){
       process_info(mod, name, obj->info);
     }
     if (obj->node){
       process_node(mod, name, obj->node);
     }
   }

Looking in node.h we can see that an Info has a tag, this tag says what 
kind of Info this is. There are three sensible values that you will 
actually see: I_FINFO, I_XINFO and I_CINFO.

   void process_info(Module* mod, Char* name, Info* info){
     switch(info->tag){
     case I_FINFO: process_finfo(mod, name, (FInfo*)info); break;
     case I_XINFO: process_xinfo(mod, name, (XInfo*)info); break;
     case I_CINFO: process_cinfo(mod, name, (CInfo*)info); break;
     default: abort();
     }
   }

Okay so FInfo and XInfo are very similar, so much of what applies to one 
applies to the other. If we look at 
http://haskell.org/haskellwiki/Yhc/RTS/Heap we can see that an FInfo (or 
XInfo) is laid out in memory exactly as:

    +-----------------+
    | PInfo   0 / n   |            size 0, need n
    +-----------------+
    | PInfo   1 / n-1 |            size 1, need n-1
    +-----------------+
            ...
    +-----------------+
    | PInfo   n / 0   |            size n-1, need 0
    +-----------------+
    | FInfo   ....    |            function info table

i.e. The PInfos associated with that FInfo, directly followed by the 
FInfo itself. We need to generate C source code that does the same 
thing. For example for the C source code for the PInfos and FInfo for 
the Prelude.sum function might look like:

PInfo  pinfo_Prelude_sum[] = {
	(PInfo){ { P_INFO }, 0, 1 },
	(PInfo){ { P_INFO }, 1, 0 }
};
FInfo  finfo_Prelude_sum = ...

If my memory of C serves me correctly if we compiled and linked that 
source code it it would load them into memory consecutively (and without 
gaps).

Our function to process finfos  would thus look like

   void process_finfo(Module* mod, Char* name, FInfo* finfo){
      // process pinfos
      process_pinfos(mod, name, finfo->arity, finfo->papTable);

      // write finfo structure
      fprintf(cSrcFile, "FInfo finfo_%s_2E%s = (FInfo){\n",
		mod->name, name);

      ... write all the finfo fields
   }

Now here I've just printed the mod->name and name, but in actual fact 
you'd need to escape them first to ensure they were valid C identifiers. 
This is because the module names could contain '.' and ';'(another 
separator) and object names could refer to operators (like '+').
The escaping system used by nhc was that any non-alpha-numeric was 
converted to _XX where XX was the hexadecimal of the ASCII. 'Prelude.+' 
would thus be: Prelude__2B. I've used this system in my examples but any 
non-clashing system would do.

process_finfo would write out the FInfo structure to the C source file. 
This would mostly just follow the definition in node.h, but there are a 
few tricky items in the structure.
    - papTable: this always points to the first pinfo associated with the
                finfo (i.e. PInfo 0 / n).
    - link: just use NULL it's only used inside the GC
    - module: again use NULL, only used in the reflection API
    - name: NULL is fine again, only needed for debugging etc.
    - code: you'll need to allocate a block with the instructions in it
            like

            static Byte bytecode_Prelude_2Esum[] = { 0xA4, 0x43, ... };

            then output the pointer as &bytecode_Prelude_2Esum
    - constTable: described below

The constant table is used to allow a FInfo to access other constants 
and functions. Each item in the table is a ConstItem and has an 
associated const type, you might process the constants as

    void process_constants(Module* mod, Char* name, FInfo* finfo){
      for (int i = 0; i < finfo->numConsts; i++){
        ConstType type = finfo->constTypes[i];
        ConstItem item = finfo->constItem[i];
        if (type == CI_INFO){
          process_info_constant(mod, name, (Info*)item);
        }else if (type == CI_NODE){
          process_node_constant(mod, name, (Node*)node);
        }
      }
    }

A CI_INFO links to an Info structure of some sort, and a CI_NODE links 
to a heap node. The constant table needs to contain *links* to the 
structures in memory so you'll want your generated constant table to 
look something like:

    ConstItem consttable_Prelude_2Esum = {
       &finfo_Prelude_2E_2B, &finfo_Prelude_2Esum
    };

Fortunately you can do this because every Info that the constant will 
point to will be a FInfo, XInfo or CInfo. These all store their parent 
module and their name so you can work out what the reference to them is 
called.

Okay so that does with process_finfo. process_pinfos would be something like

    void process_pinfos(Module* mod, Char* name, Int num, PInfo* pinfos){
       fprintf(cSrcFile, "PInfo pinfo_%s_2E%s[] = {\n",
		mod->name, name);
       for (int i = 0; i < num; i++){
         fprintf(cSrcFile, "(PInfo){ { I_PINFO }, %d, %d },\n",
           pinfos[i].size, pinfos[i].need);
       }
       fprintf(cSrcFile, "};\n");
    }

process_cinfo should be easy, but process_xinfo will be hard. The 
trickiest point is the ffiFunc field.
Creating a FFIFunction structure is okay but the 'funcPtr' field in
FFIFunction (see hsffi.h) is tricky. You can't output a function pointer 
as C source so you need the name of the function as it appears in the 
runtime source code. Your best option is probably to use 
primitive.c:G_primFuncs, this maps names to function pointers. Using 
this you could do a reverse lookup from pointers to function names. In 
the vast majority of cases the name of the function in the source code 
is the same as the one in G_primFuncs. If it isn't you can change the C 
function name so it is named the same, you can just use compiler errors 
from your generated C code to get it to tell you when the names are 
different.

Okay Nodes are the next tricky point, the problem with them is that the 
garbage collector expects them to be in it's malloced heap area so if 
you allocate them in static memory it'll have a tantrum. Fortunately the 
changes to stop it having a tantrum are (I think) fairly small. In 
mark.c:mrk_isMarked, change

   if (!(p >= (Node*)G_hpStart && p < (Node*)G_hp)){
     ASSERT(p >= (Node*)G_hpStart && p < (Node*)G_hp);
   }

to

   if (!(p >= (Node*)G_hpStart && p < (Node*)G_hp)){
     return true;
   }

in mark.c:mrk_mark

   ASSERT(p >= (Node*)G_hpStart && p < (Node*)G_hp);

to

   if (!(p >= (Node*)G_hpStart && p < (Node*)G_hp)){
     return true;
   }

These two changes make the mark process ignore nodes outside the heap, 
the final one is:

jonkers.c:jonk_thread, change

   ASSERT(*p >= (Node*)G_hpStart && *p <= (Node*)G_hp);

to

  if (!(*p >= (Node*)G_hpStart && *p <= (Node*)G_hp)){
    return;
  }

Now hopefully that should work but the GC is a bit temperamental, so if 
you can't get it to work then you could try loading the nodes into the 
heap in a pre-processing pass.

As for how to name the generated nodes, I would suggest just using the 
printed hexadecimal of the pointer. i.e. a Node* might be

   Node node_0x34a82b = ...

The extra complication is 'unusual nodes' such as INode, LongNode,
FloatNode, DoubleNode, StringNode. Fortunately you can tell when you've 
got one of these by looking at the infos in primitive.h. INode's always 
have the info &G_infoInt, FloatNodes always &G_infoFloat, etc. You can 
test the info of the node you've got an generate the right one.

   void process_node(Node* node){
      Info* info = NODE_INFO(node);
      if (info == &G_nodeInt){
        process_int_node((INode*)node);
      }else if (info == &G_nodeFloat){
        process_float_node((FloatNode*)node);
      }
      ...
      }else{
        // normal zero arg node case
        fprintf(cSrcFile, ...);
      }
   }

The only 'special' nodes you'll see are: G_infoInt, G_infoString, 
G_infoDouble, G_infoFloat and G_infoInteger.

So that's pretty evil, but I believe if you do that then you can load 
bytecode, in very small amounts of memory. I've included a little 
example of the generated C source code for two simple functions (one 
FInfo and one XInfo) ...

Enjoy ;-)

Tom

-------------- next part --------------
/*

this is the C code code that might be generated for the Haskell source

sum :: [Int] -> Int
sum []     = 0
sum (x:xs) = x + sum xs

Of course you'd also need a header file declaring all these items (to allow cyclic definitions)

*/

// the PInfo and FInfos
PInfo pinfo_Prelude_2Esum[] = {
    (PInfo){ { I_PINFO }, 0, 1 },           // 0 arguments present, 1 remaining
    (PInfo){ { I_PINFO }, 1, 0 },           // 1 arguments present, 0 remaining
};

FInfo finfo_Prelude_2Esum = (FInfo){
        { I_FINFO },
        pinfo_Prelude_2Esum,                    // papTable
        NULL,                                   // link
        2,                                      // arity
        0,                                      // stack (unused)
        0x00,                                   // flags (FFL_NONE, unused for your application)
        NULL,                                   // module name (ignored)
        NULL,                                   // function name (ignored)
        23,                                     // code size in bytes
        &bytecode_Prelude_2Esum,                // pointer to code
        3,                                      // number of constants in the const table
        &ctypes_Prelude_2Esum,                  // pointer to constant types
        &ctable_Prelude_2Esum                   // constant table items
    };

// The bytecode for the function
UInt8 bytecode_Prelude_2Esum[] = { 
    0x43, 0xA2, 0x49 .... 
};

// the constant types
UByte ctypes_Prelude_2Esum[] = { C_INFO, C_INFO, C_NODE };

// the constant table
ConstItem ctable_Prelude_2Esum[] = {
    (ConstItem)&finfo_Prelude_2E_2B,          // Prelude.+
    (ConstItem)&finfo_Prelude_2Esum,          // Prelude.sum
    (ConstItem)&node_0x43A22917,            // a node representing the Int '0'
};

// the application of sum to zero arguments, this is shared since
// it is always the same.
Word node_0x45BB1934[] = { (Word)&pinfo_Prelude_2Esum[0] };

// a node representing the Int '0'
INode node_0x43A22917[] = (INode){ 
    { (Word)&G_infoInt },       // node info
    0                           // node value
};

/*

This is the C code for a primitive function, like Yhc.Primitive.primExitWith 

*/
PInfo pinfo_Yhc_2EPrimitive_2EprimExitWith[] = {
    (PInfo){ { I_PINFO }, 1, 0 },
    (PInfo){ { I_PINFO }, 0, 1 }
};

XInfo xinfo_Yhc_2EPrimitive_2EprimExitWith[] = {
    { I_XINFO },                                    // info tag
    pinfo_Yhc_2EPrimitive_2EprimExitWith,             // pap table
    NULL, 1, 0, NULL, NULL,                         // link, arity, stack, module, name
    &ffi_Yhc_2EPrimitive_2EprimExitWith
};

FFIFunction ffi_Yhc_2EPrimitive_2EprimExitWith[] = {
    CC_PRIMITIVE,                                          // calling convention (CC_PRIMITIVE for primitive)
    1,                                                     // arity
    { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,      // arg types (can be ignored)
      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, },  
    0,                                                     // ret type (can also be ignored)
    &primExitWith,                                         // pointer to C function in builtin/System.c   
};  
// note you could save space here by trimming the FFIFunction to be the bare minimum that you need
// in particular the arg types are especially wasteful.

// indeed you could even trim the FInfo structure to remove the unneeded items: stack, flags, module and name
// (link is necessary however).