FreeBSD/amd64 registerised running

Gregory Wright gwright at comcast.net
Sun Apr 8 19:49:24 EDT 2007


Hi Ian, Simon,

I have ghc-6.6 (darcs version from 20070405) running registerized on
FreeBSD/amd64.  The FreeBSD version is 6.2.

The problem with the compiler crash turned out to be simple.  In the
FreeBSD header file regex.h, regex_t is defined as

typedef struct {
         int re_magic;
         size_t re_nsub;         /* number of parenthesized subexpressions */
         __const char *re_endp;  /* end pointer for REG_PEND */
         struct re_guts *re_g;   /* none of your business :-) */
} regex_t;

The problem is that the "re_magic" field is defined as an int, so the
offset of the following size_t field depends on the platform's size_t
alignment.  When building the .hc files on the i386 host, the re_nsub
field is at an offset of 4.  On the amd64 target, it is at an offset
of 8.  In the ghc binding to the regex functions, re_nsub is used to
compute how much memory to allocate in a call to allocaBytes.  With the
wrong offset baked in, this leads to garbage being passed to
newPinnedByteArray#.
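
The offset difference described above can be reproduced with a small
C sketch.  The struct below is a hypothetical stand-in that only copies
the field order of regex_t (it does not include regex.h), and the helper
names are made up for illustration.  It is a sketch of the layout rule,
not the code ghc or hsc2hs actually uses.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <string.h>   /* memcpy */

struct re_guts;       /* opaque, as in the FreeBSD header */

/* Hypothetical stand-in with the same field order as regex_t. */
typedef struct {
    int re_magic;
    size_t re_nsub;
    const char *re_endp;
    struct re_guts *re_g;
} regex_layout_t;

/* Offset of re_nsub as laid out by the compiler for this platform. */
size_t re_nsub_offset(void) {
    return offsetof(regex_layout_t, re_nsub);
}

/* Round n up to the next multiple of align (align must be non-zero). */
size_t round_up(size_t n, size_t align) {
    return (n + align - 1) / align * align;
}

/* Read a size_t from an arbitrary byte offset, like peekByteOff does. */
size_t peek_size_at(const void *base, size_t off) {
    size_t v;
    memcpy(&v, (const char *)base + off, sizeof v);
    return v;
}

/* The offset is sizeof(int) padded up to the alignment of size_t. */
void check_layout(void) {
    assert(re_nsub_offset() == round_up(sizeof(int), _Alignof(size_t)));
}
```

On an ILP32 target such as i386, re_nsub_offset() would be expected to
return 4; on an LP64 target such as amd64, 8.  Calling peek_size_at with
an offset computed on the other platform reads bytes that do not belong
to re_nsub alone, which is the source of the garbage value.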

The fix is to patch libraries/base/Text/Regex/Posix.hs on the amd64
target:

--- libraries/base/Text/Regex/Posix.hs.sav      Thu Apr  5 12:05:22 2007
+++ libraries/base/Text/Regex/Posix.hs  Thu Apr  5 12:05:45 2007
@@ -106,7 +106,7 @@
regexec (Regex regex_fptr) str = do
    withCString str $ \cstr -> do
      withForeignPtr regex_fptr $ \regex_ptr -> do
-      nsub <- ((\hsc_ptr -> peekByteOff hsc_ptr 4)) regex_ptr
+      nsub <- ((\hsc_ptr -> peekByteOff hsc_ptr 8)) regex_ptr
{-# LINE 109 "Posix.hsc" #-}
        let nsub_int = fromIntegral (nsub :: CSize)
        allocaBytes ((1 + nsub_int) * (16)) $ \p_match -> do

With this patch, we are pretty close.  However, there still seems to be
something wrong with the splitter.  I can make a working registerized
compiler if I set SplitObjs=NO in build.mk, but it seems as if whatever
is wrong with ghc-split shouldn't be too hard to fix.

The splitting problem shows up as a linking failure.  Some variables
defined in the text section are changed from global symbols to local
symbols by the splitter.  An example (just one of several hundred
symbols that are changed from global to local):

 From building ghc-6.6-20070405 on i386:

 > nm --defined-only libHSbase.a | grep "D "

<snip>
00000000 D base_TextziReadziLex_zdLr3bklvl122_closure

and from building ghc-6.6-20070405 on amd64:

 > nm --defined-only libHSbase.a | grep "d "

<snip>
0000000000000000 d base_TextziReadziLex_zdLr3bklvl122_closure


The "D" on i386 indicates a global symbol, the "d" on amd64
a local symbol.

I've glanced at ghc-split.lprl, but on what files is it invoked? Can
I run it from the command line on a file and check what comes out?
The file itself doesn't say what it expects as input, and the section
of the Commentary on the splitter is more than terse.

The linker is still broken (so no ghci):

greenhouse-george> ghci
   ___         ___ _
  / _ \ /\  /\/ __(_)
 / /_\// /_/ / /  | |      GHC Interactive, version 6.6.20770405, for Haskell 98.
/ /_\\/ __  / /___| |      http://www.haskell.org/ghc/
\____/\/ /_/\____/|_|      Type :? for help.

ghc-6.6.20770405: internal error: R_X86_64_PC32 relocation out of range: __isthreaded = 0xfffffff800122aad
     (GHC version 6.6.20070405 for x86_64_unknown_freebsd)
     Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Abort trap: 6 (core dumped)

but I think I understand this.  On FreeBSD, mmap does not have the
MAP_32BIT option that Linux does to guarantee a mapping in the first
2 GB of address space.  But by supplying a hint address in the lower
address space we can get the effect of the MAP_32BIT option.  I thought
I had this fixed in the patch I applied to Linker.c, but I have
obviously overlooked something.

I'm continuing to work on the linker, and expect that it will be working
soon.  I'd appreciate some guidance on the splitter question as I am
entirely unfamiliar with it.

Best Wishes,
Greg



