Someone else had done most of the hard work long ago to port GHC to HPUX PA-RISC (thank you!), but the port had fallen into desuetude. This eased my task quite a bit, but there were a couple of problems whose source is undoubtedly changes made to GHC since the original port.

One problem is that PA-RISC is very strict about its alignment requirements: data of a given size must be aligned on an address boundary of that size. This caused problems for GMP. By default GMP uses 64-bit operations under hppa2.0n, but GHC does not guarantee 64-bit alignment for the memory that it passes to GMP. This causes bus error core dumps. I spent some time looking into fixing the alignment in GHC, but in the end found it more expedient to compile GMP configured for 32-bit operations. No doubt this is a performance hit and should be revisited.

A more serious problem is GCC bug 32820. For a registerised port, GHC instructs GCC to keep certain global STG variables in machine registers. The GCC optimizer, however, erroneously eliminates stores to these registers in some cases; I assume this is because it has lost track of the fact that they are really global variables, not just machine registers. I was unable to find a GCC version that did not have this bug, or an option flag combination to work around it. The bug is cross-platform and present in such old versions of GCC (2.95) that I assume that registerised builds of recent GHC versions using the mangler have not been done by anyone. As a result I had to turn off GCC optimization, which is no doubt a serious performance hit.

The GHC binary itself is large, so large that I had problems linking it because branch offsets exceeded the range available for the addressing modes that GCC was using. So I had to add the -mlong-calls option to make GCC use a different mechanism for branching. This is undoubtedly another performance hit and certainly more code bloat (a 2-instruction call sequence changes to 12 instructions plus some data words).
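
As an illustration of the alignment rule described earlier (data of a given size must sit on an address that is a multiple of that size), here is a minimal Python sketch of the arithmetic involved. The helper names and the sample address are hypothetical and are not taken from GHC or GMP; the sketch only shows why memory that is 4-byte aligned is acceptable for 32-bit operations but not for 64-bit ones.

```python
def is_aligned(address: int, size: int) -> bool:
    """Return True if `address` is a multiple of `size` (size in bytes)."""
    return address % size == 0


def align_up(address: int, size: int) -> int:
    """Round `address` up to the next multiple of `size` (a power of two)."""
    return (address + size - 1) & ~(size - 1)


# Hypothetical address of a buffer that is only 4-byte aligned.
buffer_address = 0x40001004

# A 32-bit (4-byte) access at this address satisfies the rule.
print(is_aligned(buffer_address, 4))

# A 64-bit (8-byte) access at the same address does not.
print(is_aligned(buffer_address, 8))

# The next address at which a 64-bit access would be allowed.
print(hex(align_up(buffer_address, 8)))
```

This matches the workaround in the text: limiting GMP to 32-bit operations means that 4-byte alignment is enough.
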
It might be possible to eliminate -mlong-calls if the GCC optimizer could be used, because I suspect that it would make the code much smaller.

HPUX PA-RISC addresses are a little strange. They are not flat 32-bit addresses: the top two bits are actually segment selectors. A side effect of this is that the HPUX linker forbids mixing of text and data addresses. When the mangler moves data (e.g. info tables) into the text segment, it is necessary to also fix up the references to these data in the GHC runtime system. For this purpose there is a "mini" mode for the mangler that does this.

Another peculiarity is that HPUX 11.0 has a number of small static wrapper functions in the system header files. These functions end up being inserted into just about every object created by GHC, causing significant code bloat and whining from the mangler over the presence of functions that it does not expect. The fix is to modify the headers to change the functions to be "static inline" and then turn on inlining when invoking GCC. I have included patches that I made to the GCC headers for this (some of the files must be copied from /usr/include to the GCC include directory before applying the patch, because GCC does not apply any fixes to them and so does not make a private copy of them).

The 6.6.1 mangler script also has a couple of bugs in it where someone referenced $symb_ inside a string where they meant to reference ${symb}_. I also fixed a few places where \1 was used where $1 was meant, in regular expression substitutions.

The machine on which I did this port is about 10 years old. It has 4 hppa 2.0 processors running at 240 MHz. Builds on this machine starting with a registerised GHC take about 10 hours. Part of this is, I believe, due to a gmake problem or possibly something in the makefiles: parts of the build do not seem to make use of more than one processor.

Joe Buehler
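
The $symb_ versus ${symb}_ mix-up mentioned above can be illustrated with a small analogy. The mangler itself is a Perl script; the sketch below instead uses Python's string.Template, which happens to apply the same rule that an underscore directly after a name is taken as part of that name. The symbol value "foo" is hypothetical, and Perl's handling of an undefined variable differs from the KeyError raised here, so this shows only the parsing rule and not the mangler's actual behaviour.

```python
from string import Template

values = {"symb": "foo"}  # hypothetical symbol name

# "$symb_" is read as a placeholder named "symb_", which is not defined.
wrong = Template("$symb_")

# Braces end the name after "symb", so the underscore is literal text.
right = Template("${symb}_")

try:
    wrong.substitute(values)
except KeyError as missing:
    print("undefined placeholder:", missing)

# safe_substitute leaves the unknown placeholder untouched instead of raising.
print(wrong.safe_substitute(values))

# The braced form yields the symbol followed by a literal underscore.
print(right.substitute(values))
```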