PIC status

Wolfgang Thaller wolfgang.thaller at gmx.net
Sun Aug 1 17:56:31 EDT 2004


Sorry for not answering earlier, I was away enjoying summer on a 
Croatian island...
... and sorry for answering in such great length now.

> Great.  One question: how is PIC going to be used?  Are we able to
> generate shared libraries of GHC-compiled code and link them using the
> system's usual dynamic linker?

Yes. In fact, I already have a "Hello, world" program linked against 
dynamic libraries libHSrts.dylib, libHSbase.dylib and 
libHShaskell98.dylib on Mac OS X, and another one
doing the same thing on PowerPC Linux.

> Does this mean that the RTS linker is
> obsolete for the platforms that support PIC?

It could be made obsolete, but the system dynamic linker won't load .o 
files; we would need to link them into .so/.dylib/.dll files 
individually first, and there's need for a little more magic (and a lot 
more for Windows' business with import libraries...). There's a summary 
of use cases further down.

> What about compiling the main program against PIC libraries?  There were
> some problems with this on ELF when I last looked into it:  the linker
> tries to create jump tables and messes around with static data
> references.

On sensible platforms, it should be possible for non-PIC and PIC code 
to call each other (non-PIC main program linked to PIC libraries). Of 
course, things aren't that easy on ELF. ELF tries so hard to hide 
dynamic linking from non-PIC that it keeps getting in the way, so I 
guess it won't be possible, or it will require an evil hack. I've 
managed to link PIC code in an executable to dynamic libraries on 
PowerPC Linux, so it's no show-stopper.

I've changed a lot of things in the meantime; I've managed to cram 
almost all of the platform-specific stuff into a module 
nativeGen/PositionIndependentCode.hs (or should it rather be 
DynamicLinking.hs? it actually handles both).
The ugliest part of it is now the way I collect the list of imported 
labels (needed for Darwin [ppc + i386, but i386 is unsupported anyway] 
and for PowerPC Linux; not needed for most other platforms).
I've had to put cmmToCmm into a monad of its own whose only purpose is 
to accumulate a list of CLabels, and the NatM still accumulates a few 
more (because MachCodeGen sometimes makes up a few additional CLabels, 
namely floating point machops and constant literals).
It's not beautiful, but it works. Collecting all imported CLabels in a 
post-processing pass on the generated Instrs probably isn't any nicer, 
and it smells of space-leaks, so I haven't tried that yet.
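
A minimal sketch of the "monad whose only purpose is to accumulate a list of CLabels" idea. Everything here is a hypothetical stand-in: the toy CLabel and Expr types, noteImport and rewrite are made up for illustration and are not GHC's real cmmToCmm or Cmm types. It uses the pair monad from base as the accumulator.

```haskell
-- Hypothetical stand-ins; GHC's real CLabel and Cmm types are far richer.
newtype CLabel = CLabel String deriving (Eq, Show)

data Expr
  = Lit Int
  | LocalRef CLabel      -- label defined in the current module
  | ImportedRef CLabel   -- label defined in another module or library
  | Add Expr Expr
  deriving (Eq, Show)

-- ((,) [CLabel]) is a Monad in base because lists form a Monoid, so a
-- plain pair is enough to accumulate labels alongside a result.
type Collect a = ([CLabel], a)

-- Record one imported label.
noteImport :: CLabel -> Collect ()
noteImport l = ([l], ())

-- A toy rewrite pass: it folds constant additions and, as a side effect,
-- records every imported label it walks past.
rewrite :: Expr -> Collect Expr
rewrite e@(ImportedRef l) = noteImport l >> pure e
rewrite (Add a b) = do
  a' <- rewrite a
  b' <- rewrite b
  pure $ case (a', b') of
    (Lit x, Lit y) -> Lit (x + y)
    _              -> Add a' b'
rewrite e = pure e
```

Running rewrite on an expression returns the collected labels as the first component of the pair and the rewritten expression as the second, so the label list comes out of the same traversal that does the rewriting.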

So what's left to do is the following:
1) integration with the RTS's new .cmm files
2) Intel support for PIC in the NCG (only very little left to do, just 
two insns to load %eip into a normal reg; I think I can do it)
3) Intel support for PIC in the mangler - I HATE THE MANGLER!
4) revive the imported-closures-in-STGs-hack for Win32
5) Experiment with some hacks to make linking PIC with non-PIC work on 
ELF platforms
6) make the driver & the build system aware of how to build and use 
dynamic libraries (I've done everything manually so far)
7) Devise a way of telling GHC that a foreign import is imported from a 
DLL (for Windows); something like __declspec(dllimport) in Windows C 
compilers. We should have thought of this before the FFI spec was 
finalized...

Obviously, I'll tackle (1) first; I'll report back when I've done 
that. I won't even look at (3) unless somebody forces me to at 
gunpoint. I'll probably try to delegate as much of (6) as possible to 
other people - I don't like Makefiles.

Number (6) still needs some design work, too:
GHC has two flags that are of importance here, -fPIC and -dynamic; they 
are basically orthogonal, so we've got up to four different kinds of 
code that could be generated. Luckily, there's no platform where all 
four kinds are necessary, so we can get away with less.

The flag -fPIC means that code can be in a dynamic library. -fPIC isn't 
needed (and has no effect) on Windows. PIC code can always be used in 
place of non-PIC code, but it is, in general, less efficient.
The flag -dynamic means that Haskell packages are in dynamic libraries. 
On Windows, this means that every package *must* be in its own DLL; on 
other platforms, this means that every package *can* be in its own 
shared library.
On ELF platforms, -dynamic doesn't work without -fPIC (unless (5) above 
yields some results), so I'll probably make -dynamic imply -fPIC on 
those platforms.
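
The interaction of the two flags can be sketched as a small normalisation function. This is only an illustration of the rules stated above: the Platform and Flags types and the normalise function are hypothetical names, not part of GHC's driver, and the ELF rule encodes the tentative "-dynamic implies -fPIC" plan.

```haskell
-- Hypothetical model of the platforms discussed in the text.
data Platform = ELF | Darwin | Windows deriving (Eq, Show)

-- The two basically orthogonal flags.
data Flags = Flags { fPIC :: Bool, dynamic :: Bool } deriving (Eq, Show)

-- Normalise the requested flags according to the rules described above.
normalise :: Platform -> Flags -> Flags
-- -fPIC isn't needed and has no effect on Windows.
normalise Windows f = f { fPIC = False }
-- On ELF, -dynamic doesn't work without -fPIC, so let it imply -fPIC.
normalise ELF     f = f { fPIC = fPIC f || dynamic f }
-- Elsewhere the flags are taken as given.
normalise Darwin  f = f
```

For example, normalise ELF (Flags False True) yields Flags True True under this model, while the same request on Darwin is left unchanged.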

I can currently see the following use cases:
a) static executable
    What we have now: Haskell program + libraries in one executable.
b) executable + dynamic libs
    Executable compiled with -dynamic and linked to *.dll/*.so/*.dylib
    libraries
c) statically linked loadable plugin
    Haskell code + statically linked libraries, compiled as PIC code,
    linked into a .dll/.so/.dylib that will be loaded by a
    (foreign-language) application
d) dynamically linked loadable plugin
    Haskell code in a loadable dll/so/dylib, dynamically linked to
    dll/so/dylib libraries.
e) Haskell module loadable by GHCi without our home-made linker
    A single Haskell module compiled with, say, --for-ghci

ad a)
We need the libraries compiled without any flags and stored as *.a 
files.
The main program doesn't need any flags either.

ad b)
We need the libraries compiled with -dynamic -fPIC and stored as 
*.dll/*.so/*.dylib files.
We can easily use those instead of the .o files for GHCi to save some 
space in the binary distributions.
The main program has to be compiled with -dynamic. GHC should 
automatically link with the dll/so/dylib versions of the libraries.

ad c)
We need the libraries compiled with just -fPIC and stored as *.a files.
The plugin itself has to be compiled with -fPIC. The whole thing must 
be linked using a platform-dependent command that GHC should know about 
(ld -shared for Linux; ld -bundle for Mac OS; some dlltool stuff for 
Windows).

ad d)
We need the same libraries as for (b), and the plugin itself needs 
-dynamic -fPIC.

ad e)
It's almost the same as (d), only labelDynamic has to return True in a 
few more cases. For Windows, the driver would need to do some 
additional magic to deal with import libraries, which will get *really* 
evil whenever there's a cycle in the module graph.
On non-Windows platforms, the same .o file could be reused for any of 
the other use cases, but on Windows it would be just for GHCi.
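
The requirements for use cases (a) to (d) can be summarised as a lookup table. This is a sketch only: UseCase, Packaging, Recipe and libraryVariants are hypothetical names invented for illustration, and case (e) is left out because its flag is only floated tentatively.

```haskell
import Data.List (nub)

-- Use cases (a)-(d) from the list above.
data UseCase = StaticExe | DynamicExe | StaticPlugin | DynamicPlugin
  deriving (Eq, Show, Enum, Bounded)

-- How the libraries are stored.
data Packaging = StaticArchive | SharedLibrary
  deriving (Eq, Show)

-- Hypothetical record: how the libraries and the program/plugin are built.
data Recipe = Recipe
  { libFlags     :: [String]   -- flags used to compile the libraries
  , libPackaging :: Packaging  -- *.a versus *.dll/*.so/*.dylib
  , ownFlags     :: [String]   -- flags for the main program or plugin
  } deriving (Eq, Show)

recipe :: UseCase -> Recipe
recipe StaticExe     = Recipe []                    StaticArchive []
recipe DynamicExe    = Recipe ["-dynamic", "-fPIC"] SharedLibrary ["-dynamic"]
recipe StaticPlugin  = Recipe ["-fPIC"]             StaticArchive ["-fPIC"]
recipe DynamicPlugin = Recipe ["-dynamic", "-fPIC"] SharedLibrary ["-dynamic", "-fPIC"]

-- Distinct ways the libraries must be built across all use cases.
libraryVariants :: [([String], Packaging)]
libraryVariants =
  nub [ (libFlags r, libPackaging r) | u <- [minBound .. maxBound], let r = recipe u ]
```

In this model libraryVariants has three entries: the plain static build that exists today, plus the -fPIC static archives and the -dynamic -fPIC shared libraries.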

So that's two new ways for the libraries, but we can drop the separate 
GHCi libraries.
On Windows, the libraries for (a) and (c) are identical, so we don't 
need separate versions here. For other platforms, the libraries for (c) 
could be used as a substitute for the libraries for (a) in order to save 
space; is that worth the complexity? ("link with foo.a if that's 
available, otherwise try foo_pic.a")
For non-Windows platforms, the distinction between -fPIC and -dynamic 
-fPIC is not very important; the latter could be used instead of the 
former without any measurable cost. However, it would still need to be 
packaged as both .a and as a shared lib, so that would only save time 
for compiling the libraries; it wouldn't save space for the binary 
distribution.
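
The "link with foo.a if that's available, otherwise try foo_pic.a" fallback could look roughly like the following. It is a sketch under assumptions: candidates and chooseArchive are hypothetical helpers, the list of available files is passed in rather than read from disk, and the _pic suffix is just the naming floated above, not an established convention.

```haskell
import Data.List (find)

-- Candidate archive names for a library, in order of preference.
candidates :: String -> [FilePath]
candidates lib = [lib ++ ".a", lib ++ "_pic.a"]

-- Pick the first candidate that is actually available, if any.
chooseArchive :: [FilePath] -> String -> Maybe FilePath
chooseArchive available lib = find (`elem` available) (candidates lib)
```

With only foo_pic.a present, chooseArchive ["foo_pic.a"] "foo" gives Just "foo_pic.a"; when both are present the plain foo.a wins.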

Well, that's it for now, and sorry for going on endlessly....

Cheers,

Wolfgang


