patch applied (ghc): x86_64: support PIC and therefore, Mac OS X in the NCG

Wolfgang Thaller wolfgang.thaller at gmx.net
Fri Dec 8 06:57:57 EST 2006


> Yay Wolfgang :)  I'd looked into this a while back (for
> x86_64-linux) and I have some patches lying around, at some point I'll
> compare mine with yours.  Have you done everything to make
> dyn-linked binaries work on Linux?  I recall some problems with the
> R_COPY relocations.

I think I've done almost everything. The known exception is that
"relocatable read only data" sections will probably need to be
generated as .data (or something like it, NOT read-only, see below).
Beyond that, there are only the things that I have overlooked.

The NCG includes a special case for -dynamic without -fPIC on Linux;  
it is an ugly hack (it "simulates" its own GOT), but it seems to work  
on i386-Linux. It is probably broken on x86_64 (I just saw a .long in  
there), and I consider it unnecessary, as PIC is so cheap. You should  
probably just make -dynamic imply -fPIC on x86_64-linux. I think  
that's also the way to make -dynamic work with -fvia-C.

I've done a bit more experimenting on i386-linux, too, and I have a  
few working dynamic binaries now :-).

Here's a rough brain-dump of my current knowledge of ELF-specific  
problems:

There are two relocation types that "happen" in executables that we  
never want:

a) R_COPY
b) R_JUMP_SLOT


*** ad a)

R_COPY is obviously deadly when applied to an info_label, or to  
something that doesn't have the proper size set.
R_COPY seems to "happen" whenever the linker would otherwise need to  
relocate something in a read-only-section of the main executable.
So...

movl	variable@GOT(%ebx), %eax	# OK (on i386)

# x86_64 from here on
movq	variable@GOTPCREL(%rip), %rax	# OK

.data
.quad	variable			# OK

movq	variable, %rax			# NOT OK, causes R_COPY

.section .rodata
.quad	variable			# NOT OK, last time I checked

If the symbol has size zero, we don't get R_COPY but a link error. I  
could have sworn that ld used to generate R_COPY relocations for size  
0 symbols, but apparently, things are improving.
Also, for every symbol with size zero, we get a warning, even when we  
only access it the "right" way.
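
To make the R_COPY hazard concrete, here is a toy Python model (not GHC or ld code). It assumes the tables-next-to-code layout, where an info table sits directly in front of the entry code that the info label names. The byte values, the sizes, and the helper names copy_reloc and read_info_table are invented for illustration. A copy relocation duplicates only the symbol's declared size, starting at the label, so whatever precedes the label is left behind:

```python
# Toy model of an R_COPY relocation applied to an info label.
# All sizes, byte values and function names are made up for illustration.

TABLE_SIZE = 8  # hypothetical info table size in bytes


def read_info_table(memory, label):
    """Read the info table the way the RTS would: just before the label."""
    return bytes(memory[label - TABLE_SIZE:label])


def copy_reloc(src, label, size, dst, dst_addr):
    """Copy `size` bytes starting at `label` into `dst`; return the new address."""
    dst[dst_addr:dst_addr + size] = src[label:label + size]
    return dst_addr


# Shared library image: [padding][info table][entry code].
table = b"INFOTBL!"            # 8 bytes, stand-in for a real info table
code = b"\x90\x90\x90\xc3"     # stand-in for the entry code
library = b"\x00" * 16 + table + code
label = 16 + TABLE_SIZE        # the info label points at the code

# The executable's zero-initialised .bss receives the copy.
bss = bytearray(64)
new_label = copy_reloc(library, label, len(code), bss, 32)

original_table = read_info_table(library, label)
copied_table = read_info_table(bss, new_label)

print(original_table)  # the real table, read from the library image
print(copied_table)    # whatever happens to sit before the copy
```

In this model the code bytes arrive intact at the new address, but the bytes in front of them are just zeroed .bss, so anything that looks for the table at a negative offset from the relocated label reads garbage.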

I think the last situation, a pointer in a read-only-section, is the  
only situation we need to worry about if the main program is compiled  
as PIC. Inside a shared library, Linux will just do the right thing.  
I don't think I've done anything to ensure this in the NCG, and of  
course I have no idea about the Mangler on Linux.
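
One way to picture the fix for that last situation (emitting "relocatable read only data" as writable data) is a section-selection rule like the one below. This is a hypothetical Python sketch, not the NCG's actual code; the function name, its two parameters, and the choice of plain .data as the writable section are all assumptions.

```python
def section_for(read_only, needs_dynamic_reloc):
    """Pick an assembler section for a static data block (hypothetical rule).

    read_only           -- the program never writes to the block
    needs_dynamic_reloc -- the block holds pointers that the dynamic linker
                           may have to fill in at load time
    """
    if not read_only:
        return ".data"
    if needs_dynamic_reloc:
        # The dynamic linker has to write the relocated pointers, so the
        # block cannot live in a read-only section.
        return ".data"
    return ".section .rodata"


print(section_for(read_only=True, needs_dynamic_reloc=True))
print(section_for(read_only=True, needs_dynamic_reloc=False))
```

Only blocks that are both read-only and free of dynamically relocated pointers stay in .rodata under this rule; everything else is treated as ordinary writable data.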


*** ad b)

R_JUMP_SLOT is also deadly for info labels. Once the linker has  
decided to generate a single R_JUMP_SLOT relocation for a given  
symbol, it always thinks of that symbol as code, and makes *all*  
references to the symbol point to the PLT entry for the symbol. (The  
pointer to the PLT entry gets stored in a closure's code pointer, and  
the RTS tries to interpret the previous PLT entry as an info table...)
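
The failure mode in the parenthesis can be sketched with another toy Python model. The PLT entry size, the fill bytes, and all names here are invented for illustration; the only thing taken from the text is that the RTS looks for the info table directly in front of the code pointer.

```python
# Toy model of an info label that the linker has redirected to a PLT entry.
# Entry size, byte values and names are invented for illustration.

PLT_ENTRY_SIZE = 16
TABLE_SIZE = 8  # hypothetical info table size in bytes


def read_info_table(memory, code_ptr):
    """Read the bytes directly in front of the code pointer, as the RTS would."""
    return memory[code_ptr - TABLE_SIZE:code_ptr]


# Three PLT stubs, each filled with a distinguishing byte (0xA0, 0xA1, 0xA2).
plt = b"".join(bytes([0xA0 + i]) * PLT_ENTRY_SIZE for i in range(3))

# The real definition: info table followed by entry code.
real = b"INFOTBL!" + b"\xc3"
memory = plt + real

real_label = len(plt) + TABLE_SIZE  # address of the actual entry code
plt_slot = 2 * PLT_ENTRY_SIZE       # address of the third PLT stub

good = read_info_table(memory, real_label)
bad = read_info_table(memory, plt_slot)

print(good)  # the real info table
print(bad)   # the tail of the preceding PLT stub
```

With the pointer aimed at the third stub, the "info table" read comes out of the second stub's bytes, which is the misinterpretation described above.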

Also, during a tail call, the stack is misaligned, which might or  
might not be a problem for the dynamic linker's symbol lookup code.  
It is a problem on Mac OS X.

Therefore, we must never jmp directly to an info label. We must always  
use a computed jump. The NCG should already make sure of this; I have  
no idea about the x86_64 mangler.


That's all for now,

Cheers,

Wolfgang


