GCC, Mac OS X & the future
Manuel M T Chakravarty
chak at cse.unsw.edu.au
Mon Jun 20 04:43:28 CEST 2011
As llvm-gcc on OS X seems to require some work, I wonder whether we should by default build with the 'gcc-4.2' executable on OS X (which uses the traditional gcc backend), instead of the generic 'gcc' (probably still using 'gcc' as a fallback in configure if 'gcc-4.2' is not available). Then, when Apple makes the switch, binary GHC packages will continue to work.
PS: I am all for resolving the problems with llvm-gcc, but that will likely take a while. It'd be good to get a fix into 7.2, though.
> On 01/06/2011 13:30, Manuel M T Chakravarty wrote:
>> Simon Marlow:
>>> On 01/06/2011 07:11, Manuel M T Chakravarty wrote:
>>>> Simon Marlow:
>>>>> On 30/05/2011 14:59, Manuel M T Chakravarty wrote:
>>>>>> It is no secret that Apple is moving away from the traditional GCC
>>>>>> backend to LLVM. In fact, Xcode (which bundles all command line
>>>>>> developer tools on the Mac) today comes with two flavours of gcc:
>>>>>> 'gcc' and 'llvm-gcc', which AFAIK only differ in the backend that is
>>>>>> being used. Currently, the default is the traditional GCC backend,
>>>>>> but it takes no precognition to realise that this will eventually
>>>>>> change. The 'gcc' executable will use the LLVM backend and, at least
>>>>>> for a while, the traditional backend will still be available under a
>>>>>> different name.
>>>>>> Unfortunately, GHC will break at this point as the LLVM backend does
>>>>>> not support pinned global registers. ('llvm-gcc' happily accepts the
>>>>>> register assignment, but fails with a runtime error during code
>>>>> This shouldn't be a problem. We don't use pinned global registers any more, except in one place - the GC (see rts/sm/GCTDecl.h). There it's optional, but you lose a bit of performance by not using a pinned register. It's not a huge deal.
>>>>> Have you tried building GHC with llvm-gcc? I think I tried it on the RTS a year or so ago to check the LLVM output against gcc (LLVM wasn't quite as good at the time).
>>>> Yes, I tried and it failed, while compiling the RTS, with
>>>> sorry, unimplemented: LLVM cannot handle register variable ‘R1’, report a bug
>>>> This was using the 64bit version of GHC. I'll have a closer look.
>>> Perhaps that was when compiling StgCRun.c? It doesn't actually need register variables (on x86_64 at least), but it does include the header files, so that probably needs some #ifdefery somewhere for llvm-gcc.
>> Yes, it's in 'StgCRun.c'. Ok, and how about on i386 (or do you want
>> to phase that arch out)?
> It doesn't look like the x86 code in StgCRun.c uses registers either. The sparc version does, but it could be rewritten.
>>> The other place, as I mentioned above, is rts/sm/GCTDecl.h, which will need to use a different method for declaring the garbage collector's thread-local state variable, gct. On x86_64 I found that using a fixed register was the fastest, but using a thread-local variable (the __thread modifier) also works.
>> Just to make sure I understand correctly, are you saying that using a
>> thread-local variable is already implemented as an option,
> Yes - look at the series of #ifdefs in that file, it's pretty straightforward to change how gct is declared for a particular platform.
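For illustration, the kind of #ifdef series described here could look roughly like the sketch below. This is a minimal sketch and not the contents of rts/sm/GCTDecl.h: the gc_thread struct is a one-field stand-in, the macro names GCT_USE_REGISTER and GCT_USE_THREAD_LOCAL, the SET_GCT and init_gct helpers, and the choice of %r13 are all assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the GC's per-thread state. */
typedef struct gc_thread_ {
    int thread_index;
} gc_thread;

#if defined(GCT_USE_REGISTER) && defined(__x86_64__) && !defined(__llvm__)
/* Option 1: pin gct to a callee-saved register (GNU C extension).
 * The register name is an assumption; LLVM-based compilers are excluded. */
register gc_thread *gct __asm__("%r13");
#define SET_GCT(to) (gct = (to))
static void init_gct(void) {}

#elif defined(GCT_USE_THREAD_LOCAL)
/* Option 2: compiler-supported thread-local storage via __thread. */
static __thread gc_thread *gct;
#define SET_GCT(to) (gct = (to))
static void init_gct(void) {}

#else
/* Option 3: portable fallback; every use of gct is a library call. */
static pthread_key_t gct_key;
#define gct ((gc_thread *)pthread_getspecific(gct_key))
#define SET_GCT(to) pthread_setspecific(gct_key, (to))
static void init_gct(void)
{
    int rc = pthread_key_create(&gct_key, NULL);
    assert(rc == 0);
    (void)rc; /* keeps NDEBUG builds free of unused-variable warnings */
}
#endif
```

With neither macro defined, the sketch falls through to the pthread_getspecific() variant, which is the only one of the three that would be expected to work with llvm-gcc on OS X according to the discussion in this thread.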
> However, I've just done some poking around and it seems that __thread is not supported on OS X:
> see also this thread about Clang:
> It seems there might be support for __thread in the future, but not in the short term.
> It seems our very own David Peixotto tried building GHC with Clang a year ago and ran into the same thing:
> So this is less than ideal. The short term fix would be to #define gct to be a call to pthread_getspecific(). The call will be inlined (the OS X headers define pthread_getspecific in terms of some inline assembly), but the optimiser won't know anything about the inline assembly, so it won't be able to common up multiple loads of gct, and that probably means it won't perform well. If that's the case, then the solution is to load up gct into a temporary in the performance-critical functions in the GC (evacuate(), scavenge_block()), and add it as an argument to inline functions. I'd rather avoid having to do all that if possible.
> If you want to benchmark the GC, there are some good programs in nofib/gc.
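To make the quoted workaround concrete, here is a minimal sketch of the difference between reading gct through pthread_getspecific() on every use and loading it once into a temporary that is then passed to inline helpers. It is not code from the RTS: the one-field gc_thread struct and the functions setup_gct, record_copy, count_words_naive and count_words_hoisted are hypothetical names invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the GC's per-thread state. */
typedef struct gc_thread_ {
    unsigned long copied; /* words copied so far */
} gc_thread;

static pthread_key_t gct_key;

/* Short-term fix from the thread: gct expands to a lookup call. */
#define gct ((gc_thread *)pthread_getspecific(gct_key))

/* Create the key and bind the calling thread's state to it. */
static void setup_gct(gc_thread *t)
{
    int rc = pthread_key_create(&gct_key, NULL);
    assert(rc == 0);
    rc = pthread_setspecific(gct_key, t);
    assert(rc == 0);
    (void)rc; /* keeps NDEBUG builds free of unused-variable warnings */
}

/* Naive version: gct is looked up again on every loop iteration. */
static void count_words_naive(const unsigned long *sizes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        gct->copied += sizes[i];
    }
}

/* Inline helper that takes the state as an explicit argument. */
static inline void record_copy(gc_thread *my_gct, unsigned long words)
{
    my_gct->copied += words;
}

/* Hoisted version: one lookup, then the temporary is passed along. */
static void count_words_hoisted(const unsigned long *sizes, size_t n)
{
    gc_thread *my_gct = gct; /* single load into a temporary */
    for (size_t i = 0; i < n; i++) {
        record_copy(my_gct, sizes[i]);
    }
}
```

Both functions compute the same result. The hoisted form only avoids the repeated lookups, at the cost of threading an extra argument through every helper, which is the intrusive change the quoted message would rather avoid.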