<div dir="ltr">great work! :) </div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Dec 27, 2013 at 3:21 PM, Ben Gamari <span dir="ltr"><<a href="mailto:bgamari.foss@gmail.com" target="_blank">bgamari.foss@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">Simon Marlow <<a href="mailto:marlowsd@gmail.com">marlowsd@gmail.com</a>> writes:<br>
<br>
> This sounds right to me. Did you submit a patch?<br>
><br>
> Note that dynamic linking with LLVM is likely to produce significantly<br>
> worse code than with the NCG right now, because the LLVM back end uses<br>
> dynamic references even for symbols in the same package, whereas the NCG<br>
> back-end uses direct static references for these.<br>
><br>
</div>Today with the help of Edward Yang I examined the code produced by the<br>
LLVM backend in light of this statement. I was surprised to find that<br>
LLVM's code appears to be no worse than the NCG with respect to<br>
intra-package references.<br>
<br>
My test case can be found here[2] and can be built with the included<br>
`build.sh` script. The test consists of two modules built into a shared<br>
library. One module, `LibTest`, exports a few simple members while the<br>
other module (`LibTest2`) defines members that consume them. Care is<br>
taken to ensure the members are not inlined.<br>
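<br>
As a rough illustration of that last point, here is a minimal sketch of how<br>
inlining can be suppressed. It assumes `NOINLINE` pragmas, which is one common<br>
way to do this and not necessarily what the linked test case uses. The<br>
definitions are folded into a single `Main` module only so that the sketch runs<br>
on its own; the real test splits them across `LibTest` and `LibTest2`.<br>
<br>
```haskell
module Main (main) where

-- Definitions that live in `LibTest` in the real test case.
-- NOINLINE is an assumption: it keeps GHC from copying the bodies
-- into their use sites, so a symbol reference survives.
helloWorld :: String
helloWorld = "Hello World!"
{-# NOINLINE helloWorld #-}

infoRef :: Int -> Int
infoRef n = n + 1
{-# NOINLINE infoRef #-}

-- Consumers that live in `LibTest2` in the real test case.
testHelloWorld :: IO String
testHelloWorld = return helloWorld

testInfoRef :: IO Int
testInfoRef = return (infoRef 2)

main :: IO ()
main = do
  testHelloWorld >>= putStrLn
  testInfoRef >>= print
```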
<br>
The tests were done on x86_64 running LLVM 3.4 and GHC HEAD with the<br>
patches[1] I referred to in my last message. Please let me know if I've<br>
missed something.<br>
<br>
<br>
<br>
# Evaluation<br>
<br>
## First example ##<br>
<br>
The first member is a simple `String` (defined in `LibTest`),<br>
<br>
helloWorld :: String<br>
helloWorld = "Hello World!"<br>
<br>
The use-site is quite straightforward,<br>
<br>
testHelloWorld :: IO String<br>
testHelloWorld = return helloWorld<br>
<br>
With `-O1` the code looks reasonable in both cases. Most importantly,<br>
both backends use RIP-relative addressing to find the symbol.<br>
<br>
### LLVM ###<br>
<br>
0000000000000ef8 <rKw_info>:<br>
ef8: 48 8b 45 00 mov 0x0(%rbp),%rax<br>
efc: 48 8d 1d cd 11 20 00 lea 0x2011cd(%rip),%rbx # 2020d0 <libtestzm0zi1zi0zi0_LibTest_helloWorld_closure><br>
f03: ff e0 jmpq *%rax<br>
<br>
0000000000000f28 <libtestzm0zi1zi0zi0_LibTest2_testHelloWorld_info>:<br>
f28: eb ce jmp ef8 <rKw_info><br>
f2a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)<br>
<br>
### NCG ###<br>
<br>
0000000000000d58 <rH1_info>:<br>
d58: 48 8d 1d 71 13 20 00 lea 0x201371(%rip),%rbx # 2020d0 <libtestzm0zi1zi0zi0_LibTest_helloWorld_closure><br>
d5f: ff 65 00 jmpq *0x0(%rbp)<br>
<br>
0000000000000d88 <libtestzm0zi1zi0zi0_LibTest2_testHelloWorld_info>:<br>
d88: eb ce jmp d58 <rH1_info><br>
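<br>
The `# 2020d0` annotations in both listings can be checked by hand: a<br>
RIP-relative operand is resolved against the address of the following<br>
instruction. Below is a small sketch of that arithmetic. `ripTarget` is a<br>
helper name made up for illustration; the addresses, instruction lengths and<br>
displacements are read off the two `lea` lines above.<br>
<br>
```haskell
module Main (main) where

import Numeric (showHex)

-- | Target of a RIP-relative operand: the displacement is added to the
-- address of the *next* instruction (instruction address + its length).
ripTarget :: Integer -> Integer -> Integer -> Integer
ripTarget insnAddr insnLen disp = insnAddr + insnLen + disp

main :: IO ()
main = do
  -- LLVM: lea at 0xefc, 7 bytes long, displacement 0x2011cd
  putStrLn (showHex (ripTarget 0xefc 7 0x2011cd) "")
  -- NCG: lea at 0xd58, 7 bytes long, displacement 0x201371
  putStrLn (showHex (ripTarget 0xd58 7 0x201371) "")
```
<br>
Both lines print `2020d0`, the address of the `helloWorld` closure.<br>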
<br>
<br>
With `-O0` the code is substantially longer but the relocation behavior<br>
is still correct, as one would expect.<br>
<br>
Looking at the definition of `helloWorld`[3] itself it becomes clear that<br>
the LLVM backend is more likely to use PLT relocations over GOT. In<br>
general, `stg_*` primitives are called through the PLT. As far as I can<br>
tell, both of these call mechanisms incur two memory<br>
accesses. However, a call through the PLT consists of two<br>
JMPs whereas a call through the GOT consists of only one. Is this a cause for<br>
concern? Could these two jumps interfere with branch prediction?<br>
<br>
In general the LLVM backend produces a few more instructions than the<br>
NCG, although this doesn't appear to be related to the handling of<br>
relocations. One example is the inexplicable (to me) `mov` at the<br>
beginning of LLVM's `rKw_info`.<br>
<br>
<br>
## Second example ##<br>
<br>
The second example demonstrates an actual call,<br>
<br>
-- Definition (in LibTest)<br>
infoRef :: Int -> Int<br>
infoRef n = n + 1<br>
<br>
-- Call site<br>
testInfoRef :: IO Int<br>
testInfoRef = return (infoRef 2)<br>
<br>
With `-O1` this produces the following code,<br>
<br>
### LLVM ###<br>
<br>
0000000000000fb0 <rLy_info>:<br>
fb0: 48 8b 45 00 mov 0x0(%rbp),%rax<br>
fb4: 48 8d 1d a5 10 20 00 lea 0x2010a5(%rip),%rbx # 202060 <rLx_closure><br>
fbb: ff e0 jmpq *%rax<br>
<br>
0000000000000fe0 <libtestzm0zi1zi0zi0_LibTest2_testInfoRef_info>:<br>
fe0: eb ce jmp fb0 <rLy_info><br>
<br>
### NCG ###<br>
<br>
0000000000000e10 <rI3_info>:<br>
e10: 48 8d 1d 51 12 20 00 lea 0x201251(%rip),%rbx # 202068 <rI2_closure><br>
e17: ff 65 00 jmpq *0x0(%rbp)<br>
<br>
0000000000000e40 <libtestzm0zi1zi0zi0_LibTest2_testInfoRef_info>:<br>
e40: eb ce jmp e10 <rI3_info><br>
<br>
Again, LLVM is a bit more verbose but seems to handle<br>
intra-package calls efficiently.<br>
<br>
<br>
<br>
[1] <a href="https://github.com/bgamari/ghc/commits/llvm-dynamic" target="_blank">https://github.com/bgamari/ghc/commits/llvm-dynamic</a><br>
[2] <a href="https://github.com/bgamari/ghc-linking-tests/tree/master/ghc-test" target="_blank">https://github.com/bgamari/ghc-linking-tests/tree/master/ghc-test</a><br>
[3] `helloWorld` definitions:<br>
<br>
LLVM:<br>
00000000000010a8 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info>:<br>
10a8: 50 push %rax<br>
10a9: 4c 8d 75 f0 lea -0x10(%rbp),%r14<br>
10ad: 4d 39 fe cmp %r15,%r14<br>
10b0: 73 07 jae 10b9 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x11><br>
10b2: 49 8b 45 f0 mov -0x10(%r13),%rax<br>
10b6: 5a pop %rdx<br>
10b7: ff e0 jmpq *%rax<br>
10b9: 4c 89 ef mov %r13,%rdi<br>
10bc: 48 89 de mov %rbx,%rsi<br>
10bf: e8 0c fd ff ff callq dd0 <newCAF@plt><br>
10c4: 48 85 c0 test %rax,%rax<br>
10c7: 74 22 je 10eb <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x43><br>
10c9: 48 8b 0d 18 0f 20 00 mov 0x200f18(%rip),%rcx # 201fe8 <_DYNAMIC+0x228><br>
10d0: 48 89 4d f0 mov %rcx,-0x10(%rbp)<br>
10d4: 48 89 45 f8 mov %rax,-0x8(%rbp)<br>
10d8: 48 8d 05 21 00 00 00 lea 0x21(%rip),%rax # 1100 <cJC_str><br>
10df: 4c 89 f5 mov %r14,%rbp<br>
10e2: 49 89 c6 mov %rax,%r14<br>
10e5: 58 pop %rax<br>
10e6: e9 b5 fc ff ff jmpq da0 <ghczmprim_GHCziCString_unpackCStringzh_info@plt><br>
10eb: 48 8b 03 mov (%rbx),%rax<br>
10ee: 5a pop %rdx<br>
10ef: ff e0 jmpq *%rax<br>
<br>
<br>
NCG:<br>
<br>
0000000000000ef8 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info>:<br>
ef8: 48 8d 45 f0 lea -0x10(%rbp),%rax<br>
efc: 4c 39 f8 cmp %r15,%rax<br>
eff: 72 3f jb f40 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x48><br>
f01: 4c 89 ef mov %r13,%rdi<br>
f04: 48 89 de mov %rbx,%rsi<br>
f07: 48 83 ec 08 sub $0x8,%rsp<br>
f0b: b8 00 00 00 00 mov $0x0,%eax<br>
f10: e8 1b fd ff ff callq c30 <newCAF@plt><br>
f15: 48 83 c4 08 add $0x8,%rsp<br>
f19: 48 85 c0 test %rax,%rax<br>
f1c: 74 20 je f3e <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x46><br>
f1e: 48 8b 1d cb 10 20 00 mov 0x2010cb(%rip),%rbx # 201ff0 <_DYNAMIC+0x238><br>
f25: 48 89 5d f0 mov %rbx,-0x10(%rbp)<br>
f29: 48 89 45 f8 mov %rax,-0x8(%rbp)<br>
f2d: 4c 8d 35 1c 00 00 00 lea 0x1c(%rip),%r14 # f50 <cGG_str><br>
f34: 48 83 c5 f0 add $0xfffffffffffffff0,%rbp<br>
f38: ff 25 7a 10 20 00 jmpq *0x20107a(%rip) # 201fb8 <_DYNAMIC+0x200><br>
f3e: ff 23 jmpq *(%rbx)<br>
f40: 41 ff 65 f0 jmpq *-0x10(%r13)<br>
<br></blockquote></div><br></div>