[commit: ghc] simd: Handle 4-byte aligned stack on Win32 when generating SSE instructions. (0f251a0)
Geoffrey Mainland
gmainlan at microsoft.com
Mon Nov 28 00:10:26 CET 2011
LLVM assumes the stack is 16-byte aligned---either because the platform
aligns the stack or because the function prologue aligns it. Therefore SSE
register spills always generate movaps instructions, necessitating my
hack. We can certainly discuss this with the LLVM folks, though I don't
see any other way around the issue in the near term. Maybe for 3.1...
Geoff
On 11/27/11 11:03 PM, "David Terei" <davidterei at gmail.com> wrote:
>Is this really the only solution? I don't understand the issue
>completely but it sounds like something you could talk to the LLVM
>folks about. I've recently fixed half the reason the LLVM Mangler is
>needed and hope to fix the other one someday and kill the Mangler so
>I'm against adding any more to it.
>
>On 27 November 2011 14:15, Geoffrey Mainland <gmainlan at microsoft.com>
>wrote:
>> Repository : ssh://darcs.haskell.org//srv/darcs/ghc
>>
>> On branch : simd
>>
>>
>>http://hackage.haskell.org/trac/ghc/changeset/0f251a04c21ca4a7de5c001d632
>>e15a6fdaf948a
>>
>>>---------------------------------------------------------------
>>
>> commit 0f251a04c21ca4a7de5c001d632e15a6fdaf948a
>> Author: Geoffrey Mainland <gmainlan at microsoft.com>
>> Date: Sat Nov 26 12:45:23 2011 +0000
>>
>> Handle 4-byte aligned stack on Win32 when generating SSE
>>instructions.
>>
>> Win32 only guarantees that the stack is 4-byte aligned, so rewrite
>>all movaps
>> instructions to movups in the mangler on that platform. We already
>>generate
>> movups for explicit loads and stores in the LLVM back-end because
>>they are
>> marked as potentially unaligned. However, LLVM generates movaps for
>>registers
>> spills, and I don't see any way to fix that except by rewriting the
>>assembly
>> output in the mangler.
>>
>>>---------------------------------------------------------------
>>
>> compiler/llvmGen/LlvmMangler.hs | 23 +++++++++++++++++++++++
>> 1 files changed, 23 insertions(+), 0 deletions(-)
>>
>> diff --git a/compiler/llvmGen/LlvmMangler.hs
>>b/compiler/llvmGen/LlvmMangler.hs
>> index 981bbf2..e042567 100644
>> --- a/compiler/llvmGen/LlvmMangler.hs
>> +++ b/compiler/llvmGen/LlvmMangler.hs
>> @@ -60,7 +60,11 @@ llvmFixupAsm f1 f2 = do
>> w <- openBinaryFile f2 WriteMode
>> ss <- readSections r w
>> hClose r
>> +#if mingw32_TARGET_OS
>> + let fixed = (map fixMovaps . fixTables) ss
>> +#else
>> let fixed = fixTables ss
>> +#endif
>> mapM_ (writeSection w) fixed
>> hClose w
>> return ()
>> @@ -107,6 +111,25 @@ writeSection w (hdr, cts) = do
>> B.hPutStrLn w hdr
>> B.hPutStrLn w cts
>>
>> +fixMovaps :: Section -> Section
>> +fixMovaps (hdr, cts) =
>> + (hdr, loop idxs cts)
>> + where
>> + loop :: [Int] -> B.ByteString -> B.ByteString
>> + loop [] cts = cts
>> +
>> + loop (i : is) cts =
>> + loop is (hd `B.append` movups `B.append` B.drop 6 tl)
>> + where
>> + (hd, tl) = B.splitAt i cts
>> +
>> + idxs :: [Int]
>> + idxs = B.findSubstrings movaps cts
>> +
>> + movaps, movups :: B.ByteString
>> + movaps = B.pack "movaps"
>> + movups = B.pack "movups"
>> +
>> -- | Reorder and convert sections so info tables end up next to the
>> -- code. Also does stack fixups.
>> fixTables :: [Section] -> [Section]
>>
>>
>>
>> _______________________________________________
>> Cvs-ghc mailing list
>> Cvs-ghc at haskell.org
>> http://www.haskell.org/mailman/listinfo/cvs-ghc
>>
>
More information about the Cvs-ghc
mailing list