GHC/SIMD

From HaskellWiki

(Initial entry)
Current revision (23:23, 3 November 2010)
(Remove page)
 
== Overview ==
 
This page is initially a place to discuss extending GHC to take advantage of CPU SIMD instructions, such as SSE and AltiVec.
 
SSE provides 'packed' data types of floats and integers that fit into 128-bit xmm registers.
 
The operations on these data types include the standard arithmetic operations (add, mul, ...). There are also additional mathematical operations (reciprocal, reciprocal square root) and packed-specific operations such as dot product and horizontal add/sub/add-sub.
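To make the semantics of these operations concrete, here is a minimal sketch that models a 4-lane packed float in plain Haskell. It is only a model: the names `Float4`, `addF4`, `mulF4`, `recipF4`, `haddF4` and `dotF4` are invented for illustration, nothing here is a GHC primop, and no SIMD instructions are generated. The horizontal add follows the lane order of the SSE3 `HADDPS` instruction.

```haskell
-- A pure-Haskell model of a 4-lane packed Float.
-- All names are hypothetical; no SIMD instructions are emitted.

-- | Four Float lanes, standing in for one 128-bit xmm register.
data Float4 = Float4 !Float !Float !Float !Float
  deriving (Eq, Show)

-- | Apply a binary scalar operation lane by lane.
lanewise :: (Float -> Float -> Float) -> Float4 -> Float4 -> Float4
lanewise f (Float4 a0 a1 a2 a3) (Float4 b0 b1 b2 b3) =
  Float4 (f a0 b0) (f a1 b1) (f a2 b2) (f a3 b3)

-- | The 'standard' packed operations: lane-wise add and multiply.
addF4, mulF4 :: Float4 -> Float4 -> Float4
addF4 = lanewise (+)
mulF4 = lanewise (*)

-- | Lane-wise reciprocal. This model is exact; the hardware
-- instruction (RCPPS) only returns an approximation.
recipF4 :: Float4 -> Float4
recipF4 (Float4 a0 a1 a2 a3) =
  Float4 (recip a0) (recip a1) (recip a2) (recip a3)

-- | Horizontal add in the style of HADDPS: adjacent pairs are summed,
-- first from the first operand, then from the second.
haddF4 :: Float4 -> Float4 -> Float4
haddF4 (Float4 a0 a1 a2 a3) (Float4 b0 b1 b2 b3) =
  Float4 (a0 + a1) (a2 + a3) (b0 + b1) (b2 + b3)

-- | Dot product over all four lanes, returned as a scalar.
dotF4 :: Float4 -> Float4 -> Float
dotF4 x y =
  let Float4 p0 p1 p2 p3 = mulF4 x y
  in (p0 + p1) + (p2 + p3)
```

Loaded into GHCi, `haddF4 (Float4 1 2 3 4) (Float4 10 20 30 40)` sums the pairs within each operand rather than across operands, which is what distinguishes the horizontal operations from the lane-wise ones.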
 
Also, to support data-streaming operations, there are memory operations that bypass the cache and move data directly between memory and the xmm registers.
 
xmm registers are 128 bits wide and hold both packed integer and packed float types. I suggest adding a new `PackedReg` data constructor for them.
 
In terms of an implementation plan:
 
* Add new packed data types, and the 'standard' operations on those types, to Cmm and primops.txt.pp
** Int32Packed4#, ...
** Width = ... | W32_4 | ...
* Implement the new types and operations in the backends (C/LLVM/ASM)
 
So far this is straightforward.
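The two steps of the plan can be sketched as a toy model. This is not GHC's actual code: the `Width` type below is a cut-down stand-in, `W64_2`, `IntAdd`, `FloatAdd`, `laneCount`, `laneBits` and `x86Mnemonic` are hypothetical names, and only `W32_4` comes from the plan itself. The point is just to show a width type that knows about lanes, and a backend that picks a packed or scalar instruction from it.

```haskell
-- Toy model only: these are NOT GHC's actual definitions.

-- | Machine widths, extended with packed variants.
data Width
  = W8 | W16 | W32 | W64   -- scalar widths
  | W32_4                  -- four 32-bit lanes (name taken from the plan)
  | W64_2                  -- two 64-bit lanes (hypothetical)
  deriving (Eq, Show)

-- | Number of lanes held by a value of the given width.
laneCount :: Width -> Int
laneCount W32_4 = 4
laneCount W64_2 = 2
laneCount _     = 1

-- | Width of a single lane in bits.
laneBits :: Width -> Int
laneBits W8    = 8
laneBits W16   = 16
laneBits W32   = 32
laneBits W64   = 64
laneBits W32_4 = 32
laneBits W64_2 = 64

-- | Total size in bits; every packed width fills a 128-bit register.
widthInBits :: Width -> Int
widthInBits w = laneCount w * laneBits w

-- | Two hypothetical machine operations, parameterised by width.
data MachOp = IntAdd Width | FloatAdd Width
  deriving (Eq, Show)

-- | Choose an x86 mnemonic for an operation, roughly as an ASM
-- backend would. Nothing means this toy has no instruction for it.
x86Mnemonic :: MachOp -> Maybe String
x86Mnemonic (FloatAdd W32)   = Just "addss"
x86Mnemonic (FloatAdd W64)   = Just "addsd"
x86Mnemonic (FloatAdd W32_4) = Just "addps"
x86Mnemonic (FloatAdd W64_2) = Just "addpd"
x86Mnemonic (IntAdd W32_4)   = Just "paddd"
x86Mnemonic (IntAdd W64_2)   = Just "paddq"
x86Mnemonic (IntAdd _)       = Just "add"
x86Mnemonic _                = Nothing
```

Encoding the lane count in the width keeps the operation constructors unchanged, so a backend that does not understand a packed width can reject it in one place.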
 
* As has been mentioned on the developers' [http://hackage.haskell.org/trac/ghc/ticket/3557 wiki], an optimising layer of vector operations that is agnostic to the packed size would be great. It seems that this could be built on top of the CPU-specific primops without adding new primops.
* What mechanism should be used for constructing and accessing the elements of a packed data type? (LLVM has a <n x type> vector type, with insertelement and extractelement instructions for element access.)
* Stream fusion would allow complex operations on 'map'ed and 'zip'ed vectors of Floats, etc., to be optimised to make use of CPU vector instructions.
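As a rough illustration of the last three points, the sketch below builds a size-agnostic `zipWith` on top of a packed operation, using explicit construction and element access. Everything in it is an assumption made for the example: `Float4`, `pack4`, `unpack4`, `index4`, `mulF4` and `zipWithPacked` are invented names, the packed type is a plain Haskell model rather than a primop, and ordinary lists stand in for whatever fused vector representation would really be used.

```haskell
-- A model of a size-agnostic layer over a 4-lane packed operation.
-- All names are hypothetical; lists stand in for real vectors.

-- | Four Float lanes standing in for one 128-bit register.
data Float4 = Float4 !Float !Float !Float !Float
  deriving (Eq, Show)

-- | Construct a packed value from four scalars.
pack4 :: Float -> Float -> Float -> Float -> Float4
pack4 = Float4

-- | Read all four lanes back out, lowest lane first.
unpack4 :: Float4 -> [Float]
unpack4 (Float4 a b c d) = [a, b, c, d]

-- | Read a single lane; Nothing if the index is outside 0..3.
index4 :: Int -> Float4 -> Maybe Float
index4 i v
  | i >= 0 && i < 4 = Just (unpack4 v !! i)
  | otherwise       = Nothing

-- | Lane-wise multiply, standing in for a CPU-specific primop.
mulF4 :: Float4 -> Float4 -> Float4
mulF4 (Float4 a0 a1 a2 a3) (Float4 b0 b1 b2 b3) =
  Float4 (a0 * b0) (a1 * b1) (a2 * b2) (a3 * b3)

-- | zipWith that uses the packed operation on every full group of
-- four elements and falls back to the scalar operation for the rest.
-- The caller must supply a packed and a scalar version of the same
-- operation.
zipWithPacked
  :: (Float4 -> Float4 -> Float4)  -- packed operation
  -> (Float -> Float -> Float)     -- matching scalar operation
  -> [Float] -> [Float] -> [Float]
zipWithPacked vop sop (a0:a1:a2:a3:as) (b0:b1:b2:b3:bs) =
  unpack4 (vop (pack4 a0 a1 a2 a3) (pack4 b0 b1 b2 b3))
    ++ zipWithPacked vop sop as bs
zipWithPacked _ sop as bs = zipWith sop as bs
```

For example, `zipWithPacked mulF4 (*) xs ys` gives the same result as `zipWith (*) xs ys` for any list lengths; callers never see the lane count, which is the property a packed-size agnostic layer would need to preserve.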
 
