Benchmarks Game/Parallel/Mandelbrot

Mandelbrot

Old submission is:

http://shootout.alioth.debian.org/u32q/benchmark.php?test=mandelbrot&lang=ghc

Parallel Strategies: thread pool per line

Status: not submitted.

Flags:

   $ ghc --make -O2 -fglasgow-exts -threaded -fbang-patterns -funbox-strict-fields -optc-O2 mand3.hs
   $ ./mand3 6400 +RTS -N2

This is a version of the Haskell GHC madnelbrot benchmark which has been modified to compute output lines in parallel using a pool of worker threads. Each worker thread receives work from the work queue, computes a line of the output bitmap, and sends it back to the master. The master waits for the lines in order and emits them to the output. When compiled with -threaded and run with +RTS -N2 it uses two cores simultaneously.

On my dual core, running time goes from,

* original unparallelized, 9.315s
* dual core, 6.193s

The following flags should be used:

Compile time:

  ghc --make -O2 -fglasgow-exts -threaded -fbang-patterns -funbox-strict-fields -optc-O2

Runtime:

  ./mand3 6400 
  ./Parallel2 20 +RTS -N2

Code:

This code has extra annotations about the usage of function arguments, which can be removed.

{-# OPTIONS -O2 -fvia-C -fexcess-precision #-}
--
-- The Computer Language Benchmarks Game
-- http://shootout.alioth.debian.org/
--
-- Contributed by Spencer Janssen, Trevor McCort, Christophe Poucet and Don Stewart
--
-- Must be compiled with the -fexcess-precision flag as a pragma. GHC
-- currently doesn't recognise the -fexcess-precision flag on the command
-- line (!).
--
-- The following flags are suggested when compiling:
--
--      -O -fglasgow-exts -optc-march=pentium4
--      -fbang-patterns -funbox-strict-fields -optc-O2 -optc-mfpmath=sse -optc-msse2
--

import System
import System.IO
import Foreign
import Foreign.Marshal.Array
import Control.Concurrent
import Control.Concurrent.MVar
import Control.Concurrent.Chan
import Control.Monad

main = do
    -- width in pixels
    w <- getArgs >>= readIO . head
        -- width in bytes
    let n      = w `div` 8
        -- width of a pixel in the complex plane
        m  = 2 / fromIntegral w
        coords = [T 1 0 y (fromIntegral y * m - 1) | y <- [0..w-1]]
    q <- newChan
    replies <- replicateM w newEmptyMVar 
    mapM_ (writeChan q) $ zip coords replies
    replicateM_ 4 . forkIO $ worker q w m n

    putStrLn ("P4\n"++show w++" "++show w)
    mapM_ (takeMVar >=> \b -> hPutBuf stdout b n) replies 

-- Worker computes one line of the image and sends it to the master
-- q - work queue
-- w - width in pixels
-- m - width of a pixel in the complex plane
-- n - width in bytes
worker q w m n = forever (do
    (coord, reply) <- readChan q
    p <- mallocArray0 n
    unfold (next_x w m n) p coord
    putMVar reply p)

-- f - takes coordinates and returns Nothing if done
--     or the next byte of the bitmap otherwise.
-- ptr - buffer to write to
-- x0 - initial coordinates 
unfold :: (T -> Maybe (Word8,T)) -> Ptr Word8 -> T -> IO (Ptr Word8)
unfold !f !ptr !x0 = go ptr x0
  where
    -- p - pointer into the buffer
    -- x - coordinates
    go !p !x = case f x of
        Just (w,y)          -> poke p w >> go (p `plusPtr` 1) y
        Nothing             -> return ptr

-- T bs x y ci
--    bx - x position in bytes
--    x  - x position in pixels
--    y  - y position in pixels
--    ci - y position in complex plane
data T = T !Int !Int !Int !Double

-- w - image width in pixels
-- iw - pixel width in the complex plane
-- bw - image width in bytes
next_x !w !iw !bw (T bx x y ci)
    | bx == bw  = Nothing
    | otherwise = Just (loop_x w x 8 iw ci 0, T (bx+1) (x+8) y ci)

-- w - image width in pixels
-- x - current x coordinate in pixels
-- n - bit positition from 8 to 0
-- iw - pixel width in the complex plane
-- ci - current y coordinate in complex plane
-- b - accumulated bit value.
loop_x !w !x !n !iw !ci !b
    | x < w = if n == 0
                    then b
                    else loop_x w (x+1) (n-1) iw ci (b+b+v)
    | otherwise = b `shiftL` n
  where
    v = fractal 0 0 (fromIntegral x * iw - 1.5) ci 50

-- julia function (r :+ i) (cr :+ ci) with max iterations k.
fractal :: Double -> Double -> Double -> Double -> Int -> Word8
fractal !r !i !cr !ci !k
    | r2 + i2 > 4 = 0
    | k == 0      = 1
    | otherwise   = fractal (r2-i2+cr) ((r+r)*i+ci) cr ci (k-1)
  where
    (!r2,!i2) = (r*r,i*i)

Benchmarks Game/Parallel/Mandelbrot

Mandelbrot

Parallel Strategies: thread pool per line

Navigation menu

Search