Haskell for multicores


GHC Haskell comes with a large set of libraries and tools for building programs that exploit multicore architectures.

This site attempts to document all our available information on exploiting such hardware with Haskell.

Throughout, we focus on exploiting shared-memory SMP systems, with the aim of lowering absolute wall-clock times. The machines we target are typical 2x to 32x desktop multicore machines, on which vanilla GHC will run.

Introduction

To get an idea of what we aim to do -- reduce running times by exploiting more cores -- here's the "hello, world" of parallel programs: a naive parallel fib. It simply tells us whether or not the SMP runtime is working:

    import Control.Parallel
    import Control.Monad
    import Text.Printf

    cutoff = 35

    fib' :: Int -> Integer
    fib' 0 = 0
    fib' 1 = 1
    fib' n = fib' (n-1) + fib' (n-2)

    fib :: Int -> Integer
    fib n | n < cutoff = fib' n
          | otherwise  = r `par` (l `pseq` l + r)
     where
        l = fib (n-1)
        r = fib (n-2)

    main = forM_ [0..45] $ \i ->
                printf "n=%d => %d\n" i (fib i)

We compile it with the `-threaded` flag:

   $ ghc -O2 -threaded --make fib.hs
   [1 of 1] Compiling Main             ( fib.hs, fib.o )
   Linking fib ...

And run it with:

   +RTS -Nx

where 'x' is the number of cores you have (or a slightly higher value). Here, on a quad-core Linux system:

   ./fib +RTS -N4  76.81s user 0.75s system 351% cpu 22.059 total

So we were able to use about 3.5 of the 4 available cores. This is typical: most problems don't scale perfectly, and we trade the work gained from running on more cores against the overhead of communication between them.

Examples

Further reading

Thread primitives

Control.Concurrent

  • forkIO

TODO - finish

For explicit concurrency and/or parallelism, Haskell implementations provide a lightweight thread system that schedules logical threads onto the available operating system threads. These cheap, lightweight threads are created with forkIO. Full OS threads are not discussed here beyond noting that they carry significantly higher overhead; if you truly need one, create it with forkOS.

    forkIO :: IO () -> IO ThreadId

Let's take a simple Haskell application that hashes two files and prints the result:

    import Data.Digest.Pure.MD5 (md5)
    import qualified Data.ByteString.Lazy as L
    import System.Environment (getArgs)

    main = do
        [fileA, fileB] <- getArgs
        hashAndPrint fileA
        hashAndPrint fileB

    hashAndPrint f = L.readFile f >>= return . md5 >>= \h -> putStrLn (f ++ ": " ++ show h)

This is a straightforward solution that hashes the files one at a time, printing each resulting hash to the screen. What if we wanted to use more than one processor to hash the files in parallel?

One solution is to start a new thread, hash in parallel, and print the answers as they are computed:

    import Control.Concurrent (forkIO)
    import Data.Digest.Pure.MD5 (md5)
    import qualified Data.ByteString.Lazy as L
    import System.Environment (getArgs)

    main = do
        [fileA,fileB] <- getArgs
        forkIO $ hashAndPrint fileA
        hashAndPrint fileB

    hashAndPrint f = L.readFile f >>= return . md5 >>= \h -> putStrLn (f ++ ": " ++ show h)

Now we have a rough program with a respectable performance boost, which is expected given that the computation is trivially parallel.

But wait! You say there is a bug? Two, actually. One is that if the main thread finishes hashing fileB first, the program will exit before the child thread is done with fileA. The second is the potential for garbled output when both threads write to stdout at the same time. Both problems can be solved with some inter-thread communication; we'll pick this example up in the MVar section below.

Further reading

Synchronisation with locks

Control.Concurrent.MVar

  • MVar

Todo
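
Nothing is written here yet, but as promised above, here is a hedged sketch of how the forkIO hashing example can be repaired with an MVar: the child thread hands its digest back through the MVar, and the main thread both waits for it and does all of the printing. The helper hashFile is our own name for this sketch; newEmptyMVar, putMVar and takeMVar come from Control.Concurrent.MVar.

    import Control.Concurrent (forkIO)
    import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
    import Data.Digest.Pure.MD5 (md5)
    import qualified Data.ByteString.Lazy as L
    import System.Environment (getArgs)

    main = do
        [fileA, fileB] <- getArgs
        resultA <- newEmptyMVar
        -- The child hashes fileA and hands the digest back through the MVar.
        forkIO $ hashFile fileA >>= putMVar resultA
        -- Meanwhile the main thread hashes fileB itself.
        hashB <- hashFile fileB
        -- takeMVar blocks until the child is done, so the program cannot
        -- exit early, and only the main thread ever writes to stdout.
        hashA <- takeMVar resultA
        putStrLn (fileA ++ ": " ++ show hashA)
        putStrLn (fileB ++ ": " ++ show hashB)

    -- Force the digest with ($!) so the hashing happens in the thread
    -- that calls hashFile, not later when the result is shown.
    hashFile f = L.readFile f >>= \bs -> return $! md5 bs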

Further reading

Message passing channels

Control.Concurrent.Chan

  • Chan

Todo
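
Nothing is written here yet either; as a hedged placeholder, the sketch below shows the basic Control.Concurrent.Chan pattern: several forked workers write their results into an unbounded channel and a single reader drains it, so only one thread ever touches stdout. The worker messages are made up for the example.

    import Control.Concurrent (forkIO)
    import Control.Concurrent.Chan (newChan, readChan, writeChan)
    import Control.Monad (forM_, replicateM_)

    main = do
        results <- newChan
        -- One worker per task; each posts its answer into the channel.
        forM_ [1 .. 4 :: Int] $ \n ->
            forkIO $ writeChan results ("worker " ++ show n ++ " finished")
        -- Only the main thread prints, so the output cannot be interleaved.
        replicateM_ 4 (readChan results >>= putStrLn)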

Examples

Further reading

Lock-free synchronisation

Software Transactional Memory

  • STM

Todo
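
Again only a placeholder for now. The sketch below, assuming the stm package is installed, uses atomically with a TVar to let several threads bump a shared counter without any locks: each read-modify-write runs as one atomic transaction. The threadDelay at the end is a crude way to give the workers time to finish before reading the total.

    import Control.Concurrent (forkIO, threadDelay)
    import Control.Concurrent.STM (atomically, newTVar, readTVar, writeTVar)
    import Control.Monad (forM_, replicateM_)

    main = do
        counter <- atomically (newTVar (0 :: Int))
        -- Four threads each perform 1000 increments; every increment is a
        -- single atomic transaction, so no updates are lost.
        forM_ [1 .. 4 :: Int] $ \_ ->
            forkIO $ replicateM_ 1000 $ atomically $ do
                n <- readTVar counter
                writeTVar counter (n + 1)
        threadDelay 1000000   -- crude: wait for the workers to finish
        atomically (readTVar counter) >>= print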

Further reading

Asynchronous messages

Control.Exception (asynchronous exceptions)

  • Async exceptions


Todo
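
As a hedged sketch of the idea: an asynchronous exception is thrown into a thread from the outside. Here the main thread forks a worker that would loop forever, then stops it with killThread, which delivers a ThreadKilled exception to the worker.

    import Control.Concurrent (forkIO, killThread, threadDelay)
    import Control.Monad (forever)

    main = do
        -- A worker that would run forever if nobody interrupted it.
        tid <- forkIO $ forever $ do
            putStrLn "working..."
            threadDelay 200000
        threadDelay 1000000
        -- Interrupt it from outside with an asynchronous exception.
        killThread tid
        putStrLn "worker stopped"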

Examples

Further reading

Parallelism strategies

Control.Parallel

  • Parallel, pure strategies

Todo
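
Only a stub so far; here is a hedged sketch using just par and pseq from Control.Parallel, the same combinators as in the fib example above: spark the evaluation of one half of a pure computation while the current thread works on the other half. Compile with -threaded and run with +RTS -N2 to use both cores.

    import Control.Parallel (par, pseq)

    -- Spark the sum of the second half while this thread sums the first,
    -- then combine the two results.
    parSum :: [Int] -> Int
    parSum xs = right `par` (left `pseq` left + right)
      where
        (as, bs) = splitAt (length xs `div` 2) xs
        left     = sum as
        right    = sum bs

    main = print (parSum [1 .. 10000000])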

Further reading

Data parallel arrays

Data Parallel Arrays

Todo

Further reading

Foreign language calls and concurrency

Non-blocking foreign calls in concurrent threads.
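
As a hedged sketch (assuming a POSIX system that provides sleep(3) in unistd.h): a foreign import marked safe may block inside C, and with the threaded RTS the blocked call gets its own OS thread, so the other Haskell threads keep running.

    {-# LANGUAGE ForeignFunctionInterface #-}
    import Control.Concurrent (forkIO, threadDelay)
    import Foreign.C.Types (CUInt)

    -- A 'safe' call may block in C without stopping the rest of the
    -- program (compile with -threaded).
    foreign import ccall safe "unistd.h sleep"
        c_sleep :: CUInt -> IO CUInt

    main = do
        forkIO (c_sleep 2 >> putStrLn "C call returned")
        -- The main thread is not held up by the sleeping C call.
        mapM_ (\n -> print n >> threadDelay 500000) [1 .. 6 :: Int]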

Profiling and measurement

   +RTS -sstderr
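
For example, to collect runtime statistics for the fib program from the introduction (the exact figures printed depend on the GHC version):

   $ ./fib +RTS -N4 -sstderr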

Further reading

Todo