Difference between revisions of "Concurrency"

From HaskellWiki
(fixing dead link)
Line 1: Line 1:
 
[[Category:GHC|Concurrency]]
 
[[Category:GHC|Concurrency]]
== Concurrent programming in GHC ==
+
== Parallel and Concurrent Programming in GHC ==
   
This page contains notes and information about how to write concurrent programs in GHC.
+
This page contains notes and information about how to write concurrent and/or parallel programs in GHC.
   
  +
GHC provides multi-scale support for parallel programming, from very fine-grained, small "sparks", to coarse-grained explicit threads and locks, along with other models of concurrent and parallel programming, including actors, CSP-style concurrency, nested data parallelism and Intel Concurrent Collections. Synchronization between tasks is possible via messages, regular Haskell variables, MVar shared state or transactional memory.
Please feel free to add stuff here (Edit page link at the bottom).
 
  +
  +
* See "Real World Haskell" [http://book.realworldhaskell.org/read/concurrent-and-multicore-programming.html chapter 24], for an introduction to the most common forms of concurrent and parallel programming in GHC.
  +
* A [http://donsbot.wordpress.com/2009/09/03/parallel-programming-in-haskell-a-reading-list/ reading list for parallelism in Haskell].
  +
* The [http://stackoverflow.com/questions/3063652/whats-the-status-of-multicore-programming-in-haskell status of parallel and concurrent programming] in Haskell.
  +
  +
The concurrent and parallel programming models in GHC can be divided into the following forms:
  +
  +
* Very fine grained: parallel sparks and futures, as described in the paper "[http://www.haskell.org/~simonmar/bib/multicore-ghc-09_abstract.html Runtime Support for Multicore Haskell]"
  +
* Fine grained: lightweight Haskell threads, explicit synchronization with STM or MVars. See the paper "Tackling the Awkward Squad" below.
  +
* Nested data parallelism: a parallel programming model based on bulk data parallelism, in the form of the [http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell DPH] and [http://hackage.haskell.org/package/repa Repa] libraries for transparently parallel arrays.
  +
* Intel [http://software.intel.com/en-us/blogs/2010/05/27/announcing-intel-concurrent-collections-for-haskell-01/ Concurrent Collections for Haskell]: a graph-oriented parallel programming model.
  +
* [http://www.cs.kent.ac.uk/projects/ofa/chp/ CHP]: CSP-style concurrency for Haskell.
  +
  +
The most important (as of 2010) to get to know are the basic "concurrent Haskell" model of threads using forkIO and MVars, the use of transactional memory via STM, implicit parallelism via sparks and, if you're interested in scientific programming specifically, nested data parallelism in Haskell.
   
 
=== Starting points ===
 
=== Starting points ===
   
  +
* '''Basic concurrency: forkIO and MVars'''.
* '''Basic concurrency: forkIO and MVars'''. Read [http://research.microsoft.com/Users/simonpj/papers/marktoberdorf/marktoberdorf.ps.gz Tackling the awkward squad: monadic input/output, concurrency, exceptions, and foreign-language calls in Haskell].<p>The [http://www.haskell.org/ghc/docs/papers/concurrent-haskell.ps.gz original paper about Concurrent Haskell] contains quite a few examples about how to write concurrent programs. A larger example is [http://www.haskell.org/~simonmar/papers/web-server.ps.gz Writing High-Performance Server Applications in Haskell, Case Study: A Haskell Web Server]
 
</p>
 
 
* '''Software Transactional Memory''' (STM) is a new way to coordinate concurrent threads. There's a separate [[Software transactional memory|Wiki page devoted to STM]].
 
* '''Software Transactional Memory''' (STM) is a new way to coordinate concurrent threads. There's a separate [[Software transactional memory|Wiki page devoted to STM]].
 
: STM was added to GHC 6.4, and is described in the paper [http://research.microsoft.com/~simonpj/papers/stm/index.htm Composable memory transactions]. The paper [http://research.microsoft.com/~simonpj/papers/stm/lock-free.htm Lock-free data structures using Software Transactional Memory in Haskell] gives further examples of concurrent programming using STM.
 
: STM was added to GHC 6.4, and is described in the paper [http://research.microsoft.com/~simonpj/papers/stm/index.htm Composable memory transactions]. The paper [http://research.microsoft.com/~simonpj/papers/stm/lock-free.htm Lock-free data structures using Software Transactional Memory in Haskell] gives further examples of concurrent programming using STM.
Line 16: Line 29:
   
 
* '''Nested Data Parallelism'''. For an approach to exploiting the implicit parallelism in array programs for multiprocessors, see [[GHC/Data Parallel Haskell|Data Parallel Haskell]] (work in progress).
 
* '''Nested Data Parallelism'''. For an approach to exploiting the implicit parallelism in array programs for multiprocessors, see [[GHC/Data Parallel Haskell|Data Parallel Haskell]] (work in progress).
 
   
 
=== Using concurrency in GHC ===
 
=== Using concurrency in GHC ===
Line 24: Line 36:
 
* The GHC manual gives a few useful flags that control scheduling (not usually necessary) [http://www.haskell.org/ghc/docs/latest/html/users_guide/sec-using-parallel.html#parallel-rts-opts RTS options].
 
* The GHC manual gives a few useful flags that control scheduling (not usually necessary) [http://www.haskell.org/ghc/docs/latest/html/users_guide/sec-using-parallel.html#parallel-rts-opts RTS options].
   
 
=== Multicore GHC ===
   
 
Since 2004, GHC supports running programs in parallel on an SMP or multi-core machine. How to do it:
=== Multiprocessor GHC ===
 
   
  +
* [http://haskell.org/platform Download a recent GHC].
As of version 6.5, GHC supports running programs in parallel on an SMP or multi-core machine. How to do it:
 
   
  +
* Compile your program using the <tt>-threaded</tt> switch.
* You'll need to get a version of GHC that supports SMP. Either download a [http://www.haskell.org/ghc/dist/current/dist nightly snapshot distribution], or [http://hackage.haskell.org/trac/ghc/wiki/GhcDarcs get the sources] from darcs and build it yourself.
 
 
* You need to link your program using the <tt>-threaded</tt> switch. (NOTE: previously it was necessary to compile all code, including libraries, with the <tt>-smp</tt> switch, this is no longer the case. The <tt>-smp</tt> flag is now a synonym for <tt>-threaded</tt>).
 
   
 
* Run the program with <tt>+RTS -N2</tt> to use 2 threads, for example. You should use a <tt>-N</tt> value equal to the number of CPU cores on your machine (not including Hyper-threading cores).
 
* Run the program with <tt>+RTS -N2</tt> to use 2 threads, for example. You should use a <tt>-N</tt> value equal to the number of CPU cores on your machine (not including Hyper-threading cores).
   
* Concurrent threads (<tt>forkIO</tt> and <tt>forkOS</tt>) will run in parallel, and you can also use the <tt>par</tt> combinator and Strategies from the [http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Parallel-Strategies.html Control.Parallel.Strategies] module to create parallelism.
+
* Concurrent threads (<tt>forkIO</tt>) will run in parallel, and you can also use the <tt>par</tt> combinator and Strategies from the [http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Parallel-Strategies.html Control.Parallel.Strategies] module to create parallelism.
   
 
* Use <tt>+RTS -sstderr</tt> for timing stats.
 
* Use <tt>+RTS -sstderr</tt> for timing stats.
   
  +
* To debug parallel program performance, use [http://research.microsoft.com/en-us/projects/threadscope/ ThreadScope].
=== Links to related work on parallel and distributed Haskell (many based on GHC) ===
 
   
  +
=== Related work ===
  +
  +
* The Sun project to improve [http://ghcsparc.blogspot.com/ GHC performance on Sparc]
  +
* A [http://www.well-typed.com/blog/38 Microsoft project to improve industrial applications of GHC parallelism].
  +
* [http://www.haskell.org/~simonmar/bib/bib.html Simon Marlow's publications on parallelism and GHC]
 
* [http://www.macs.hw.ac.uk/~dsg/gph/ Glasgow Parallel Haskell]
 
* [http://www.macs.hw.ac.uk/~dsg/gph/ Glasgow Parallel Haskell]
 
* [http://www.macs.hw.ac.uk/~dsg/gdh/ Glasgow Distributed Haskell]
 
* [http://www.macs.hw.ac.uk/~dsg/gdh/ Glasgow Distributed Haskell]
Line 46: Line 62:
 
* http://www.informatik.uni-kiel.de/~fhu/PUBLICATIONS/1999/ifl.html
 
* http://www.informatik.uni-kiel.de/~fhu/PUBLICATIONS/1999/ifl.html
 
* [http://www.mathematik.uni-marburg.de/~eden Eden]
 
* [http://www.mathematik.uni-marburg.de/~eden Eden]
 
== Problems with GHC implementation before 6.6.1 ==
 
 
There are critical differences between the description in the paper "Asynchronous exceptions in Haskell" by Simon Marlow, Simon Peyton Jones, Andy Moran and John Reppy (PLDI'01) and the implementation in GHC 6.4 and GHC 6.6 today.

Some of the bad effects are described under [[GHC/Concurrency/Flaws|throwTo & block statements considered harmful]].

Versions of GHC from 6.6.1 onward have fixed these problematic differences.
 
 
----
 

Revision as of 03:54, 28 July 2010

Parallel and Concurrent Programming in GHC

This page contains notes and information about how to write concurrent and/or parallel programs in GHC.

GHC provides multi-scale support for parallel programming, from very fine-grained, small "sparks", to coarse-grained explicit threads and locks, along with other models of concurrent and parallel programming, including actors, CSP-style concurrency, nested data parallelism and Intel Concurrent Collections. Synchronization between tasks is possible via messages, regular Haskell variables, MVar shared state or transactional memory.

The concurrent and parallel programming models in GHC can be divided into the following forms:

  • Very fine grained: parallel sparks and futures, as described in the paper "Runtime Support for Multicore Haskell"
  • Fine grained: lightweight Haskell threads, explicit synchronization with STM or MVars. See the paper "Tackling the Awkward Squad" below.
  • Nested data parallelism: a parallel programming model based on bulk data parallelism, in the form of the DPH and Repa libraries for transparently parallel arrays.
  • Intel Concurrent Collections for Haskell: a graph-oriented parallel programming model.
  • CHP: CSP-style concurrency for Haskell.

The most important models to get to know (as of 2010) are the basic "concurrent Haskell" model of threads using forkIO and MVars, transactional memory via STM, implicit parallelism via sparks and, if you're interested in scientific programming specifically, nested data parallelism.

Starting points

  • Basic concurrency: forkIO and MVars.
  • Software Transactional Memory (STM) is a new way to coordinate concurrent threads. There's a separate Wiki page devoted to STM.
STM was added to GHC 6.4, and is described in the paper Composable memory transactions. The paper Lock-free data structures using Software Transactional Memory in Haskell gives further examples of concurrent programming using STM.
  • Nested Data Parallelism. For an approach to exploiting the implicit parallelism in array programs for multiprocessors, see Data Parallel Haskell (work in progress).
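
As a minimal sketch of the first two starting points (it is not taken from the papers mentioned above), the program below forks lightweight threads with forkIO, collects their results through MVars, and then updates a shared counter inside STM transactions. The names sumChunks and countWithSTM are hypothetical, and Control.Concurrent.STM is assumed to be available from the stm package.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM (atomically, modifyTVar', newTVarIO, readTVarIO)
import Control.Monad (forM_, replicateM_)

-- | Sum each chunk in its own lightweight thread. Every worker hands its
-- result back through a dedicated MVar (hypothetical helper).
sumChunks :: [[Int]] -> IO Int
sumChunks chunks = do
  boxes   <- mapM spawn chunks
  results <- mapM takeMVar boxes      -- blocks until each worker has written
  return (sum results)
  where
    spawn chunk = do
      box <- newEmptyMVar
      -- ($!) forces the sum inside the worker thread, not in the reader.
      _ <- forkIO (putMVar box $! sum chunk)
      return box

-- | Increment one shared TVar from several threads. Each increment is a
-- separate transaction, so no update is lost (hypothetical helper).
countWithSTM :: Int -> Int -> IO Int
countWithSTM workers perWorker = do
  counter <- newTVarIO 0
  done    <- newEmptyMVar
  forM_ [1 .. workers] $ \_ -> forkIO $ do
    replicateM_ perWorker (atomically (modifyTVar' counter (+ 1)))
    putMVar done ()                   -- signal that this worker is finished
  replicateM_ workers (takeMVar done) -- wait for every worker
  readTVarIO counter

main :: IO ()
main = do
  total <- sumChunks [[1 .. 100], [101 .. 200], [201 .. 300]]
  print total
  n <- countWithSTM 4 1000
  print n
```

The sketch does no exception handling: if a worker died before writing to its MVar, the corresponding takeMVar would block, so real code would usually add cleanup (for example with finally) or use a higher-level library.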

Using concurrency in GHC

  • The GHC manual describes a few RTS options that control scheduling (not usually necessary).

Multicore GHC

Since 2004, GHC has supported running programs in parallel on an SMP or multi-core machine. To do so:

  • Compile your program using the -threaded switch.
  • Run the program with +RTS -N2 to use 2 threads, for example. You should use a -N value equal to the number of CPU cores on your machine (not including Hyper-threading cores).
  • Concurrent threads (forkIO) will run in parallel, and you can also use the par combinator and Strategies from the Control.Parallel.Strategies module to create parallelism.
  • Use +RTS -sstderr for timing stats.
  • To debug parallel program performance, use ThreadScope.
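
The steps above can be sketched as follows. This is only an illustration: it assumes the parallel package (which provides Control.Parallel and Control.Parallel.Strategies) is installed, and the names fib and parFib, the cutoff of 25 and the file name Par.hs are all hypothetical choices.

```haskell
-- Compile: ghc -O2 -threaded Par.hs
-- Run:     ./Par +RTS -N2 -sstderr
import Control.Parallel (par, pseq)
import Control.Parallel.Strategies (parMap, rseq)

-- | Naive Fibonacci, deliberately expensive so there is work to share.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- | Spark the left branch with par while the current thread evaluates the
-- right branch. Below the (arbitrary) cutoff the work is too small to be
-- worth a spark, so it runs sequentially.
parFib :: Int -> Integer
parFib n
  | n < 25    = fib n
  | otherwise = l `par` (r `pseq` (l + r))
  where
    l = parFib (n - 1)
    r = parFib (n - 2)

main :: IO ()
main = do
  print (parFib 30)
  -- parMap with the rseq strategy evaluates each list element in parallel.
  print (sum (parMap rseq fib [20 .. 27]))
```

Without -threaded, or when run with a single capability, the program still computes the same results; the sparks are simply evaluated sequentially. The -sstderr statistics and ThreadScope can then show whether the sparks were actually run in parallel.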

Related work