Personal tools

Parallelism

From HaskellWiki

(Difference between revisions)
Jump to: navigation, search
(Parallel Programming in GHC)
m (swap order of pure parallelism and concurrency, put pure first)
(8 intermediate revisions by one user not shown)
Line 1: Line 1:
== Parallel Programming in Haskell ==
 
 
 
Parallelism is about speeding up a program by using multiple processors.
 
Parallelism is about speeding up a program by using multiple processors.
   
 
In Haskell we provide two ways to achieve parallelism:
 
In Haskell we provide two ways to achieve parallelism:
- Concurrency, which can be used for parallelising IO.
+
* Pure parallelism, which can be used to speed up pure (non-IO) parts of the program.
- Pure parallelism, which can be used to speed up pure (non-IO)
+
* Concurrency, which can be used for parallelising IO.
parts of the program.
 
   
[[Concurrency]] (Control.Concurrent):
+
Pure Parallelism (Control.Parallel): Speeding up a pure computation using multiple processors. Pure parallelism has these advantages:
Multiple threads of control that execute "at the same time".
+
* Guaranteed deterministic (same result every time)
- Threads are in the IO monad
+
* no [[race conditions]] or [[deadlocks]]
- IO operations from multiple threads are interleaved
 
non-deterministically
 
- communication between threads must be explicitly programmed
 
- Threads may execute on multiple processors simultaneously
 
- dangers: [[race conditions]] and [[deadlocks]]
 
   
Pure Parallelism (Control.Parallel):
+
[[Concurrency]] (Control.Concurrent): Multiple threads of control that execute "at the same time".
Speeding up a pure computation using multiple processors.
+
* Threads are in the IO monad
- Pure parallelism has these advantages:
+
* IO operations from multiple threads are interleaved non-deterministically
- guaranteed deterministic (same result every time)
+
* communication between threads must be explicitly programmed
- no [[race conditions]] or [[deadlocks]]
+
* Threads may execute on multiple processors simultaneously
  +
* Dangers: [[race conditions]] and [[deadlocks]]
   
 
Rule of thumb: use Pure Parallelism if you can, Concurrency otherwise.
 
Rule of thumb: use Pure Parallelism if you can, Concurrency otherwise.
   
=== Starting points ===
+
== Starting points ==
 
* '''Control.Parallel'''. The first thing to start with parallel programming in Haskell is the use of par/pseq from the parallel library.
 
   
* '''Nested Data Parallelism'''. For an approach to exploiting the implicit parallelism in array programs for multiprocessors, see [[GHC/Data Parallel Haskell|Data Parallel Haskell]] (work in progress).
+
* '''Control.Parallel'''. The first thing to start with parallel programming in Haskell is the use of par/pseq from the parallel library. Try the Real World Haskell [http://book.realworldhaskell.org/read/concurrent-and-multicore-programming.html chapter on parallelism and concurrency]. The parallelism-specific parts are in the second half of the chapter.
  +
* If you need more control, try Strategies or perhaps the Par monad
   
=== Multicore GHC ===
+
== Multicore GHC ==
   
 
{{GHC/Multicore}}
 
{{GHC/Multicore}}
   
=== Alternative approaches ===
+
== Alternative approaches ==
   
 
* Nested data parallelism: a parallel programming model based on bulk data parallelism, in the form of the [http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell DPH] and [http://hackage.haskell.org/package/repa Repa] libraries for transparently parallel arrays.
 
* Nested data parallelism: a parallel programming model based on bulk data parallelism, in the form of the [http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell DPH] and [http://hackage.haskell.org/package/repa Repa] libraries for transparently parallel arrays.
 
* Intel [http://software.intel.com/en-us/blogs/2010/05/27/announcing-intel-concurrent-collections-for-haskell-01/ Concurrent Collections for Haskell]: a graph-oriented parallel programming model.
 
* Intel [http://software.intel.com/en-us/blogs/2010/05/27/announcing-intel-concurrent-collections-for-haskell-01/ Concurrent Collections for Haskell]: a graph-oriented parallel programming model.
   
=== Related work ===
+
== See also ==
   
* [[Parallel]] portal
+
* The [[Parallel|parallelism and concurrency portal]]
  +
* Parallel [[Parallel/Reading|reading list]]
 
* [[Parallel/Research|Ongoing research in Parallel Haskell]]
 
* [[Parallel/Research|Ongoing research in Parallel Haskell]]
* The Sun project to improve http://ghcsparc.blogspot.com/ GHC performance on Sparc]
 
* A [http://www.well-typed.com/blog/38 Microsoft project to improve industrial applications of GHC parallelism].
 

Revision as of 14:25, 20 April 2011

Parallelism is about speeding up a program by using multiple processors.

In Haskell we provide two ways to achieve parallelism:

  • Pure parallelism, which can be used to speed up pure (non-IO) parts of the program.
  • Concurrency, which can be used for parallelising IO.

Pure Parallelism (Control.Parallel): Speeding up a pure computation using multiple processors. Pure parallelism has these advantages:

Concurrency (Control.Concurrent): Multiple threads of control that execute "at the same time".

  • Threads are in the IO monad
  • IO operations from multiple threads are interleaved non-deterministically
  • communication between threads must be explicitly programmed
  • Threads may execute on multiple processors simultaneously
  • Dangers: race conditions and deadlocks

Rule of thumb: use Pure Parallelism if you can, Concurrency otherwise.

Contents

1 Starting points

  • Control.Parallel. The first thing to start with parallel programming in Haskell is the use of par/pseq from the parallel library. Try the Real World Haskell chapter on parallelism and concurrency. The parallelism-specific parts are in the second half of the chapter.
  • If you need more control, try Strategies or perhaps the Par monad

2 Multicore GHC

Since 2004, GHC supports running programs in parallel on an SMP or multi-core machine. How to do it:

  • Compile your program using the -threaded switch.
  • Run the program with +RTS -N2 to use 2 threads, for example (RTS stands for runtime system; see the GHC users' guide). You should use a -N value equal to the number of CPU cores on your machine (not including Hyper-threading cores). As of GHC v6.12, you can leave off the number of cores and all available cores will be used (you still need to pass -N however, like so: +RTS -N).
  • Concurrent threads (forkIO) will run in parallel, and you can also use the par combinator and Strategies from the Control.Parallel.Strategies module to create parallelism.
  • Use +RTS -sstderr for timing stats.
  • To debug parallel program performance, use ThreadScope.

3 Alternative approaches

  • Nested data parallelism: a parallel programming model based on bulk data parallelism, in the form of the DPH and Repa libraries for transparently parallel arrays.
  • Intel Concurrent Collections for Haskell: a graph-oriented parallel programming model.

4 See also