Difference between revisions of "Parallelism"

Latest revision as of 13:09, 24 December 2014

Parallelism is about speeding up a program by using multiple processors.

In Haskell we provide two ways to achieve parallelism:

* Pure parallelism, which can be used to speed up non-IO parts of the program.
* Concurrency, which can be used for parallelising IO.

Pure Parallelism (Control.Parallel): Speeding up a pure computation using multiple processors. Pure parallelism has these advantages:

* Guaranteed deterministic (same result every time)
* No race conditions or deadlocks
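
As a minimal sketch of the pure approach described above, the following uses par and pseq from Control.Parallel (which assumes the parallel package is installed). The names fib and parFibSum are illustrative only. Whether or not the spark for x is actually picked up by another core, the result is the same on every run.

```haskell
import Control.Parallel (par, pseq)

-- Naive Fibonacci, used only as a CPU-bound workload.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- `par` sparks x for possible evaluation on another core;
-- `pseq` forces y on the current thread before the sum is computed.
parFibSum :: Int -> Int -> Integer
parFibSum a b = x `par` (y `pseq` (x + y))
  where
    x = fib a
    y = fib b

main :: IO ()
main = print (parFibSum 25 24)  -- prints 121393
```

The function stays pure: parFibSum has an ordinary non-IO type, and par only adds a hint about evaluation order.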

Concurrency (Control.Concurrent): Multiple threads of control that execute "at the same time".

* Threads are in the IO monad
* IO operations from multiple threads are interleaved non-deterministically
* Communication between threads must be explicitly programmed
* Threads may execute on multiple processors simultaneously
* Dangers: race conditions and deadlocks
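
The points above can be sketched with forkIO and an MVar from the base library. This is only an illustration under simple assumptions: worker and sumOfSquares are hypothetical names, and the workload is trivial. Each thread lives in IO, sends its result back explicitly, and the order in which results arrive is not fixed, so the example combines them with an order-independent sum.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM)

-- Each worker computes a result and sends it back explicitly
-- through the shared MVar.
worker :: MVar Int -> Int -> IO ()
worker results n = putMVar results (n * n)

-- Fork one thread per input, then collect exactly as many results.
-- Arrival order is non-deterministic, so the results are only summed.
sumOfSquares :: [Int] -> IO Int
sumOfSquares ns = do
  results <- newEmptyMVar
  forM_ ns $ \n -> forkIO (worker results n)
  sum <$> replicateM (length ns) (takeMVar results)

main :: IO ()
main = sumOfSquares [1 .. 10] >>= print  -- prints 385
```

Taking one result per forked thread is what keeps main from exiting early or blocking forever; miscounting here is a simple way to produce the deadlocks mentioned above.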

Rule of thumb: use Pure Parallelism if you can, Concurrency otherwise.

Starting points

* Control.Parallel. The best place to start with parallel programming in Haskell is the par and pseq combinators from the parallel library. Try the Real World Haskell chapter on parallelism and concurrency; the parallelism-specific parts are in the second half of the chapter.
* If you need more control, try Strategies or perhaps the Par monad.
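
For a rough idea of what Strategies look like, here is a small sketch using Control.Parallel.Strategies from the parallel package. The function collatzLength is a made-up workload, not something from the library. It shows two equivalent spellings: parMap, and an ordinary map with a strategy attached through using.

```haskell
import Control.Parallel.Strategies (parList, parMap, rseq, using)

-- A stand-in for real per-element work: the number of terms in the
-- Collatz sequence starting at n (for n >= 1).
collatzLength :: Int -> Int
collatzLength = go 1
  where
    go acc 1 = acc
    go acc n
      | even n    = go (acc + 1) (n `div` 2)
      | otherwise = go (acc + 1) (3 * n + 1)

-- parMap creates one spark per list element.
lengthsA :: [Int]
lengthsA = parMap rseq collatzLength [1 .. 1000]

-- The same computation as ordinary code plus a strategy: `using`
-- attaches an evaluation plan without changing the result.
lengthsB :: [Int]
lengthsB = map collatzLength [1 .. 1000] `using` parList rseq

main :: IO ()
main = print (maximum lengthsA, lengthsA == lengthsB)
```

The second form keeps the algorithm and the parallel evaluation plan separate, so the strategy can be changed or removed without touching the map itself.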

Multicore GHC

Since 2004, GHC has supported running programs in parallel on an SMP or multi-core machine. How to do it:

Download a recent GHC.

Compile your program using the -threaded switch.

Run the program with +RTS -N2 to use 2 threads, for example (RTS stands for runtime system; see the GHC users' guide). Use a -N value equal to the number of physical CPU cores on your machine (not counting Hyper-Threading cores). As of GHC 6.12, you can omit the number and all available cores will be used, but you must still pass the flag itself: +RTS -N.

Concurrent threads (forkIO) will run in parallel, and you can also use the par combinator and Strategies from the Control.Parallel.Strategies module to create parallelism.

Use +RTS -sstderr for timing stats.

To debug parallel program performance, use ThreadScope.
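
To check that the steps above took effect, a program can ask the runtime how it was started. The sketch below uses getNumCapabilities and rtsSupportsBoundThreads from Control.Concurrent and getNumProcessors from GHC.Conc. The file name Main.hs in the comments is an assumption, and newer GHC versions may also need -rtsopts at link time before they accept some RTS flags.

```haskell
import Control.Concurrent (getNumCapabilities, rtsSupportsBoundThreads)
import GHC.Conc (getNumProcessors)

-- Build and run (the file name Main.hs is illustrative):
--   ghc -threaded Main.hs
--   ./Main +RTS -N2
main :: IO ()
main = do
  caps  <- getNumCapabilities  -- the value chosen with +RTS -N (1 if omitted)
  procs <- getNumProcessors    -- processors the runtime detects
  putStrLn ("threaded RTS: " ++ show rtsSupportsBoundThreads)
  putStrLn ("capabilities: " ++ show caps)
  putStrLn ("processors:   " ++ show procs)
```

If the first line reports False, the program was not linked with -threaded, and extra capabilities are not available to it.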

Alternative approaches

* Nested data parallelism: a parallel programming model based on bulk data parallelism, in the form of the DPH and Repa libraries for transparently parallel arrays.
* monad-par and LVish provide Par monads that can structure parallel computations over "monotonic" data structures, which in turn can be used from within purely functional programs.
* [OLD] Intel Concurrent Collections for Haskell: a graph-oriented parallel programming model.
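
As a hedged illustration of the Par monad style mentioned above, the following sketch uses runPar, spawnP and get from Control.Monad.Par, which assumes the monad-par package is installed. The name sumAndProduct and the two workloads are invented for the example. Each spawnP returns an IVar that is written exactly once, which is one simple case of a "monotonic" structure.

```haskell
import Control.Monad.Par (get, runPar, spawnP)

-- Two independent pure computations run as Par tasks. Each IVar is
-- filled once and then only read, so the result is deterministic.
sumAndProduct :: Integer
sumAndProduct = runPar $ do
  s <- spawnP (sum [1 .. 100])
  p <- spawnP (product [1 .. 10])
  a <- get s
  b <- get p
  return (a + b)

main :: IO ()
main = print sumAndProduct  -- prints 3633850
```

Because runPar returns an ordinary value, sumAndProduct can be called from pure code like any other expression.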

@@ Line 1: / Line 1: @@
-== Parallel Programming in Haskell ==
 Parallelism is about speeding up a program by using multiple processors.
 In Haskell we provide two ways to achieve parallelism:
-  - Concurrency, which can be used for parallelising IO.
+* Pure parallelism, which can be used to speed up non-IO parts of the program.
-  - Pure parallelism, which can be used to speed up pure (non-IO)
+* Concurrency, which can be used for parallelising IO.
-    parts of the program.
+Pure Parallelism (Control.Parallel): Speeding up a pure computation using multiple processors. Pure parallelism has these advantages:
-[[Concurrency]] (Control.Concurrent):
+* Guaranteed deterministic (same result every time)
-    Multiple threads of control that execute "at the same time".
+* no [http://en.wikipedia.org/wiki/Race_condition race conditions] or [http://en.wikipedia.org/wiki/Deadlock deadlocks]
-    - Threads are in the IO monad
-    - IO operations from multiple threads are interleaved
-      non-deterministically
-    - communication between threads must be explicitly programmed
-    - Threads may execute on multiple processors simultaneously
-    - dangers: [[race conditions]] and [[deadlocks]]
+[[Concurrency]] (Control.Concurrent): Multiple threads of control that execute "at the same time".
-Pure Parallelism (Control.Parallel):
+* Threads are in the IO monad
-    Speeding up a pure computation using multiple processors.
+* IO operations from multiple threads are interleaved non-deterministically
-    - Pure parallelism has these advantages:
+* communication between threads must be explicitly programmed
-      - guaranteed deterministic (same result every time)
+* Threads may execute on multiple processors simultaneously
-      - no [[race conditions]] or [[deadlocks]]
+* Dangers: [http://en.wikipedia.org/wiki/Race_condition race conditions] and [http://en.wikipedia.org/wiki/Deadlock deadlocks]
 Rule of thumb: use Pure Parallelism if you can, Concurrency otherwise.
-=== Starting points ===
+== Starting points ==
-* '''Control.Parallel'''.  The first thing to start with parallel programming in Haskell is the use of par/pseq from the parallel library.
+* '''Control.Parallel'''.  The first thing to start with parallel programming in Haskell is the use of par/pseq from the parallel library.  Try the Real World Haskell [http://book.realworldhaskell.org/read/concurrent-and-multicore-programming.html chapter on parallelism and concurrency].  The parallelism-specific parts are in the second half of the chapter.
-* '''Nested Data Parallelism'''.  For an approach to exploiting the implicit parallelism in array programs for multiprocessors, see [[GHC/Data Parallel Haskell|Data Parallel Haskell]] (work in progress).
+* If you need more control, try Strategies or perhaps the Par monad
-=== Multicore GHC ===
+== Multicore GHC ==
 {{GHC/Multicore}}
-=== Alternative approaches ===
+== Alternative approaches ==
 * Nested data parallelism: a parallel programming model based on bulk data parallelism, in the form of the [http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell  DPH] and [http://hackage.haskell.org/package/repa Repa] libraries for transparently parallel arrays.
+* [https://hackage.haskell.org/package/monad-par monad-par] and [https://hackage.haskell.org/package/lvish LVish] provide Par monads that can structure parallel computations over "monotonic" data structures, which in turn can be used from within purely functional programs.
-* Intel [http://software.intel.com/en-us/blogs/2010/05/27/announcing-intel-concurrent-collections-for-haskell-01/ Concurrent Collections for Haskell]: a graph-oriented parallel programming model.
+* [OLD] Intel [http://software.intel.com/en-us/blogs/2010/05/27/announcing-intel-concurrent-collections-for-haskell-01/ Concurrent Collections for Haskell]: a graph-oriented parallel programming model.
-=== Related work ===
+== See also ==
-* [[Parallel]] portal
+* The [[Parallel|parallelism and concurrency portal]]
+* Parallel [[Parallel/Reading|reading list]]
 * [[Parallel/Research|Ongoing research in Parallel Haskell]]
-* The Sun project to improve http://ghcsparc.blogspot.com/ GHC performance on Sparc]
-* A [http://www.well-typed.com/blog/38 Microsoft project to improve industrial applications of GHC parallelism].
