Difference between revisions of "AtomicMemoryOps"

From HaskellWiki
Jump to navigation Jump to search
(Add new section on Atomic Memory Ops)
 
m (typos)
 
Line 12: Line 12:
 
When you begin using '''atomic-primops''', you will find that operations such as fetch-and-add within a byte-array, or a <hask>storeLoadBarrier</hask>, do exactly the same thing as they would in [http://gcc.gnu.org/onlinedocs/gcc-4.4.3/gcc/Atomic-Builtins.html C] or [http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/package-summary.html Java] or [http://llvm.org/docs/Atomics.html LLVM].
 
When you begin using '''atomic-primops''', you will find that operations such as fetch-and-add within a byte-array, or a <hask>storeLoadBarrier</hask>, do exactly the same thing as they would in [http://gcc.gnu.org/onlinedocs/gcc-4.4.3/gcc/Atomic-Builtins.html C] or [http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/package-summary.html Java] or [http://llvm.org/docs/Atomics.html LLVM].
   
Compare-and-swap, on the other hand, is a delicate business when applied to arbitrary Haskell values. Compare-and-swap on boxed values (pointers) depends on a stable notion of pointer equality, which exists in languages like Java, ML, or Scheme, but which doesn't exist in Haskell. In fact, the GHC implementation ''does'' provide a primop to test for pointer equality, but it's name tells the story: [http://www.haskell.org/ghc/docs/7.6.3/html/libraries/ghc-prim-0.3.0.0/GHC-Prim.html#g:20 reallyUnsafePtrEquality#].
+
Compare-and-swap, on the other hand, is a delicate business when applied to arbitrary Haskell values. Compare-and-swap on boxed values (pointers) depends on a stable notion of pointer equality, which exists in languages like Java, ML, or Scheme, but which doesn't exist in Haskell. In fact, the GHC implementation ''does'' provide a primop to test for pointer equality, but its name tells the story: [http://www.haskell.org/ghc/docs/7.6.3/html/libraries/ghc-prim-0.3.0.0/GHC-Prim.html#g:20 reallyUnsafePtrEquality#].
   
 
Haskell preserves referential transparency, and both Haskell programmers and the GHC compiler exercise equational reasoning that requires replacing terms with equal terms, using a notion of equality that does ''not'' respect pointer equality. Here are some things that GHC might do which can change the pointer that you ''think'' you have:
 
Haskell preserves referential transparency, and both Haskell programmers and the GHC compiler exercise equational reasoning that requires replacing terms with equal terms, using a notion of equality that does ''not'' respect pointer equality. Here are some things that GHC might do which can change the pointer that you ''think'' you have:
Line 18: Line 18:
 
* Unbox and rebox an <hask>Int</hask> value.
 
* Unbox and rebox an <hask>Int</hask> value.
 
* Wrap extra thunks around your values in various places, e.g. for profiling.
 
* Wrap extra thunks around your values in various places, e.g. for profiling.
* Duplicate immutable objects on the heap during garbage collection.
+
* Duplicate immutable objects in the heap during garbage collection.
* Substitute a different variable for the one you thought you used even though at the source level they have different types, and one is produced by an <hask>unsafeCoerce</hask>. (For example, see [https://github.com/rrnewton/haskell-lockfree/issues/5 this bug].)
+
* Substitute a different variable for the one you thought you used even though at the source level they have different types, and one is produced by an <hask>unsafeCoerce</hask> of the other. (For example, see [https://github.com/rrnewton/haskell-lockfree/issues/5 this bug].)
   
 
More details about these problems can be found in a Haskell Implementor's workshop [http://www.youtube.com/watch?v=xiPrpWQN7m0 talk from 2012].
 
More details about these problems can be found in a Haskell Implementor's workshop [http://www.youtube.com/watch?v=xiPrpWQN7m0 talk from 2012].
Line 25: Line 25:
 
== One Solution: CAS with abstract tickets ==
 
== One Solution: CAS with abstract tickets ==
   
Normally, CAS would take an old and a new value (of type <hask>a</hask>), and if the mutable reference is still set to the old value, it is modified to point at the new one. The problem with that approach in Haskell is that it can be very hard to provide the old pointer with any reliability, due to the issues cited above.
+
Normally, <code>CAS(ref,old,new)</code> takes an old and a new value (of any type), and if the mutable reference is still set to the old value, it is modified to point at the new one. The problem with that approach in Haskell is that it can be very hard to provide the old pointer with any reliability, due to the issues cited above.
   
 
In Haskell, we want a CAS which operationally does the same thing, but we need a different interface. In particular, a CAS operation exposed by the [http://hackage.haskell.org/package/atomic-primops/docs/Data-Atomics.html Data.Atomics] module looks like this:
 
In Haskell, we want a CAS which operationally does the same thing, but we need a different interface. In particular, a CAS operation exposed by the [http://hackage.haskell.org/package/atomic-primops/docs/Data-Atomics.html Data.Atomics] module looks like this:
Line 35: Line 35:
   
   
The <hask>old</hask> argument is of type <hask>Ticket a</hask> rather than <hask>a</hask>. The <hask>Data.Atomics</hask> library takes care of making sure the Ticket refers to something that the GHC compiler cannot interfere with. (In GHC 7.6 it uses the <hask>Any</hask> kind.) This involves a delicate balance of type coercions, NOINLINE pragmas, and so on, but the user shouldn't need to worry about it. A Ticket is a proper value that can be stored in data structures, passed around, and be safely relied upon to provide a snapshot of the previous state of the location when it comes time to modify it again.
+
The <hask>old</hask> argument is of type <hask>Ticket a</hask> rather than <hask>a</hask>. The <hask>Data.Atomics</hask> library takes care of making sure the Ticket refers to something that the GHC compiler cannot interfere with. (In GHC 7.6 it uses the <hask>Any</hask> kind.) This involves a delicate balance of type coercions, NOINLINE pragmas, and so on, but the user shouldn't need to worry about it. A Ticket is a proper value that can be stored in data structures, passed around, and be safely relied upon to provide a snapshot of the previous state of the location when it comes time to modify it again. The Ticket returned from one successful CAS operation can be used in the next.
   
 
== Implementation notes ==
 
== Implementation notes ==
   
The original version of atomic-primops implemented all the relevant compiler
+
The original version of atomic-primops implemented all the relevant compiler primops as ''foreign'' primops. This also necessitated duplicating some C code from the GHC runtime system.
primops as ''foreign'' primops. This also necessitated duplicating some C code
 
from the GHC runtime system.
 
   
This is not ideal, and [https://ghc.haskell.org/trac/ghc/wiki/Status/GHC-7.8 in GHC 7.8] the primops in question were added directly to the compiler. But more may be needed in the future. Further, for now the primops remain [https://ghc.haskell.org/trac/ghc/wiki/Commentary/PrimOps out-of-line rather than inline-primops]. This is because the GHC native code generator does not yet know how to produce the relevant instructions on all supported architectures, rather, this is done in C code.
+
This is not ideal, and [https://ghc.haskell.org/trac/ghc/wiki/Status/GHC-7.8 in GHC 7.8] the primops in question were added directly to the compiler. But more may be needed in the future. Also, a remaining drawback is that all atomic primops remain [https://ghc.haskell.org/trac/ghc/wiki/Commentary/PrimOps out-of-line rather than inline-primops]. This is because the GHC native code generator does not yet know how to produce the relevant instructions on all supported architectures, rather, this is done in C code.
   
 
'''Future work''' may include adding direct support for these primops in the
 
'''Future work''' may include adding direct support for these primops in the

Latest revision as of 18:26, 21 November 2013

Purpose

Modern multicore processors provide a various atomic memory operations. These are instructions that work (i.e. are linearizable) even when used concurrently by multiple threads on the same memory address. They include compare-and-swap, atomic read-modify-write operations such as fetch-and-add, and others.

The main motivation for exposing these operations is to implement lock-free data structures in Haskell, and to perform efficient concurrent operations on existing data structures like IORef and Vector.

To get started quickly using these operations in Haskell, grab the atomic-primops package from Hackage. This is a low-level library that provides a thin layer over the basic operations: GHC primops or foreign primops. It does, as of version 0.4, provide some useful operations for end-users, including CAS on IORefs, and in the future we hope to add more functionality for modifying popular data structures such as Vectors.

Problems with CAS in Haskell

When you begin using atomic-primops, you will find that operations such as fetch-and-add within a byte-array, or a storeLoadBarrier, do exactly the same thing as they would in C or Java or LLVM.

Compare-and-swap, on the other hand, is a delicate business when applied to arbitrary Haskell values. Compare-and-swap on boxed values (pointers) depends on a stable notion of pointer equality, which exists in languages like Java, ML, or Scheme, but which doesn't exist in Haskell. In fact, the GHC implementation does provide a primop to test for pointer equality, but its name tells the story: reallyUnsafePtrEquality#.

Haskell preserves referential transparency, and both Haskell programmers and the GHC compiler exercise equational reasoning that requires replacing terms with equal terms, using a notion of equality that does not respect pointer equality. Here are some things that GHC might do which can change the pointer that you think you have:

  • Unbox and rebox an Int value.
  • Wrap extra thunks around your values in various places, e.g. for profiling.
  • Duplicate immutable objects in the heap during garbage collection.
  • Substitute a different variable for the one you thought you used even though at the source level they have different types, and one is produced by an unsafeCoerce of the other. (For example, see this bug.)

More details about these problems can be found in a Haskell Implementor's workshop talk from 2012.

One Solution: CAS with abstract tickets

Normally, CAS(ref,old,new) takes an old and a new value (of any type), and if the mutable reference is still set to the old value, it is modified to point at the new one. The problem with that approach in Haskell is that it can be very hard to provide the old pointer with any reliability, due to the issues cited above.

In Haskell, we want a CAS which operationally does the same thing, but we need a different interface. In particular, a CAS operation exposed by the Data.Atomics module looks like this:

casIORef :: IORef a -> Ticket a -> a -> IO (Bool, Ticket a) casIORef ref old new = ...


The old argument is of type Ticket a rather than a. The Data.Atomics library takes care of making sure the Ticket refers to something that the GHC compiler cannot interfere with. (In GHC 7.6 it uses the Any kind.) This involves a delicate balance of type coercions, NOINLINE pragmas, and so on, but the user shouldn't need to worry about it. A Ticket is a proper value that can be stored in data structures, passed around, and be safely relied upon to provide a snapshot of the previous state of the location when it comes time to modify it again. The Ticket returned from one successful CAS operation can be used in the next.

Implementation notes

The original version of atomic-primops implemented all the relevant compiler primops as foreign primops. This also necessitated duplicating some C code from the GHC runtime system.

This is not ideal, and in GHC 7.8 the primops in question were added directly to the compiler. But more may be needed in the future. Also, a remaining drawback is that all atomic primops remain out-of-line rather than inline-primops. This is because the GHC native code generator does not yet know how to produce the relevant instructions on all supported architectures, rather, this is done in C code.

Future work may include adding direct support for these primops in the LLVM backend, which supports all the operations in a straightforward way, and would not have to suffer the overhead of calling into C code.