Revision as of 17:07, 9 February 2007

throwTo & block statements fixed in GHC 6.6.1

The problem which used to be described here has been fixed in GHC HEAD and this will be included in GHC version 6.6.1

I have built (Feb 9, 2007) the HEAD version of GHC and this problem is no longer present.

See the ticket at [1] and the comments near the top of [2].

The description which was here has been deleted, but is still available in the historical views of this page.

Difference between revisions of "GHC/Concurrency/Flaws"

Revision as of 17:07, 9 February 2007

throwTo & block statements fixed in GHC 6.6.1

Navigation menu

Search

@@ Line 1: / Line 1: @@
+[[Category:GHC|Concurrency]]
-= "throwTo & block statements considered harmful" =
+= throwTo & block statements fixed in GHC 6.6.1 =
-The following description is by ChrisKuklewicz.
+The problem which used to be described here has been fixed in GHC HEAD and this will be included in GHC version 6.6.1
-This is a short essay to prove that the current GHC concurrency
-implementation has a critical flaw.
+I have built (Feb 9, 2007) the HEAD version of GHC and this problem is no longer present.
-The key problem is, at least in the presence of block/unblock, that
-Exceptions are never reliably delivered.  And this is not a
-theoretical case, but can cause a hang in something as innocuous as
-"program A" given below.
+See the ticket at [http://hackage.haskell.org/trac/ghc/ticket/1047] and the comments near the top of [http://darcs.haskell.org/ghc/rts/Exception.cmm].
-Simon Marlow wrote:
-<pre>
->> I think people are misunderstanding the nature of a safepoint. The
->> safepoint is a point at which you are prepared to have exceptions
->> delivered. This does not mean that they *will* be delivered, just that they
->> can. If you need to *wait* for an asynchronous exception, then you
->> shouldn't be using them at all.
->
-> Right.  If a thread mostly runs inside 'block' with the occasional safe
-> point, then your exceptions are not really asynchronous, they're synchronous.
->
->
-> In this case, I'd say a better solution is to have an explicit event queue,
-> and instead of the safe point take an event from the queue.  The action on
-> receiving an event can be to raise an exception, if necessary.
->
-> Cheers, Simon
-</pre>
+The description which was here has been deleted, but is still available in the historical views of this page.
-The implementation of asynchronous signals, as described by the paper
-"Asynchronous exceptions in Haskell by Simon Marlow, Simon Peyton Jones, Andy Moran and John Reppy, PLDI'01."
-is fatally inconsistent with the implementation in GHC 6.4 and GHC 6.6 today.
-The implemented semantics have strictly weaker guarantees and render programs
-using asynchronous expressions impossible to write correctly.  The semantics in
-the paper were carefully designed to solve the problem laid out in the first
-sentence of the abstract:
-"Asynchronous exceptions, such as timeouts, are important for robust, modular
-programs, but are extremely difficult to program with -- so much so that most
-programming languages either heavily restrict them or ban them altogether."
-And I believe the paper succeeded.  The paper shows how to replace other
-languages pervasive and intrusive error catching and handling code with much
-cleaner, clearer, and often more correct code.
-The implementation in GHC has changed the behavior of throwTo from asynchronous
-(not-interruptible) to synchronous (interruptible?) as discussed in section 8 of
-the paper.  This change, in and of itself, is not the fatal problem; as
-described in the paper a (forkIO (throwTo ...)) recovers the asynchronous behavior.
-The fatal change between the paper and GHC comes from not following section 7.2
-as published.  Section "7.2 Implementation of throwTo" has two bullet point, and
-the second bullet point is (retyped, so typos are my own fault):
-"As soon as a thread exits the scope of a 'block', and at regular intervals
-during execution inside 'unblock', it should check its queue of pending
-exceptions.  If the queue is non-empty, the first exception from the queue
-should be raised."
-A test of GHC 6.6 shows that this is not the case.  Test program A:
-<haskell>
-> loop = block (print "alive") >> loop
->
-> main = do tid <- forkIO loop
->           threadDelay 1
->           killThread tid
-</haskell>
-Program A, compiled with (-threaded) on a single CPU machine never halts.  It
-will print "alive" forever while the the main thread is blocked on "killThread".
-As an aside, removing the threadDelay causes killThread to destroy the child
-before it can enter the block, thus showing the need to add "forkBlockedIO" or
-"forkInheritIO" to the library.  This can be worked around using an MVar.
-Changing the definition of loop produces Test program B:
-<haskell>
-> loop = block (print "alive") >> loop >> yield
->
-> main = do tid <- forkIO loop
->           threadDelay 1
->           killThread tid
-</haskell>
-This prints "alive" twice before the killThread succeeds.
-The paper demands that when the loop in Program A exits the scope of "block
-(print a)" that it check a queue of pending exceptions, see that it is
-non-empty, and raise the exception thrown by killThread.  This can also be seen
-in "Figure 5. Transition Rules for Asynchronous Exceptions", where killThread
-should use throwTo to create an in-flight exception and exiting the scope of
-block in the presence of this in-flight exception should raise the exception.
-The implementation in GHC sleeps the main thread at the killThread command, and
-it is awoken when the block is exited and to succeed in delivering the
-exception it must execute while the child is still in the unblocked state.  But
-the child re-enters a blocked state too quickly in Program A, so killThread
-never succeeds.  The change in Program B has the child "yield" when unblocked
-and this gives the main thread a change to succeed.
-This trick using yield to make a safepoint was suggested by Simon Marlow:
-<pre>
-> The window in 'unblock (return ())' is tiny, I'm not really surprised if
-> nothing ever gets through it.  You might have more luck with 'unblock yield'.
-</pre>
-It has been said on this mailing list thread that needing "yield" to
-program concurrently is a bug in the user's code and should be
-replaced by other mechanisms.  In a complex program with many threads
-even the yield trick may not be enough to program reliably.  Using
-<haskell>
-blockY io = do x <- block io
-               yield
-               return x
-unblockY io = unblock (yield >> io)
-</haskell>
-instead of 'block' and 'unblock' is at least a simple but unreliable workaround.
-This would not be a fatal flaw if there were a simple reliable
-workaround.  The best workaround is the following, which is anything
-but simple, and amount to re-implementing the mechanisms in the paper:
-  * When forking a thread:
-  ** Create an (Chan Exception) (or (TChan Exception)) and pass it to
-     the child thread.  Call this the queue.
-  ** Create a ((T)MVar queue) to indicate whether the thread is still
-     alive or not and wrap the whole child action in a 'finally' to
-     ensure this is set to the dead state when the child exits.
-     (Alive iff the queue is in the MVar)
-  ** To be sure the child thread is in the handler, use another MVar
-     or re-use the "alive/dead" MVar to ensure the child is running
-     before returning the ThreadId from the fork operation.
-  * Replace all uses throwTo/killThread by
-  ** Test the alive/dead state of the thread, proceeding only on
-     living threads
-  ** Putting the Exception in the queue
-  ** Call throwTo/killThread with the Exception
-  * The child thread must:
-  ** implement safepoints by checking the queue
-  ** At each 'block' to 'unblock' transition the child ought to add a
-     safepoint
-This machinery closely recovers the queue of exceptions and the
-semantics described in the paper.  When a thread has died the queue is
-removed from the alive/dead MVar and can be garbage collected, along
-with any pending exceptions for the dead thread.
-To ensure the above semantics are not violated by the caller, the now
-fork operation should not return an exposed ThreadId, but should
-return some proxy type or operation to ensure only the proper
-throwto/killThread are peformed.
-Next I will demonstrate how the paper's achievement of simple and
-clear error handling code is lost.
-The child thread is where the workaround becomes a nightmare.  It is
-impossible to test the current 'block' versus 'unblock' state.  So the
-only time the child is certain that a 'block' to 'unblock' transition
-has occurred is when it is explicit:
-<haskell>
-> unblock ( .... block ( ... ) <*> .... )
-</haskell>
-The <*> is obviously such a transition and ought to have a safepoint added.
-But is only because the block had the explicit unblock around it.  If
-there were a function such as
-<haskell>
-> atomicUpdateFoo = block (...) <?>
-</haskell>
-then there is no way to know if the <?> should be a safepoint.
-Therefore atomicUpdateFoo needs documentation so that the caller knows
-to add a safepoint in the case it is called from the unblocked case.
-In a more complex example:
-<haskell>
-> atomicUpdateFooBar = block (...) <1?> block (...) <2?>
-</haskell>
-the caller cannot add <1?> itself, and must therefore pass in code
-'safe :: IO ()' to all such functions:
-<haskell>
-> atomicUpdateFooBar safe = block (...) >> trans >> block (...) >>
-</haskell>
-trans where 'trans' is "return ()" or "checkChan queue" depending on
-whether the the caller knows it is in a 'block' or 'unblock' state.
-If the parent never needed to use 'throwTo'/'killThread' then the
-child never needs 'block'/'unblock' and the nightmare is avoided.  But
-if the child uses any operation which can block such as 'accept' or
-'getChar' or 'takeMVar' then the parent cannot deliver the signal
-without waking up the child with a 'throwTo' which quickly leads to
-either an overwhelming use of 'catch' in the child (like other
-languages) or use of 'block' and the fragile workaround I just
-described.
-As an aside, if there was a "isBlockedThreadState" operation, then it
-would be possible to create a better client workaround:
-<haskell>
-checkChan queue x = do
-  isBlocked <- isBlockedThreadState -- perhaps needing =<< myThreadId
-  case isBlocked of
-    True -> return x
-    False -> raiseQueue queue x
-raiseQueue queue x = do
-  hasException <- isEmptyChan queue
-  case hasException of
-    False -> return x
-    True -> readChan >>= raise
-checkedBlock queue io = do x <- block io
-                           checkChan queue x
-checkedUnblock queue io = checkChan queue >> io
-</haskell>
-If that existed then the client should use checkedBlock and
-checkedUnblock instead of block and unblock.  The net effect of this
-workaround is to implement the queue described in the original paper
-and wrap the forkIO/throwTo/block/unblock operations with new ones
-that implement the semantics described in the section 7.2 of the
-paper.
-An ugly side effect of the above is that the modified throwTo also
-sent the Exception via throwTo as well as the (T)Chan.  So it is
-possible that Exception could be raised twice depending on how the
-child is written.  And the order of exceptions in the channel may not
-be the order that the throwTo's deliver the asynchronous exceptions.
-To alleviate this somewhat you could have the modified throwTo put the
-real exception on the (T)Chan and send a GoLookInTheChan asynchronous
-exception.  Reversing this might be possible, but it is not clear how.
-If we could deterministicly wait for a pending asynchronous signal
-then we could write a proper safe point to fix "block to unblock"
-transitions, which we know we cannot do.
-We can try to implement isBlockedThread state by maintaining a stack
-of block/unblock operations:
-<haskell>
-checkedBlock stack queue io = do
- x <- bracket_ (modifyIORef stack (True:))
-               (modifyIORef stack tail)
-               (io)
- checkChan queue x
-checkedUnblock stack queue io = do
- checkChan queue ()
- x <- bracket_ (modifyIORef stack (False:))
-               (modifyIORef stack tail)
-               (io)
-isBlockedThread stack = liftM head (readIORef stack)
-</haskell>
-Where "stack :: IORef [Bool]" is created and passed to the child by
-the modified fork operation.  The 'stack' above grows without bound
-during recursion and is not tail recursive, so the stack management
-should be changed to the algorithm in the paper... in which case one
-really has recreated the paper's implementation inside Haskell.
-The only alternatives are to
-  * Write code that may hang forever at a throwTo/killThread
-  * Never use throwTo or killThread making block/unblock irrelevant
-  * Never use 'block' (and 'unblock')
-  * Only write children with explicit "(unblock ... block (... block
-    () ...) >>= raiseQueue queue >>= ...." where the different between
-    checked and unchecked block is obvious and explicit.
-  * Track and propagate the blocked state as in atomicUpdateFooBar.
-This problem with throwTo/block was not obvious until I stated writing
-small programs to test the implementation in GHC and thus discovered
-that the paper's description had mislead me.
-The key problem is, at least in the presence of block/unblock, that
-Exceptions are never reliably delivered.  And this is not a
-theoretical case, but can cause a hang in something as innocuous as
-"program A" given above.