[GHC] #7815: STM fails to validate read.

GHC cvs-ghc at haskell.org
Fri Apr 5 19:14:10 CEST 2013


#7815: STM fails to validate read.
----------------------------------------+-----------------------------------
Reporter:  fryguybob                    |          Owner:                
    Type:  bug                          |         Status:  new           
Priority:  normal                       |      Component:  Runtime System
 Version:  7.7                          |       Keywords:  STM           
      Os:  Unknown/Multiple             |   Architecture:  x86           
 Failure:  Incorrect result at runtime  |      Blockedby:                
Blocking:                               |        Related:                
----------------------------------------+-----------------------------------
 This issue was brought up by napping in {{{#haskell}}} with this paste:

 [http://hpaste.org/85134] (the first paste in particular)

 The code is:

 {{{
 import Control.Concurrent.STM
 import Control.Concurrent
 import Control.Monad

 main = do
   dog <- newTVarIO False
   cat <- newTVarIO False
   let unset = do
         d <- readTVar dog
         c <- readTVar cat
         if (d || c) then retry else return ()
       setDog = unset >> writeTVar dog True
       setCat = unset >> writeTVar cat True
       oops = do
         d <- readTVar dog
         c <- readTVar cat
         guard (d && c)
       reset = do
         writeTVar dog False
         writeTVar cat False
       reset' = do
         d <- readTVar dog
         c <- readTVar cat
         guard (d || c)
         reset

   forkIO (atomically oops >> putStrLn "Oh Noes!")
   forever (do
     forkIO (atomically setDog)
     forkIO (atomically setCat)
     atomically reset'
     atomically reset')
 }}}

 When run it produces:

 {{{
 $ ghc --make test.hs -threaded -rtsopts
 $ ./test +RTS -N2
 Oh Noes!
 test: thread blocked indefinitely in an STM transaction
 }}}

 The second message is just a consequence of entering an unexpected state.
 The first message indicates that both the transactions {{{cat}}} and
 {{{dog}}} committed at the same time.

 It does this for HEAD and 7.6.

 I've sketched out an interleaving that leads to this.  TRec entries
 are in the first and third column and TVar's are in the second column.
 Each entry has a TVar name and the expected value followed
 by the new value and then a number of updates if it has been read.  TVars
 list their value and their number of updates.

 {{{
   A TRec       TVar         B TRec
                                      -- Transactions start
 cat F F      cat F 0      cat F F    -- Initial reads.
 dog F F      dog F 0      dog F F

 cat F T                   dog F T    -- Local writes in TRec's

                                      -- Validation:

                           cat F F 0  -- B reads num_updates from cat (with
                                   ^  --   consistency check with value)
 cat F T    cat A 0                |  -- A acquires lock for cat (atomic
 cas)
 dog F F 0        ^                |  -- A reads num_updates from dog (with
         ^        |                |  --    consistency check with value)
         |      dog B 0    dog F T |  -- B acquires lock for dog (atomic
 cas)
         |        |   ^            |
         |        |   |            |
 Success 0        |   0            |  -- read check for A
 Success          0                0  -- read check for B

            cat A 1                   -- Increment version
            cat T 1                   -- Unlock with new value
                dog B 1               -- Increment version
                dog B T               -- Unlock with new value
 }}}

 What is clear here is that the version number is not enough to check
 in {{{check_read_only}}} because there is a gap between locking and
 incrementing the version.  We need to know atomically that the TVar is not
 locked and it's version number is the same.

 I need to read through the right parts of Keir Fraser's thesis carefully,
 but it seems like the read phase here is only helpful in preventing a
 commit that writes back the exact value we have already seen while we are
 in the middle of committing.

 The code for {{{check_read_only}}} is here:

 {{{
 #!c
 static StgBool check_read_only(StgTRecHeader *trec STG_UNUSED) {
   StgBool result = TRUE;

   ASSERT (config_use_read_phase);
   IF_STM_FG_LOCKS({
     FOR_EACH_ENTRY(trec, e, {
       StgTVar *s;
       s = e -> tvar;
       if (entry_is_read_only(e)) {
         TRACE("%p : check_read_only for TVar %p, saw %ld", trec, s, e ->
 num_updates);
         if (s -> num_updates != e -> num_updates) {
           // ||s -> current_value != e -> expected_value) {
           TRACE("%p : mismatch", trec);
           result = FALSE;
           BREAK_FOR_EACH;
         }
       }
     });
   });

   return result;
 }
 }}}

 If I restore the commented out line (which appears commented out in the
 first commit of this code that I can find) I can't reproduce the issue,
 but I think there is still a problem due to the ordering of
 those checks: we could observe the version as the same, while it
 is locked, have the TVar unlock, then observe the value the same.

 Switching the order we can only observe the TVar unlocked if the
 update has been incremented (as long as we are on an architecture
 that ensures this such as x86).

 Does this seem right?  One issue is that given the interleaving that above
 with this added check ''both'' transactions could fail to commit.  I think
 the algorithms from Fraser avoid this, but I think it always involves
 invalidating another transaction (i.e. killing off any other transactions
 observed to be in the read check phase with an overlapping TVar in a way
 that results in only one winner (see page 21 section 4.4 in ''Concurrent
 Programming Without Locks'')).

 As a side note, the issue can be avoided by ensuring that the reads become
 writes and avoiding the read only check.  But you can only
 become a write if the values do not have matching pointers, switching to
 Ints instead of Bools gets you there.

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/7815>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler



More information about the ghc-tickets mailing list