ST and IO. Two problems. Some old some new?

Amr Sabry sabry@cs.indiana.edu
Thu, 07 Feb 2002 06:14:12 -0500


Background. We have two notions of state (ST and IO) with two
mediating operations:

unsafeIOtoST :: IO a -> ST s a
stToIO :: ST s a -> IO a

unsafeIOtoST is documented as unsafe because exceptions that would
have been caught in the IO monad are not caught in the ST monad but
this is irrelevant to the rest of the message. There are instead two
other problems that I'd like to discuss.

---------------------------------------------------------------------------
1. Sequencing. Consider:

m0 f = do x <- newIORef 10
          y <- return (f x)
          v <- readIORef x
          print v
                               
m1 f = do x <- newIORef 10
          y <- return (f x)
          v <- seq y $ readIORef x
          print v
                               
f1 x = runST (do v <- unsafeIOtoST $ readIORef x
                 unsafeIOtoST $ writeIORef x (v+v))

f2 x = runST (do v <- unsafeIOtoST $ readIORef x
                 seq v $ unsafeIOtoST $ writeIORef x (v+v))

f3 x = runST (unsafeIOtoST (do v <- readIORef x
                               writeIORef x (v+v)))
      
On Hugs (Dec. 2001): 
m0 f1 = 10
m0 f2 = 10
m0 f3 = 10 
m1 f1 = segmentation fault 
m1 f2 = 20
m1 f3 = 20

Is it possible to combine ST-state and IO-state in a reasonable way
without hacking with evaluation order?

---------------------------------------------------------------------------
2. Soundness of stToIO

A year ago there was a discussion about the soundness of
stToIO. Intuitively in the type of stToIO, the type variable s that is
critical for enforcing soundness of encapsulation is ignored. This
enables one to encapsulate regions even if they export local locations
as follows:

data Var a = Var { get :: () -> IO a, 
                   set :: a -> IO ()}

nloc :: Int -> Var Int
nloc a = 
  runST 
    (do x <- newSTRef a
        return 
          (Var {get = 
                  \() -> stToIO $ readSTRef x, 
                set = 
                  \a -> stToIO $ writeSTRef x a}))

Instead of returning the location directly (which would not
typecheck), one can instead return a pair of a getter and setter that
enable reading and writing of the location. 

Going one step further, we can encapsulate the operations for reading
and writing the locations and then write gensym of type () -> Int!

rloc :: Var Int -> Int
rloc (Var {get,set}) = 
  runST (do v <- unsafeIOtoST $ get () 
            seq v $ return v)

wloc :: Var Int -> Int -> ()
wloc (Var {get,set}) a = 
  runST (do s <- unsafeIOtoST $ set a
            seq s $ return s)

gensym :: () -> Int
gensym = let x = nloc 0
         in \() -> let y = rloc x
                       z = wloc x (y+1)
                   in seq y $ seq z $ y

test = (gensym(), gensym(), gensym()) 
-- evalutes to (0,1,2)

It is clear that this whole thing should be completely rethought. I am
working on a proposal but thought I'd get some comments from the
community. Thanks. --Amr