Personal tools

IO Semantics

From HaskellWiki

(Difference between revisions)
Jump to: navigation, search
m
 
(8 intermediate revisions by 2 users not shown)
Line 1: Line 1:
== Semantics of IO: A Continuation Approach ==
+
[[Category:Theoretical_foundations]]
  +
== Semantics of IO: A Free Approach ==
   
 
The following is inspired by [http://luqui.org/blog/archives/2008/03/29/io-monad-the-continuation-presentation/ Luke Palmer's post]. This only describes one possible semantics of <hask>IO a</hask>; your actually implementation may vary.
 
The following is inspired by [http://luqui.org/blog/archives/2008/03/29/io-monad-the-continuation-presentation/ Luke Palmer's post]. This only describes one possible semantics of <hask>IO a</hask>; your actually implementation may vary.
   
The idea to to define <hask>IO</hask> as
+
The idea is to define <hask>IO</hask> as
 
<haskell>
 
<haskell>
newtype IO a = IO {unIO :: (a -> IOTree) -> IOTree}
+
data IO a = Done a
  +
| PutChar Char (IO a)
  +
| GetChar (Char -> IO a)
 
</haskell>
 
</haskell>
This is equivalent to defining <hask>IO</hask> as <hask>Cont IOTree</hask> from the [[monad template library]]. The monad functions for <hask>IO</hask> are derived from the monad functions for <hask>Cont</hask>.
+
  +
For simplicity this an example of <hask>IO</hask> that only gives semantics for teletype IO.
  +
  +
Think of <hask>IO a</hask> as a tree whose leaves are <hask>Done a</hask> that holds the result of the program. <hask>PutChar</hask> is a node that has one child tree and the node holds one character of data. <hask>GetChar</hask> is a node that has many children; it has one child for every <hask>Char</hask>, but <hask>GetChar</hask> holds no data itself.
  +
  +
This tree contains all the information needed to execute teletype interactions.
  +
One interprets (or executes) an <hask>IO a</hask> by tracing a route from root of the tree to a leaf.
  +
  +
If a <hask>PutChar</hask> node is encountered, the character data contained at that node is output to the terminal and then its subtree is executed. It is at this point that Haskell code evaluated in order to determine what character should be displayed before continuing. If a <hask>GetChar</hask> node is encountered, a character is read from the terminal (blocking if necessary) and the subtree corresponding to the character received is executed. If <hask>Done</hask> is encountered the program ends. <hask>Done</hask> holds the result of the computation, but in the case of <hask>main :: IO ()</hask> the data is of type <hask>()</hask> and thus contains no information and is ignored.
  +
  +
This execution is not done anywhere in a haskell program, rather it is done by the run-time system.
  +
  +
The monadic operations are defined as follows:
  +
 
<haskell>
 
<haskell>
return x = IO (\k -> k x)
+
return :: a -> IO a
x >>= f = IO (\k -> unIO x (\a -> unIO (f a) (\b -> k b)))
+
return x = Done x
</haskell>
+
<hask>IOTree</hask> is the ultimate result of a program. For simplicity we will give an example of <hask>IOTree</hask> that gives semantics for teletype IO.
+
(>>=) :: IO a -> (a -> IO b) -> IO b
<haskell>
+
Done x >>= f = f x
data IOTree = Done
+
PutChar c x >>= f = PutChar c (x >>= f)
| PutChar Char IOTree
+
GetChar g >>= f = GetChar (\c -> g c >>= f)
| GetChar (Char -> IOTree)
 
 
</haskell>
 
</haskell>
(This is a tree because the <hask>GetChar</hask> node has one subtree for every character)
 
   
<hask>IOTree</hask> contains all the information needed to execute teletype interactions.
+
As you can see <hask>return</hask> is just another name for <hask>Done</hask>. The bind operation takes a tree <hask>x</hask> and a function <hask>f</hask> and replaces the <hask>Done</hask> nodes (the leaves) of <hask>x</hask> by a new tree produce by applying <hask>f</hask> to the data held in the <hask>Done</hask> nodes.
One interprets (or executes) an <hask>IOTree</hask> by tracing a route from root of the tree to a leaf.
 
   
If a <hask>PutChar</hask> node is encountered, the character data contained at that node is output to the terminal and then its subtree is executed. It is only at this point that Haskell code is ever necessarily evaluated in order to determine what character should be displayed before continuing. If a <hask>GetChar</hask> node is encountered, a character is read from the terminal (blocking if necessary) and the subtree corresponding to the character received is executed. If <hask>Done</hask> is encountered the program ends.
+
The primitive IO commands are defined using these constructors.
The primitive IO commands are defined using these constructors
 
 
<haskell>
 
<haskell>
 
putChar :: Char -> IO ()
 
putChar :: Char -> IO ()
putChar x = IO (\k -> PutChar x (k ()))
+
putChar x = PutChar x (Done ())
   
 
getChar :: IO Char
 
getChar :: IO Char
getChar = IO (\k -> GetChar (\x -> k x))
+
getChar = GetChar (\c -> Done c)
 
</haskell>
 
</haskell>
  +
  +
The function <hask>putChar</hask> builds a small <hask>IO ()</hask> tree that contains one <hask>PutChar</hask> node holding the character data followed by <hask>Done</hask>.
  +
  +
The function <hask>getChar</hask> builds a short <hask>IO Char</hask> tree that begins with a <hask>GetChar</hask> that holds one <hask>Done</hask> node holding every character.
  +
 
Other teletype commands can be defined in terms of these primitives
 
Other teletype commands can be defined in terms of these primitives
 
<haskell>
 
<haskell>
Line 34: Line 38:
 
putStr = mapM_ putChar
 
putStr = mapM_ putChar
 
</haskell>
 
</haskell>
More generally speaking, <hask>IOTree</hask> will represent the desired interaction with the operating system. For every system call there will be a corresponding constructor in <hask>IOTree</hask> of the form
 
<haskell>
 
| SysCallName p1 p2 ... pn (r -> IOTree)
 
</haskell>
 
where <hask>p1</hask> ... <hask>pn</hask> are the parameters for the system call, and <hask>r</hask> is the result of the system call. (Thus <hask>PutChar</hask> and <hask>GetChar</hask> will not occur as constructors of <hask>IOTree</hask> if they don't correspond to system calls)
 
   
We said that the ultimate result of a program is an <hask>IOTree</hask>, however the main function has type <hask>IO ()</hask>. This is isomorphic to <hask>(() -> IOTree) -> IOTree</hask>, or equivalently <hask>IOTree -> IOTree</hask> which is not right.
+
More generally speaking, <hask>IO a</hask> will represent the desired interaction with the operating system. For every system call there will be a corresponding constructor in <hask>IOTree</hask> of the form
 
The simple solution to this is that the runtime system produces an <hask>IOTree</hask> from main by evaluating <hask>unIO main (\() -> Done) :: IOTree</hask>. Here <hask>\() -> Done</hask> represents the "rest of the program", which in this case is nothing.
 
 
The sophisticated solution to this problem is that <hask>main</hask> is passed to the operating system which will bind the next program (perhaps a shell) to <hask>main</hask>. Thus the semantics of our Haskell program becomes embedded into the semantics of the entire operating system run.
 
 
The type for <hask>IO a</hask> that we have given contains invalid programs such as
 
 
<haskell>
 
<haskell>
IO (\k -> filterTree (not . isPutChar) k ()) :: IO ()
+
| SysCallName p1 p2 ... pn (r -> IO a)
 
</haskell>
 
</haskell>
which would remove the output of any future <hask>putChar</hask> commands. However, none of these illegal programs can be generated from the monadic interface and the primitive operations provided.
+
where <hask>p1</hask> ... <hask>pn</hask> are the parameters for the system call, and <hask>r</hask> is the result of the system call. (Thus <hask>PutChar</hask> and <hask>GetChar</hask> will not occur as constructors of <hask>IOTree</hask> if they don't correspond to system calls)

Latest revision as of 21:57, 8 May 2010

[edit] Semantics of IO: A Free Approach

The following is inspired by Luke Palmer's post. This only describes one possible semantics of
IO a
; your actually implementation may vary. The idea is to define
IO
as
data IO a = Done a
          | PutChar Char (IO a)
          | GetChar (Char -> IO a)
For simplicity this an example of
IO
that only gives semantics for teletype IO. Think of
IO a
as a tree whose leaves are
Done a
that holds the result of the program.
PutChar
is a node that has one child tree and the node holds one character of data.
GetChar
is a node that has many children; it has one child for every
Char
, but
GetChar
holds no data itself.

This tree contains all the information needed to execute teletype interactions.

One interprets (or executes) an
IO a
by tracing a route from root of the tree to a leaf. If a
PutChar
node is encountered, the character data contained at that node is output to the terminal and then its subtree is executed. It is at this point that Haskell code evaluated in order to determine what character should be displayed before continuing. If a
GetChar
node is encountered, a character is read from the terminal (blocking if necessary) and the subtree corresponding to the character received is executed. If
Done
is encountered the program ends.
Done
holds the result of the computation, but in the case of
main :: IO ()
the data is of type
()
and thus contains no information and is ignored.

This execution is not done anywhere in a haskell program, rather it is done by the run-time system.

The monadic operations are defined as follows:

return :: a -> IO a
return x = Done x
 
(>>=) :: IO a -> (a -> IO b) -> IO b
Done x >>= f = f x
PutChar c x >>= f = PutChar c (x >>= f)
GetChar g >>= f = GetChar (\c -> g c >>= f)
As you can see
return
is just another name for
Done
. The bind operation takes a tree
x
and a function
f
and replaces the
Done
nodes (the leaves) of
x
by a new tree produce by applying
f
to the data held in the
Done
nodes.

The primitive IO commands are defined using these constructors.

putChar :: Char -> IO ()
putChar x = PutChar x (Done ())
 
getChar :: IO Char
getChar = GetChar (\c -> Done c)
The function
putChar
builds a small
IO ()
tree that contains one
PutChar
node holding the character data followed by
Done
. The function
getChar
builds a short
IO Char
tree that begins with a
GetChar
that holds one
Done
node holding every character.

Other teletype commands can be defined in terms of these primitives

putStr :: String -> IO ()
putStr = mapM_ putChar
More generally speaking,
IO a
will represent the desired interaction with the operating system. For every system call there will be a corresponding constructor in
IOTree
of the form
	| SysCallName p1 p2 ... pn (r -> IO a)
where
p1
...
pn
are the parameters for the system call, and
r
is the result of the system call. (Thus
PutChar
and
GetChar
will not occur as constructors of
IOTree
if they don't correspond to system calls)