[Haskell-cafe] Sequencing Operations in a Monad

Sat Sep 15 01:37:26 EDT 2007

SevenThunders wrote:
> I have a matrix library written in C and interfaced into Haskell
> with a lot of additional Haskell support.

[snip]

> Unfortunately if I wrap my matrix references in the IO monad, then
> at best computations like S = A + B are themselves IO computations
> and thus whenever they are 'invoked' the computation ends up getting
> performed repeatedly contrary to my intentions.

Here's some thoughts:

First, the IO monad already does sequencing, and it already has the
ability to execute an action once only.

Let's look at an example:

> test1 = do
>   putStr "What is your name? "
>   n <- getLine
>   putStrLn $ "Hello, " ++ n ++ "!"
>   return n

> getName :: IO String -> IO String
> getName nameAction = do
>   n <- nameAction        -- execute the action
>   return n

> getNameLength :: IO String -> IO Int
> getNameLength nameAction = do
>   n <- nameAction        -- execute the action
>   return $ length n

> test2 = do
>   let nameAction = test1 in do
>     n <- getName nameAction
>     putStrLn $ "Name = " ++ n
>     len <- getNameLength nameAction
>     putStrLn $ "Length = " ++ show len

> test3 = do
>   n <- test1
>   putStrLn $ "Name = " ++ n
>   putStrLn $ "Length = " ++ show (length n)

> test4 = do
>   let nameAction = test1 in do
>     n <- nameAction
>     n' <- getName (return n)
>     putStrLn $ "Name = " ++ n'
>     len <- getNameLength (return n)
>     putStrLn $ "Length = " ++ show len

GHCi> test1
What is your name? Ron
Hello, Ron!
"Ron"

GHCi> test2
What is your name? Alice
Hello, Alice!
Name = Alice
What is your name? Bob
Hello, Bob!
Length = 3

GHCi> test3
What is your name? Ron
Hello, Ron!
Name = Ron
Length = 3

GHCi> test4
What is your name? Ron
Hello, Ron!
Name = Ron
Length = 3

Notice that in test2, I am asked for my name twice.  This behavior is
expected because the functions "GetName" and "getNameLength" each
accept an action and execute it to get a name.

In test3, I am only asked for my name once.  I only want to execute
the action once, so I have to code it that way.

Before I explain test4, let's look at your example code:

> let S = A += B in
>   do
>       (r,c) <-  size S
>       k  <- matindex S

If S is being executed twice, then clearly S is an action.  Perhaps
the type of S is "IO MatrixIO" ?  If that's true, then presumably the
functions "size" and "matindex" have signatures:

 size     :: IO MatrixIO -> IO (Int, Int)
 matindex :: IO MatrixIO -> IO Int

Each function takes an IO action as its first argument, executes that
action, and then computes a result.

My two functions "getName" and "getNameLength" are similar to "size"
and "matindex": each function takes an IO action, executes the action,
and computes a result.

Now, look at test4.  That's how I can work around the behaviour of
"getName" and "getNameLength" while ensuring that I am only asked for
my name one time.  This works because "return" creates an IO action
that does nothing and simply returns its argument.

I could translate your example to the following:

> let S = A += B in
>   do
>       s <- S
>       (r,c) <-  size (return s)
>       k  <- matindex (return s)

This should only perform action S one time.

In fact, functions like "getNameLength" are poorly designed functions
because they fail on "Separation of concerns".  The "getNameLength"
function is doing two different things: (1) it executes an IO action
to get a name, and then (2) it computes and returns the name's length.
In test4, I am bypassing the execution of an IO action by passing the
non-action "return n" to the getNameLength function.

A simple design rule would be: A function should not take an IO action
as an input if that action is to executed exactly once and only once.

Let's move on to chained binary operations.

> If you arrange the types to try to do all the operations inside the
> IO monad you can't chain together more than 1 binary operation.

Using your example, suppose I want to compute S := Q * (A + B), but I
don't have a function that computes A + B.  Instead, what I have is a
function that computes A += B by modifying A in place.

If I want to compute S, and I don't care about preserving A, then I
would perform the following steps:
 A += B;  S := Q * A

If I do want to preserve A, then I need to copy it first.
 A' := copy A; A' += B; S := Q * A'

No matter what, I cannot escape the need to explicitly sequence the
operations.

In C++, I could play some very sophisticated games with templates and
operator overloading to coax the C++ compiler to accept an expression
with chained operations like "S = Q * (A + B)" and do the right
thing.  In Haskell, I'm pretty sure the corresponding techniques
involve using arrows.

If you don't want that level of sophistication, then you are best off
coding what you mean, as in:

 do
     -- compute S := Q * (A + B)
     C <- A + B
     S <- Q * C

Now, there's just one more thing [emphasis added].

> Moreover I manage my matrices on a stack in C, since it makes it
> easy to handle memory allocation and deallocation.  *The stack*
> *configuration tends to be highly fluid so there are always side*
> *effects going on.*  Right now my Matrix type wraps the index from the
> bottom of the Matrix stack into the IO monad.

This brings us back to sequencing.  Your "highly fluid" stack
manipulations in C are exactly the thing that's bug-prone and anathema
to functional programming.  Here's what I think you should do:

1. Write a bunch of safe wrappers, as Ryan has described.

2. Test them thoroughly, and also *prove* that they are in fact safe.

3. Write your application-specific code and test it.  Do your best to
  get your application-specific matrix calculations correct.

4. If the code is too slow then profile it.  Remember, 80% of the time
  is usually spent in 20% of the code.  *IF* (and only if) the matrix
  code happens to take up that 80% of the time, then proceed.

5. You can move your application-specific matrix calculations from
  Haskell to C and put them behind an FFI interface.  Then, working
  in C, you can optimize out all the matrix copying that takes place
  behind safe wrappers.  The calculations will run faster without the
  overhead of Haskell.

-- Ron