[Haskell-cafe] Lazy Lists and IO - Redux

Fri Jul 13 14:38:06 EDT 2007

[Ronald Guida, 07/11/07]
 > Suppose I have a function "f" that reads a lazy list, such that "f"
 > only consumes as much of the list as it needs to.  Laziness allows me
 > to apply "f" to an infinite list without creating an infinite loop.
 >
 > Now I want to connect the console to "f", such that the list of inputs
 > to "f" comes from the console, one item at a time.
 >

How do I do this?

[Stefan O'Rear]
 > Not very nicely.

Apparently, the solution gets ugly.

[Stefan O'Rear]
 > Option 1. Ignore purity [using unsafeInterleaveIO]
 > Option 2. Ignore lists

[Felipe Almeida Lessa]
 > Option 3. Use continuations

I would like to understand *why* it gets ugly, and I think I figured it
out.

[Ronald Guida, 07/11/07]
 > To create a specific example, lets suppose I want to read and
 > accumulate the contents of a list, and I want to stop when the sum
 > reaches or exceeds a specified cutoff.
 >
 > I can write this as a function that reads elements from an infinite list:
 >
 > [Snipped: Ronald Guida's newb implementation of accumUntilCutoff]

[Stefan O'Rear, 07/11/07, Improved implementation]
 >
 > accumUntilCutoff :: (Ord a, Num a) => a -> [a] -> (a, [a])
 > accumUntilCutoff cutoff xs =
 >     findAcceptLast ((>= cutoff) . fst) (zip (scanl (+) 0 xs) (inits xs))
 >
 > findAcceptLast :: (a -> Bool) -> [a] -> a
 > findAcceptLast pred lst = fromMaybe (last lst) (find pred lst)
 >

First, if I start with an arbitrary pure function, then I can build a
dependency graph to determine what to evaluate.  Since pure functions
are referentially transparent, I am free to evaluate the nodes of my
dependency graph in any order, provided that I respect the
dependencies.

On the one hand, suppose I want to read a list with IO.  In order to
use IO, or any monad for that matter, I have to pass a "baton"[1] from
one operation to the next.  If a create a complicated function that
involves a monad, then every time I use the "bind" operator, I am
adding an edge to my dependency graph.  Since I receive a baton from
the outside, and I have to return it, I end up threading that baton
through my dependency graph.  Now when my function is evaluated, the
evaluation order, at least for monadic actions, is locked down.

On the other hand, I can compare a lazy list function, such as
accumUntilCutoff, to a multi-layer perceptron[2].  The input layer of
this perceptron receives the contents of a lazy list.  List processing
functions, such as init, zipWith, and map, construct hidden layers of
neurons.  For example, "zipWith (+) xs $ tail xs" builds a hidden
layer such that each neuron computes the sum of two adjacent inputs.

The major contrast between a lazy list function and a multi-layer
perceptron is that for some functions, such as filter, takeWhile,
dropWhile, and find, I can't build the corresponding neural
interconnections until runtime, since these connections depend on the
actual *data* that is presented to the inputs.

A complicated function, like accumUntilCutoff, is "almost" a
multi-layer perceptron, except for the fact that parts of the
dependency graph are constructed at runtime based on the input data.
This makes it very hard to thread a baton through the dependency
graph.

So I want to make accumUntilCutoff read its input, lazily, from the
console.  That means:

 1. I need to provide a way to hand the IO baton to accumUntilCutoff
    and get it back at the end.

 2. The baton must be passed, sequentially, to each element of the
    input list, up to and including the last element that I need.

Here is my key observation -- Suppose that:
 1. I have two functions f and g that both process a lazy list.
 2. I feed the same "lazy list from the console" to both functions.
 3. Function f consumes part of the list, and g consumes more than f.
 4. I choose to print the result of f, then interact with the user,
    and later, based on user input, possibly print the result of g.

Then:
 1. In order to print the result of f, I must pass the baton through f,
    so the baton will be sequenced through a prefix of my lazy list.
 2. In order to determine whether to evaluate g, I must get the baton
    back from f and use it to interact with the user.
 3. If I later need to print the result of g, then I need to pass the
    baton through g, and the baton must be sequenced *starting in the
    middle* through my lazy list of user input.
 As a result, I have to interleave IO operations.

Example pseudo-code:

1.  main :: IO
2.  main = do
3.     putStrLn "Hello."
4.     xs <- getLazyListOfNumbersFromUser
5.     let ys = zipWith (+) xs $ tail xs
6.     print (ys !! 0)        -- depends on xs !! 0 and 1
7.     b <- askUserBool "Would you like to continue? "
8.     if b
9.       then print (ys !! 1) -- depends on xs !! 1 and 2
10.      else return ()
11.    putStrLn "Goodbye."

We *must* ask the user for the first two elements of xs because we
have to print the result before asking the user a question.  We
*can't* ask the user for the third element of xs at this time because
the user gets to decide whether we need it.  As a result, we must
interleave the operation of asking the user for elements of the list
"xs" with the operation of asking the user whether to continue.

 > Now I want to connect the console to "f", such that the list of inputs
 > to "f" comes from the console, one item at a time.

In order to do this, I must either rewrite "f" to pass a baton,

 > Option 2. Ignore lists
 > Option 3. Use continuations

or else I have to violate purity.

 > Option 1. Ignore purity [using unsafeInterleaveIO]

In conclusion:

 > I am dis-satisfied with the fact that I have to embed IO code in the
 > middle of accumulation code.

Then I can either (a) swallow it, (b) use unsafeInterleaveIO, or (c)
in the special case that "f" happens to access its input sequentially
anyway, rewrite it to pass a baton.

... newb code ...
 > readUntilCutoff :: Integer -> IO (Integer, [Integer])
 > readUntilCutoff cutoff = readHelper cutoff 0 id
 >     where
 >       readHelper :: {- cutoff          -} Integer ->
 >                     {- sum so far      -} Integer ->
 >                     {- elements so far -} ([Integer] -> [Integer]) ->
 >                     {- outputs         -} IO (Integer, [Integer])
 >       readHelper cutoff sum elems = do
 >         if sum < cutoff
 >           then do
 >             putStr "Enter an integer: "
 >             ln <- getLine
 >             let x = read ln
 >             readHelper cutoff (sum + x) (elems . (x:))
 >           else return $ (sum, elems [])

-- Ron

References:

[1] 
http://www.haskell.org/haskellwiki/IO_inside#Welcome_to_the_RealWorld.2C_baby_:.29

[2] http://en.wikipedia.org/wiki/Artificial_neural_network