[Haskell-cafe] IO and lazyness.

Jeremy Shaw jeremy at n-heptane.com
Tue Mar 6 18:57:48 EST 2007


At Tue, 6 Mar 2007 22:56:52 +0100,
D.V. wrote:
> 
> > The problem is that hGetContents only reads the contents of the file on demand
> > and, without the 'return $!' you don't demand the value until somewhere
> > outside of rechf.  By this point the hClose has happened and hGetContents has
> > no access to the file => no lines => no result.
> 
> I must be really dumb but I don't get why 'at this point the hClose
> has happened'
> 
> It seemed to me that when I typed at ghci's prompt    rechf "xxxx"  it
> tries to evaluate it.
> that makes it perform the IO action of opening the file, then
> performing the IO action of evaluating ( since I need the result )
> rechf2 "xxxx" and *lastly* performing the IO action of closing the
> file.

That is close, but not quite right:

1. in ghci, you type, rechf "xxxxxx" and it tries to print the result

2. that makes it perform the IO action of opening the file (and it happens immediately)

3. 'f <- hGetContents h' lazily returns the contents of the file handle as a
   string. This means it won't actually read anything from the file
   until it needs to.

4. 'return $ rech r $ lines f' also lazily returns its result. This
   means that it does not perform the computation right away, instead
   it waits for someone to actually 'use' the result. In general to
   actually 'use' a value you have to do something that interacts with
   the 'real world', such as print the value out.

5. 'hClose h' closes the file handle, and does so immediately.

6. ghci tries to print the result that rechf returned. So, this means
   we are finally doing something that will force the value. So, all
   those suspended computations try to do their thing -- except, we
   have explicitly closed the file handle already. So, when the
   suspended computations try to read from the file handle, they don't
   get anything.
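
To make the six steps concrete, here is a minimal sketch of the same
shape of code. It is not your original program: 'rech' here is a
hypothetical stand-in that keeps the lines containing a substring, and
'rechfLazy' and 'demoLazy' are names invented for this illustration.

```haskell
import Control.Exception (IOException, evaluate, try)
import Data.List (isInfixOf)
import System.IO

-- Hypothetical stand-in for the real search: keep lines containing r.
rech :: String -> [String] -> [String]
rech r = filter (r `isInfixOf`)

-- Lazy version: the result is still unevaluated when hClose runs.
rechfLazy :: FilePath -> String -> IO [String]
rechfLazy path r = do
  h <- openFile path ReadMode     -- step 2: happens immediately
  f <- hGetContents h             -- step 3: nothing is read yet
  let result = rech r (lines f)   -- step 4: a suspended computation
  hClose h                        -- step 5: happens immediately
  return result                   -- step 6: forced later by the caller

-- Forces the result roughly the way printing it would, and reports
-- whether that produced a count or an IO error.
demoLazy :: FilePath -> String -> IO (Either IOException Int)
demoLazy path r = try (rechfLazy path r >>= evaluate . length)
```

Depending on the GHC version, forcing the result after the close either
yields no lines at all or, as far as I know in more recent versions,
raises an IO error about a delayed read on a closed handle. Either way
the file contents are lost.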

When you uncomment the 'print f' line, you force the lazy value f to be
evaluated right then -- which is before hClose has been called. But,
when you do not call 'print f', then nothing really needs the value of
f until *after* hClose has been called, so you get no output.

Of course, you do not want to print the entire contents of the file,
so that is where the other solutions come into play.

The function ($!) has the type:

($!) :: (a -> b) -> a -> b

When you run:

 f $! x

It first forces 'x' (to weak head normal form, i.e. its outermost
constructor) and then runs 'f x'. In your code this means it will
force the expression:

	(rech r $ lines f)

before the function returns. Then, when hClose runs -- everything is
ok, because we already got everything out of the file that we needed.
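
A minimal sketch of that fix, under the assumption that the result is
something small like a count, so that forcing it really does consume
the whole file. 'countMatches' and 'rechfStrict' are hypothetical names
for this illustration, not your original functions.

```haskell
import Data.List (isInfixOf)
import System.IO

-- Hypothetical search: how many lines contain r?
countMatches :: String -> [String] -> Int
countMatches r = length . filter (r `isInfixOf`)

rechfStrict :: FilePath -> String -> IO Int
rechfStrict path r = do
  h <- openFile path ReadMode
  f <- hGetContents h
  -- ($!) forces the count here, which reads the file before the close.
  n <- return $! countMatches r (lines f)
  hClose h
  return n
```

In ghci, 'rechfStrict "somefile.txt" "xxxx"' should then print the count
computed while the handle was still open.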

As someone else pointed out in a different message, sometimes (f $! x)
might not be sufficient because it might not force x 'all the way'.
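
Here is a small sketch of what 'not all the way' means. ($!) stops at
the outermost constructor, so for a list it only looks at the first
cons cell. One way around that (an assumption on my part, there are
others) is to force something that has to walk the whole string, such
as its length. 'whnfDemo' and 'rechfLines' are made-up names.

```haskell
import Data.List (isInfixOf)
import System.IO

-- ($!) only evaluates the list to its first (:), so the undefined
-- tail is never touched and this is simply "ok".
whnfDemo :: String
whnfDemo = const "ok" $! ((1 :: Int) : undefined)

-- Hypothetical variant that returns the matching lines themselves.
rechfLines :: FilePath -> String -> IO [String]
rechfLines path r = do
  h <- openFile path ReadMode
  f <- hGetContents h
  -- length has to walk every character, so the whole file is read here.
  _ <- return $! length f
  hClose h
  -- f is fully in memory now, so this can safely stay lazy.
  return (filter (r `isInfixOf`) (lines f))
```

Had we written 'return $! filter ...' instead, only the part of the file
up to the first matching line would have been read before the close.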

As this blog entry shows:

http://blogs.nubgames.com/code/?p=22

hGetContents can make for very elegant code. But, at the same time,
under the hood, hGetContents uses 'unsafeInterleaveIO'. The 'unsafe'
part is basically the exact bug you are now seeing. For this reason,
some people have suggested that hGetContents should be:

 a) removed
 
or 

 b) renamed to unsafehGetContents

In general laziness is very nice, but there are two common problem
cases that you will run into when you are first starting:

 1) Mixing IO and laziness 

 2) Space leaks (aka, using lots of RAM) and stack overflows caused by
 code being *too* lazy.
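
For the second case, the usual first encounter looks something like the
sketch below (the function names are made up for illustration). The
lazy fold can build a long chain of unevaluated additions before
anything is added up, while the strict fold keeps a single running
total. With optimisation turned on GHC often rescues the lazy version,
so treat this as an illustration rather than a guaranteed failure.

```haskell
import Data.List (foldl')

-- May build 0 + 1 + 2 + ... as nested thunks before evaluating any of it.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- Forces the accumulator at every step, so it runs in constant space.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0
```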

Every Haskell programmer runs into these at some point in time, and
they are very confusing at first. Unfortunately, Haskell programmers
tend to uncover these two issues at the beginning of their journey,
when they are least equipped to make sense of them :(

Hope this helps,
j.

