[Haskell-cafe] Re: evaluate vs seq

Wed Sep 13 04:09:39 EDT 2006

Michael Shulman wrote:
> On 9/11/06, apfelmus at quantentunnel.de <apfelmus at quantentunnel.de> wrote:
>> > * (a `seq` return a) = evaluate a *right now*, then produce an IO
>> action
>> >  which, when executed, returns the result of evaluating a.  Thus, if
>> >  a is undefined, throws an exception right now.
>>
>> is a bit misleading as there is no evaluation "right now". It's better
>> to say that (a `seq` return a) is _|_ ("bottom", i.e. undefined) when a
>> == _|_.
> 
> Sure... but what about when a is not _|_?  I would also like to
> understand the difference between `seq' and `evaluate' for arguments
> that are defined.  How would you describe that without talking about
> "when" expressions are evaluated?

Ah well, the discussion goes about denotational semantics of
Haskell. Unfortunately, I did not find a good introductory website about
this. Personally, I found the explanation from Bird and Wadler's book
about infinite lists very enlightening.

The game roughly goes as follows: denotational semantics have been
invented to explain what a recursive definition should be. I mean,
thinking of functions as things that map (mathematical) sets on sets
excludes self-referencing definitions (russell's paradox!).

The solution is to think functions of as fixed points of an iterative
process: factorial is the fixed point of
   (fac f) n = if n == 0 then 1 else n*f(n-1)
what means
   (fac factorial) n == factorial n

Now, the iterative process goes as follows:
  factorial_0 n = _|_
  factorial_1 n = fac (factorial_0) n
                = if n == 0 then 1 else _|_
  factorial_2 n = fac (factorial_1) n
                = case n of
                   0 -> 1
                   1 -> 1
                   _ -> _|_
and so on. Everytime, a more refined version of the ultimate goal
factorial is created, on says that
  _|_ == factorial_0 <= factorial_1 <= factorial_2 <= ...
(<= means "less or equal than")
That's why _|_ is called "bottom" (it's the smalled thing in the hierarchy).

This was about functions. In a lazy language, "normal" values can show a
similar behavior. For instance, we have
  _|_  <=  1:_|_  <= 1:2:_|_ <= 1:2:3:_|_ <= ... <= [1..]
That's how the infinite list [1..] is approximated. The inequalities
follow from the fact that bottom is below everything and that all
constructors (like (:)) are monotone (by definition), i.e.
  1:x <= 1:y  iff  x <= y

A function f is called *strict*, if it fulfills
  f _|_ = _|_
which means that it does not produce a constructor ("information")
without knowing what its argument is.

Back to your original question, we can now talk about functions in terms
of _|_ and *data constructors*. As an example, we want to think about
the meaning of (take 2 [1..]). What should this be? I mean, one argument
is an infinite list! By tracing the definition of (take 2), one finds
  take 2 _|_             == _|_     -- (btw (take 2) is strict)
  take 2 (1:_|_)         == 1:_|_
  take 2 (1:(2:_|_))     == 1:(2:[])
  take 2 (1:(2:(3:_|_))) == 1:(2:[])
and so on. The right and side remains equal for all further refinements,
so we must conclude
  take 2 [1..]           == 1:(2:[]).

For the evaluate and `seq` problem, we simplify things by specializing
the polymorphic type to ([Int] -> IO [Int]). Then, we introduce two
constructors (ThrowException :: IO [Int]) and (Return :: IO [Int]) with
the obvious meaning. The semantics of `seq` are now as following:
  _|_    `seq` x == _|_
  []     `seq` x == x
  (y:ys) `seq` x == x
So `seq` forces its first argument. When we define
  f x = x `seq` (Return x)
we thereby get
  f _|_    == _|_
  f []     == Return []
  f (x:xs) == Return (x:xs)
To compare, the semantics of (evaluate) is
  evaluate _|_    == ThrowException =/= _|_
  evaluate []     == Return []
  evaluate (x:xs) == Return (x:xs)
That should answer your question.

Please note that this answer is actually a lie, as functions must be
monotone (halting problem, fix id == _|_), but (evaluate) is not:
  _|_ <= []
yet
  evaluate _|_ == ThrowException   </=  evaluate [] == Return []
This can be fixed by departing from denotational semantics and giving
all (IO a)-things an operational meaning. Further explanations and
further references are found in the paper I pointed out last time.

Regards,
apfelmus