# User:Benmachine/Non-strict semantics

### From HaskellWiki

Benmachine (Talk | contribs) |
Benmachine (Talk | contribs) (→How do I stop it?) |
||

Line 62: | Line 62: | ||

As mentioned above, non-strictness can hurt performance: if a result is definitely going to be needed later, you might as well evaluate it now, to avoid having to hold on to all the data that goes into it. Fortunately, the Haskell designers were aware of these problems and introduced a loophole or two so that we could force our programs to be strict when necessary. |
As mentioned above, non-strictness can hurt performance: if a result is definitely going to be needed later, you might as well evaluate it now, to avoid having to hold on to all the data that goes into it. Fortunately, the Haskell designers were aware of these problems and introduced a loophole or two so that we could force our programs to be strict when necessary. |
||

− | The most basic method of strictness introduction is the <hask>seq</hask> function. <hask>seq</hask> takes two arguments of any type, and returns the second. However, it also has the important property that it is magically strict in its first argument; in essence <hask>seq</hask> is defined by the following two equations: |
+ | The most basic method of strictness introduction is the <tt>seq</tt> function. <tt>seq</tt> takes two arguments of any type, and returns the second. However, it also has the important property that it is magically strict in its first argument; in essence <tt>seq</tt> is defined by the following two equations: |

<haskell> |
<haskell> |

## Revision as of 21:02, 4 May 2012

An expression language is said to have **non-strict semantics** if expressions can have a value even if some of their subexpressions do not. Haskell is one of the few modern languages to have non-strict semantics by default: nearly every other language has strict semantics, in which if any subexpression fails to have a value, the whole expression fails with it.

### 1 What?

Any sufficiently capable programming language is *non-total*, which is to say you can write expressions that do not produce a value: common examples are an infinite loop or unproductive recursion, e.g. the following definition in Haskell:

noreturn :: Integer -> Integer noreturn x = negate (noreturn x)

or the following Python function:

def noreturn(x): while True: x = -x return x # not reached

both fail to produce a value when executed. We say that `noreturn x` is undefined, and write `noreturn x = ⊥`.

In Python the following expression:

{7: noreturn(5), 2: 0}[2]

also fails to have a value, because in order to construct the dictionary, the interpreter tries to work out `noreturn(5)`, which of course doesn't return a value. This is called **innermost-first** evaluation: in order to call a function with some arguments, you first have to calculate what all the arguments are, starting from the innermost function call and working outwards. The result is that Python is strict, in the sense that calling any function with an undefined argument produces an undefined value, i.e. `f(⊥) = ⊥`.

In Haskell, an analogous expression:

lookup 2 [(7, noreturn 5), (2, 0)]

**outermost-first**evaluation because you first look at

### 2 Why?

The important thing to understand about non-strict semantics is that it is not a performance feature. Non-strict semantics means that only the things that are needed for the answer are evaluated, but if you write your programs carefully, you'll only compute what is absolutely necessary *anyway*, so the extra time spent working out what should and shouldn't be evaluated is time wasted. For this reason, a very well-optimised strict program will frequently outperform even the fastest non-strict program.

However, the real and major advantage that non-strictness gives you over strict languages is you get to write cleaner and more composable code. In particular, you can separate *production* and *consumption* of data: don't know how many prime numbers you're going to need? Just make `primes` a list of *all* prime numbers, and then which ones actually get *generated* depends on how you use them in the rest of your code. By contrast, writing code in a strict language that constructs a data structure in response to demand usually will require first-class functions and/or a lot of manual hoop-jumping to make it all behave itself.

What this means in practice is that in Haskell you can often write your function as a simple composition of other general functions, and still get the behaviour you need, e.g:

any :: (a -> Bool) -> [a] -> Bool any p = or . map p

any p [] = False any p (x:xs) | p x = True | otherwise = xs

### 3 How do I stop it?

As mentioned above, non-strictness can hurt performance: if a result is definitely going to be needed later, you might as well evaluate it now, to avoid having to hold on to all the data that goes into it. Fortunately, the Haskell designers were aware of these problems and introduced a loophole or two so that we could force our programs to be strict when necessary.

The most basic method of strictness introduction is the `seq` function. `seq` takes two arguments of any type, and returns the second. However, it also has the important property that it is magically strict in its first argument; in essence `seq` is defined by the following two equations:

⊥ `seq` b = ⊥ a `seq` b = b

The important thing to notice about `seq` is that it doesn't evaluate anything just by virtue of being in the code. All it does is introduce an artificial dependency of one value on another: when the result of `seq` is evaluated, the first argument must also be evaluated. As an example, suppose `x :: Integer`, then `seq x b` is essentially a bit like `if x == 0 then b else b` - unconditionally equal to `b`, but forcing `x` along the way. In particular, the expression `x `seq` x` is completely redundant, and always has exactly the same effect as just writing `x`.

*is*the case that evaluating

*then*

*only*way to force evaluation of a value with a function type.