[Haskell-cafe] On the purity of Haskell

Mon Jan 2 06:46:05 CET 2012

Steve Horne <sh006d3592 at blueyonder.co.uk> wrote:

> The do-notation sugar may confuse the issue - the <- looks like an
> operator, but translating to binds-and-lambdas form suggests
> otherwise. Quick translations (I hope no mistakes) with lots of
> parens...
>
>    m >>= (\x -> (f x x))
>
>    m >>= (\x -> (m >>= (\y -> (f x y))))

First of all, you can omit all parentheses, because the (>>=) operator
is associative by definition (this is not checked by the compiler
though).

> At first sight, these two expressions can give different results for
> reasons other than evaluation order. In particular, there are two bind
> operators, not just one.
>
> That is, x and y could get different values for reasons other than the
> two m references referring to different things. So... is that true?

Yes, that's correct.  Remember that referential transparency does not
say that you are not allowed to pass different values to the same
function.  It just forbids functions to return different results for
different inputs, and this is true for both expressions you showed.

Using a state monad to get more concrete, let's say that 'm' returns the
current state and then increments it.  If the start state is 3, then 'x'
will become 3 on execution and 'y' will become 4.  However, the result
of the composition is /always/ the state computation that returns the
result of 'f' applied to the current state and the incremented current
state.

Just like when you write "x = getLine", then you can safely replace any
occurence of 'getLine' by 'x'.

> Of course even the bind operator arguably isn't primitive. We could
> translate to get rid of those too, and see what lies underneath. This
> is where we start seeing functions of type...
>
>    World -> (x, World)
>
> Problem - this level of abstraction is hypothetical. It is not part of
> the Haskell language. Haskell specifically defines the IO monad to be
> a black box.

And that's fine, because IO is an embedded DSL.  A better view of IO is
a GADT like:

    data IO :: * -> * where
        GetLine  :: IO String
        PutStrLn :: String -> IO ()
        ...

This is still hypothetical, but it shows how even IO is easily
referentially transparent (as long as you don't use unsafe* cheats).
Furthermore you have to consider that /execution/ of IO actions is
entirely outside the scope of the language itself.  A compiler turns an
expression of type "IO ()" into machine code that executes the encoded
recipe.  In any case, you have to differentiate between evaluation and
IO execution.

> If main returns a function of type "World -> (x, World)" wrapped in a
> monad context, then there is referential transparency as defined in
> computer science. But is that a fair claim?

Even though the World state monad is a bad intuition in my opinion, why
not?  Evaluation is referentially transparent, because you don't
evaluate to results of IO actions.  You evaluate to the IO actions
themselves.  This is not a hack, it is a great feature, because it's
precisely what allows us to build real world actions using combinators.

> In this model, Haskell is an interpreted language for compositing
> functions. We can call those functions programs. The executable is a
> translation of the function returned by main, but *not* a translation
> of the source code.

It is a translation of the source code, because evaluation happens at
run-time, interleaved with the execution of 'main', as 'main' goes along
forcing thunks (in a thunk-based implementation).

> So what main returns - that hypothetical function World -> (x, World)
> - isn't just a product of the program, it's also a representation of
> the program.

And there comes the limitation of this intuition.  It is so theoretical
that you can't express a function of that type even as machine code.  On
the other hand, the GADT variant could indeed be turned into bytecode
and then interpreted by a bytecode machine.

But this is not how it works in, say, GHC.  GHC compiles to native code
using the thunk approach I mentioned earlier.

> When I say that Haskell lacks referential transparency because the
> execution of primitive IO actions is tied to the evaluation of the
> bind operators that extract out their results, and different
> executions of the same action yield different results, I'm only
> appealing to the defined semantics of the Haskell language. I'm not
> appealing to a hypothetical model where the world is passed as a
> parameter.

This is where your mistake is again:  execution vs. evaluation.  The
(>>=) operator does /not/ extract results.  It doesn't have to.  It just
makes sure (abstractly!) that the second argument is applied to the
result of the first argument at execution time.  It builds a data
dependency between monadic actions.

(>>=) is composition, not execution.  Execution is outside the scope of
the monadic interface, and for the specific case of IO it's even outside
the scope of the language.

If you think about it, you will find that (>>=) can't even extract
results.  The only thing it can do with results is to make other things
dependent on them.  This is because the type of the result is fully
polymorphic.

> So - the issue seems to be whether the IO monad is a context holding
> world-manipulating functions, or whether it's a black box with
> semantics specified at the bind level. And if referential transparency
> is decided at this level, what practical relevance does it have?

It's a black box and that's fine.  There is nothing impure about IO.

> It's probably better to stick with "the functional core is
> referentially transparent - IO actions break that, so don't overuse
> the IO monad". You can argue "may or may not break that depending on
> your viewpoint", but if you can't objectively decide which viewpoint
> is correct, then you haven't proven referential transparency.

You can objectively decide.  Referential transparency means that an
expression cannot magically evaluate to something this time and to
something other the next time, be it an Int value, a function or an IO
action.  'getLine' will always be 'getLine', and 'putStrLn "abc"' will
always be the action that prints "abc" with a line feed.  If you /had/ a
concrete representation of 'putStrLn "abc"' (like with the GADT), then
you could safely replace 'putStrLn "abc"' in the code by this
representation.  This is what you can do with all monads and everything
else in Haskell.

Again, I'm assuming you don't cheat.  If you do cheat, then you are
turning execution into evaluation, thereby potentially breaking
referential transparency.

Greets,
Ertugrul

-- 
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://ertes.de/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20120102/cc423717/attachment.pgp>