Personal tools

Physical equality

From HaskellWiki

Revision as of 18:30, 29 July 2006 by Kyagrd (Talk | contribs)

Jump to: navigation, search

Contents


1 Why we need this

\paragraph{ }
\title{Object Equality and Evaluation Strategy of Lazy Lagnuages}
\author{Ahn, Ki Yung}

Lazy languages like Haskell have a good excuse for
not having object equality (or physical equality):
the equality test is a hack for forcing evaluation.
We can make use of object equality in Haskell if we can provide a resonable
alternative for forcing evaluation; it would be even better if we can
automatially decide when to evaluate. Therefore, we suggest we should
search for a evaluation strategy that will dismiss the use of equality
test as forcing evaluation.

\paragraph{ }
Using equality test for forcing evaluation is agints design purpose of Haskell.
Hakell has only one equality operator unlike Unlike Scheme or ML,
which is to be simple and clean. But we have assigned strange operational
semantics on equality. This is also problematic for partial evaluation;
we cannot replace \verb|x==x| to \verb|True|.

\paragraph{ }
Functional programming style of persistant data structure tend to
generate shared objects frequently. Functional languages lacking
object equality cannot have full-benefit of sharing
since they cannont optimize equality with object equality.
Functional languages that cannot utilize object equality is
giving up the optimization opportunity on the best-fitting domain.
It is a common idiom to check object equality before structural equality.
Object equality implies structural equality,
unless we are dealing with non-deterministic objects.
\footnote{There are rare exceptions such as OCaml \texttt{nan}, which is in my opinion a design flaw.}

\paragraph{ }
Therefore, our goal is to develope a evaluation strategy that intelligently
forces evaluation when it is sure that forcing evaluation will not take too
long and save space as well. For this we would need both static analysis and
evaluation strategy at runtime.
When we rely less on operational semantics of equality,
we can have better optimization opportunities
such as using object equality in equality tests.

2 Problems in optimizing equality

It is a common idiom to benifit from physical equalitin in C++.

Common idom of C++ copy ctor, assignment and equality:

class C
{
...
   C(const C& c)
   {
       if (*this == &c) return; // the same object!! nothing to do
       // do the job if not the same object
   }

   C& operator = (const C& c)
   {
      if (*this == &c) return; // the same object!! nothing to do
      // do the job if not the same object
   }

   bool operator == (const C& c)
   {
      if (*this == &c) return true; // the same object!! nothing to do
      // do the job if not the same object
   }
...
};


However, haskell being a lazy langauge it has concerns that strict languages would not have. Let === be the physical equality. Even though === implies == except for bottom, there are some situations that we cannot by default optimize equality with physical equality. It is because haskell programmer use reflexive equality test as a common idom for deep evaluation.


2.1 Analyzing haskell insertion sort algorithm

These examples came up doing one of the homework in CS584 Algorithms (2006 Spring) class at Portland State University, lectured by Mark P. Jones. I choose haskell to implement sorting but memory leaked I added the counter. Thanks to Mark P. Jones for mentioning that equalities are used for forcing evaluation.

\paragraph{Insertion sort}

    inssrt [] = []
    inssrt (x:xs) = ins x $ inssrt xs
      where ins z [] = [z]
            ins z l@(y:ys) | z <= y    = z : l
                           | otherwise = y : ins z ys

We test for the worst case when the length of the input is 1000 and it works fine.

Main> inssrt [1000,999..1]
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
... ... ,988,989,990,991,992,993,994,995,996,997,998,999,1000]

\paragraph{Insertion sort with counter that leaks memory}

    inssort p@(_,[]) = p
    inssort (n,x:xs) = ins x $ inssort (n,xs)
      where ins z (n,[]) = (n,[z])
            ins z (n,l@(y:ys))
              | z <= y    = (n+1, z:l)
              | otherwise = (n', y:ys') where (n',ys') = ins z (n+1,ys)

We test for the worst case when the length of the input is 1000 and it crashes because of memory leak due to constructing lengthy expression $1+1+1+\cdots $.

Main> inssort (0,[1000,999..1])
(Segmentation fault

\paragraph{Insertion sort with counter patched by obscure equality}

    inssort' p@(_,[]) = p
    inssort' (n,x:xs) = ins x $ inssort' (n,xs)
      where ins z (n,[]) = (n,[z])
            ins z (n,l@(y:ys))
                | n==n && z <= y    = (n+1, z:l)
                | n==n && otherwise = (n', y:ys') where (n',ys') = ins z (n+1,ys)

We test for the worst case when the length of the input is 1000 and it works because of the equality check forces evaluation of the counter value.

Main> inssort' (0,[1000,999..1])
(499500,[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,
... ... ,986,987,988,989,990,991,992,993,994,995,996,997,998,999,1000])

3 Implementing physical equality in Haskell

I tried Hugs Version: 20050308 and GHC version 6.4.2 but only GHC worked. In Hugs, StablePtr a is not an instance of Eq.

implementation:

import Foreign
import System.IO.Unsafe
 
x === y = unsafePerformIO $
  do
    px <- newStablePtr x
    py <- newStablePtr y
    let ret = px == py
    freeStablePtr px
    freeStablePtr py
    return ret

usage:

data S a = S a
 
s = S (S [1..])
s2 = S (S [1..])
s3 = s
   ___         ___ _
  / _ \ /\  /\/ __(_)
 / /_\// /_/ / /  | |      GHC Interactive, version 6.4.2, for Haskell 98.
/ /_\\/ __  / /___| |      http://www.haskell.org/ghc/
\____/\/ /_/\____/|_|      Type :? for help.

Loading package base-1.0 ... linking ... done.
Compiling Main             ( Tst.hs, interpreted )
Ok, modules loaded: Main.
*Main> s === s
True
*Main> s === s2
False
*Main> s === s3
True

This implementation works and concurrncy or garbage collection does not affect the result. But it is very slow since newStablePtr, freeStablePtr operations are expensive because it changes the state of the garbage collector. Cheap operation that just peeks the address is needed for the implementation. I think the best way is to provide a primitive operator for this.

4 Related Papers and Websites