[Haskell-cafe] Mystery of an Eq instance

Fri Sep 20 19:03:49 CEST 2013

On Fri, Sep 20, 2013 at 11:17 AM, damodar kulkarni <kdamodar2000 at gmail.com>
wrote:
> Ok, let's say it is the effect of truncation. But then how do you explain
> this?

Oh, it's a truncation error all right.

> Prelude> sqrt 10.0 == 3.1622776601683795
> True
> Prelude> sqrt 10.0 == 3.1622776601683796
> True
>
> Here, the last digit **within the same precision range** in the
> fractional part is different in the two cases (5 in the first case and 6 in
> the second case) and still I am getting **True** in both cases.

Because you're using the wrong precision range. IEEE floats are
stored in a binary format, not a decimal one, so the spacing between
adjacent representable values does not line up with decimal digits.
Two literals that differ by one in the last decimal digit can round to
the same stored value, while two that differ by two in that digit can
round to different ones.
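
If you would rather check this from Haskell than take it on faith, here
is a minimal sketch. It assumes Double is an IEEE 754 binary64 value (as
it is with GHC), and sameDouble is just a name made up for this example.
Near 3.16 adjacent Doubles are 2^-51 apart (roughly 4.4e-16), which is
wider than the 1e-16 gap between the two literals.

```haskell
-- Illustrative sketch; sameDouble is a hypothetical helper name.
-- Assumes Double is an IEEE 754 binary64 value, as it is with GHC.

-- decodeFloat returns (mantissa, exponent), where the stored value is
-- exactly mantissa * 2^exponent.
sameDouble :: Double -> Double -> Bool
sameDouble x y = decodeFloat x == decodeFloat y

main :: IO ()
main = do
  -- Both decimal literals are rounded to the same stored value.
  print (decodeFloat (3.1622776601683795 :: Double))
  print (decodeFloat (3.1622776601683796 :: Double))
  print (sameDouble 3.1622776601683795 3.1622776601683796)
  -- Distance from sqrt 10 to the next representable Double above it.
  let (m, e) = decodeFloat (sqrt 10 :: Double)
  print (encodeFloat (m + 1) e - sqrt 10 :: Double)
```

The last line prints the local spacing between Doubles, which is the
"precision range" that actually matters for the comparison.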

> And also observe the following:
>
> Prelude> (sqrt 10.0) * (sqrt 10.0) == 10.0
> False
> Prelude> (sqrt 10.0) * (sqrt 10.0) == 10.000000000000002
> True
> Prelude> (sqrt 10.0) * (sqrt 10.0) == 10.000000000000003
> False
> Prelude> (sqrt 10.0) * (sqrt 10.0) == 10.000000000000001
> True
> Prelude>
>
> Ok, again something like truncation or rounding seems at work but the
> precision rules the GHC is using seem to be elusive, to me.
> (with GHC version 7.4.2)

Here's a quick-and-dirty C program to look at the values. I purposely
print decimal digits beyond the precision range to illustrate that,
even though we started with different representations, the actual
values are the same even if you use decimal representations longer
than the ones you started with. In particular, note that 0.1, when
translated into binary, is a repeating fraction. Why the last hex digit
is 'a' instead of 9 is left as an exercise for the reader. That this
happens also means the number actually stored when you enter 0.1 is
*not* 0.1, but as close to it as you can get in the given
representation.

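#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Overlay a double with a 64-bit integer to inspect its bit pattern.
   Assumes double is a 64-bit IEEE 754 value. */
union get_int {
  uint64_t intVal ;
  double   floatVal ;
} ;

/* Print the value to 20 decimal places, followed by its bits in hex. */
void doubleCheck(double in) {
  union get_int out ;

  out.floatVal = in ;
  printf("%.20f is %" PRIx64 "\n", in, out.intVal) ;
}

int main(void) {
  doubleCheck(3.1622776601683795) ;
  doubleCheck(3.1622776601683796) ;
  doubleCheck(10.0) ;
  doubleCheck(10.000000000000001) ;
  doubleCheck(10.000000000000002) ;
  doubleCheck(10.000000000000003) ;
  doubleCheck(0.1) ;
  return 0 ;
}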
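
For anyone who would rather stay in Haskell, a rough equivalent of the
C program above might look like the sketch below. It relies on
castDoubleToWord64 from GHC.Float, which as far as I know only exists in
base 4.10 (GHC 8.2) and later, so it will not work on GHC 7.4.2. The
names bitsOf and doubleCheck are only for illustration. Instead of
printing extra decimal digits, it prints the exact stored value as a
ratio of two integers.

```haskell
import GHC.Float (castDoubleToWord64)  -- assumes base >= 4.10
import Numeric (showHex)

-- Hex rendering of the 64 bits that make up a Double.
bitsOf :: Double -> String
bitsOf x = showHex (castDoubleToWord64 x) ""

-- Print the value as shown by GHC, its bit pattern, and the exact
-- rational number that the Double actually stores.
doubleCheck :: Double -> IO ()
doubleCheck x =
  putStrLn (show x ++ " is " ++ bitsOf x ++ ", exactly " ++ show (toRational x))

main :: IO ()
main = mapM_ doubleCheck
  [ 3.1622776601683795
  , 3.1622776601683796
  , 10.0
  , 10.000000000000001
  , 10.000000000000002
  , 10.000000000000003
  , 0.1
  ]
```

Literals that are stored as the same Double show identical hex digits,
and the ratio printed for 0.1 has a power of two as its denominator, so
it cannot be exactly 1/10.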

> But more importantly, if one is advised NOT to test equality of two
> floating point values, what is the point in defining an Eq instance?
> So I am still confused as to how can one make a *meaningful sense* of the
> Eq instance?
> Is the Eq instance there just to make __the floating point types__
> members of the Num class?

You can do equality comparisons on floats. You just have to know what
you're doing. You also have to be aware of how NaNs (float values
that aren't numbers, and that are even odder than regular floats)
behave in your implementation, and how that affects your
application. But the same is true of doing simple arithmetic with
them.
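
As a small illustration of the NaN oddness, here is a sketch of how
equality behaves on GHC, where Double follows IEEE 754. The helper
isReflexive is a made-up name for this example.

```haskell
-- Illustrative sketch; isReflexive is a hypothetical helper name.
-- For every ordinary Double, x == x holds. For NaN it does not.
isReflexive :: Double -> Bool
isReflexive x = x == x

main :: IO ()
main = do
  let nan = 0 / 0 :: Double
  print (isNaN nan)         -- True
  print (isReflexive 1.0)   -- True
  print (isReflexive nan)   -- False: NaN is not equal to anything, itself included
  print (nan `elem` [nan])  -- False, because elem is built on (==)
```

So anything that leans on (==), such as elem or lookup, can quietly
miss a NaN that is sitting right there in the data.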

Note that you don't have to play with square roots to see these
issues. The classic example you see near the start of any numerical
analysis class is:

Prelude> sum $ take 10 (repeat 0.1)
0.9999999999999999
Prelude> 10.0 * 0.1
1.0

This is not GHC-specific; it's inherent in floating point number
representations. Read the Wikipedia section on accuracy problems
(http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems) for
more information.
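
To see that the error comes from the representation rather than from
the arithmetic, one option is to redo the sum with exact fractions. The
sketch below uses Rational from the standard library. The names
tenTenthsDouble and tenTenthsExact are invented for this example.

```haskell
import Data.Ratio ((%))

-- Ten copies of the Double nearest to 0.1, added left to right.
tenTenthsDouble :: Double
tenTenthsDouble = sum (replicate 10 0.1)

-- Ten copies of the exact fraction 1/10.
tenTenthsExact :: Rational
tenTenthsExact = sum (replicate 10 (1 % 10))

main :: IO ()
main = do
  print (tenTenthsDouble == 1.0)  -- False, as in the GHCi session above
  print (tenTenthsExact == 1)     -- True: no rounding happens anywhere
  -- How far the stored 0.1 is from the real 1/10.
  print (toRational (0.1 :: Double) - 1 % 10)
```

Rational is exact but slower and can grow without bound, so this is a
diagnostic aid more than a drop-in replacement for Double.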

Various languages have done funky things to deal with these issues,
like rounding things up, or providing "fuzzy" equality. These things
generally just keep people from realizing when they've done something
wrong, so the approach taken by GHC is arguably a good one.

       <mike