Repair to floating point enumerations?

Wed Oct 15 16:54:25 EDT 2008

Here's a counter-proposal:

numericEnumFromThenTo   :: RealFloat a => a -> a -> a -> [a]
numericEnumFrom 0 = map fromInteger [1..]
numericEnumFrom n = map ((n+).fromInteger) [1..]
numericEnumFromThen n m =  map (\x -> n+fromInteger x*(m-n)) [1..]
numericEnumFromTo n m = takeWhile (<= m*(1 + epsilon)) (numericEnumFrom n)
numericEnumFromThenTo n m p = takeWhile (<= p*(1 + 2*epsilon)) (numericEnumFromThen n m)

epsilon :: RealFloat a => a
epsilon = 1/2^(floatDigits (undefined :: a))

This uses quite a reasonable approximation for the roundoff error, and
has the advantage of not inappropriately returning _|_.  It does
sometimes create duplicate entries in the list, but I think that is
better that returning an infinite list of duplicate entries as the
code proposed below does.

And yes, the fuzzy comparison is a bit ugly, but at least it means
that every user is not forced to implement fuzzy comparison in their
quick and dirty code (which is the only thing this syntax is good
for).

David

On Wed, Oct 15, 2008 at 03:55:09PM +0100, Lennart Augustsson wrote:
> I'm sorry, but people who write [0.0,0.1 .. x] where x is a multiple
> of 0.1 get exactly what they deserve if x is not included.  Floating
> point numbers are not the real numbers, and the sooner they learn that
> the better.  We can fudge this all we like, but 0.1 is never going to
> be exactly representable as a binary floating point number no matter
> what we do.
> 
> On Wed, Oct 15, 2008 at 3:44 PM, David Roundy <droundy at darcs.net> wrote:
> > On Wed, Oct 15, 2008 at 10:41:25AM +0100, Malcolm Wallace wrote:
> >> Dear Haskell-Primers (and libraries).
> >>
> >> Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98
> >> Prelude, regarding numeric enumerations for Floats/Doubles:
> >>
> >>     Prelude> [0, 0.3 .. 1.1]
> >>     [0.0,0.3,0.6,0.899999,1.2]
> >>
> >> What is odd is that the last value of the list is much larger than the
> >> specified termination value.  But this is exactly as specified by the
> >> Haskell'98 Report.
> >>
> >>     http://haskell.org/onlinereport/basic.html#sect6.3.4
> >>
> >>     "For Float and Double, the semantics of the enumFrom family is given
> >>     by the rules for Int above, except that the list terminates when the
> >>     elements become greater than e3+i/2 for positive increment i, or
> >>     when they become less than e3+i/2 for negative i.
> >>
> >> We have discussed this question (and related ones, such as whether Float
> >> and Double belong in the Enum class at all) several times before, and I
> >> do not wish to rehash all of those points again e.g.:
> >>
> >>     http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html
> >>     http://www.haskell.org/pipermail/haskell/2001-October/008218.html
> >>     http://www.haskell.org/pipermail/haskell/2002-October/010607.html
> >>
> >> Phil proposes that, although retaining the instances of Enum for Float
> >> and Double, we simplify the definitions of the numericEnumFrom family:
> >>
> >>   numericEnumFromThenTo   :: (Fractional a, Ord a) => a -> a -> a -> [a]
> >>   numericEnumFrom         =  iterate (+1)
> >>   numericEnumFromThen n m =  iterate (+(m-n)) n
> >>   numericEnumFromTo n m   =  takeWhile (<= m) (numericEnumFrom n)
> >>   numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
> >>
> >> The particular feature of note is that the odd fudge factor of (1/2 *
> >> the increment) is removed.  The inexact nature of floating point numbers
> >> would therefore cause a specification like
> >>
> >>     [ 0.0, 0.1 .. 0.3 ]
> >>
> >> to yield the sequence
> >>
> >>     [ 0.0, 0.1, 0.2 ]
> >>
> >> that is, to omit the upper bound, because (3 * 0.1) is actually
> >> represented as 0.30000000000004, strictly greater than 0.3.
> >>
> >> Phil argues that this behaviour is more desirable: "the simple fix is
> >> that the user must add a suitable epsilon to the upper bound.  The key
> >> word here is *suitable*.  The old definitions included completely
> >> bizarre and often highly unsuitable choices of epsilon."
> >>
> >> This proposal seems to me to improve the consistency of the enumeration
> >> syntax across both the integral and floating types.  Some users may
> >> still be surprised, but the surprise will be easier to explain.
> >>
> >> I am bringing this question to the attention of all who are interested
> >> in Haskell Prime, because it seems like a sensible and well-reasoned
> >> change.  Discussion on whether to adopt this proposal for H' is welcome.
> >>
> >> But as maintainer and bug-fixer of the Haskell'98 Report, I have also
> >> been asked whether we should make this change retrospectively to the
> >> Haskell'98 language (as a "typo").  Since it involves not merely an
> >> ordinary library function, but a Prelude function, and moreover a
> >> function that is used in the desugaring of syntax, it is less clear to
> >> me whether to alter Haskell'98.
> >>
> >> Thoughts?
> >
> > It sounds like a bad fix to me.  It seems important that the
> > [0.0,0.1.. x] notation should work correctly in the common cases.  And
> > the common case really is that the final value is intended as an exact
> > multiple of the increment.
> >
> > Why not look for a heuristic that gets the common cases right, rather
> > than going with an elegant wrong solution? After all, these
> > enumerations are most often used by people who neither care nor know
> > how they're implemented, but who most likely would prefer if haskell
> > worked as well as matlab, python, etc.
> >
> > One reasonable option would be to actually take into account the
> > expected roundoff error (which isn't hard to compute for simple sums
> > like this).
> >
> > It would also be a good idea to reduce that roundoff error by using
> > multiplication rather than addition to create the enumeration (which
> > also allows us to ensure that finite enumerations terminate).
> > e.g. it's a shame that length [1 .. 1e8 :: Float] fails to terminate.
> > Admittedly, this is a stupid thing to compute, but it's also stupid to
> > add many small numbers together in sequence, since it always serves to
> > increase the roundoff error.  It's true that most people's C code
> > would be just as naive, but we're writing Haskell, and you're talking
> > about the Prelude, which should be written intelligently, not using
> > the simplest code, in such a way that it won't bite programmers.
> >
> > David
> > _______________________________________________
> > Libraries mailing list
> > Libraries at haskell.org
> > http://www.haskell.org/mailman/listinfo/libraries
> >
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://www.haskell.org/mailman/listinfo/libraries