[Haskell] specification of sum

Tue Nov 1 20:35:10 EST 2005

On 01/11/05, Lennart Augustsson <lennart at augustsson.net> wrote:
> Cale Gibbard wrote:
> > On 01/11/05, Sebastian Sylvan <sebastian.sylvan at gmail.com> wrote:
> >
> >>On 11/1/05, Scherrer, Chad <Chad.Scherrer at pnl.gov> wrote:
> >>
> >>>I was wondering... In my experience, it's worked much better to use
> >>>
> >>>sum' = foldl' (+) 0
> >>>
> >>>than the built-in "sum" function, which leaks memory like crazy for
> >>>large input lists. I'm guessing the built-in definition is
> >>>
> >>>sum = foldr (+) 0
> >>>
> >>>But as far as I know, (+) is always strict, so foldl' seems much more
> >>>natural to me. Is there a case where the build-in definition is
> >>>preferable?
> >>
> >>The library definition in ghc actually uses foldl.
> >>It's conceivable that you may not want (+) to be non-strict for
> >>certain data types.
> >>The question then becomes, is there a case where you want _sum_ to be
> >>non-strict?
> >>
> >
> >
> > I'd argue that the likely answer is no, given that (+) is an operation
> > in an Abelian group and not a monoid. Regardless of whether (+) is
> > strict, when you compute a sum you will always have to consume the
> > entire list.
>
> (+) is not an operation in an Abelian group.  (+) is a function with
> type a->a->a for some particular a. Haskell makes no mention about (+)
> having any other properties.
>
> Furthermore, ghc has a WRONG definition of sum.  You cannot change a
> foldr into a foldl, unless the operator is associative, and (+) does
> not have to be.
>
>         -- Lennart
>
> PS.  I think additional properties of class methods would be great,
> but since Haskell cannot enforce them we should not rely on them.
>

We already do rely on them in most cases. Of course, not every
property can be proved by the compiler, but many pieces of code are
going to assume quite a lot. For example, the relation defined by (<=)
in the Ord class is generally assumed to be at least a partial order,
and in most cases, a total ordering, but nothing about its type tells
you that. Functions will simply not work as intended when <= is not a
partial order (in some cases resulting in nontermination!), and there
are quite a few places where it's even assumed to be total. Similarly,
(==) is assumed to be an equivalence relation. The monoid operations
in Data.Monoid are assumed to define a monoid. The monad laws are not
enforced, but we rely on those too. In reality, classes come with
proof obligations -- it would be nice if the compiler could somehow
summarise these, but I think it's too much at this point to expect it
to find a proof in most practical cases.

I think that the assumption that (+) and (*) in Num define something
like a ring on the given type is a sensible one. It would be better to
call the operations you had something else if they didn't (at least
approximately) satisfy the ring axioms, as people are going to look at
your notation and assume things like distributivity, etc. hold at
least in approximation. (With types like Float, you're not going to
get half of them exactly, but they'll still usually come close to
holding, as you're emulating some fragment of the real numbers.)

 - Cale