[Haskell-cafe] [Alternative] summary of my understanding so far

Mon Dec 19 07:45:20 CET 2011

On Dec 19, 2011, at 3:49 PM, Richard O'Keefe wrote:

> On 19/12/2011, at 5:46 PM, Gregory Crosswhite wrote:
> [improved Monoid documentation]

Thank you.  :-)

> I would go so far as to point out that "mappend is a generalisation of
> Data.List.sum, Data.List.product, Data.List.and, and Data.List.or,
> where the initial value and combining rule are implied by the type.

Inspired by the idea behind your suggestion, I modified the documentation as follows:

========================================================

The Monoid typeclass provides a standard interface for specifying how pairs of values of a given type can be combined to form new values of that type, as well as an identity value for that type that when combined with any value x produces x.  The Monoid class typically appears in higher-order functions that produce a list of values that need to be summarized into a single result, such as in Data.Foldable.foldMap function or the Writer monad.

Formally, an instance of Monoid provides a binary associative operator with an identity element;  to do this one must specify (at a minimum) the methods mempty and mappend such that they obey following properties:

(*) mempty is the identity:
	mempty `mappend` x = x `mappend` mempty = x
(*) mappend is associative:
	x `mappend` (y `mappend` z) = (x `mappend` y) `mappend` z

Note that this structure is very generic; it includes addition with the identity element 0 (i.e. mappend = (+), mempty = 0), multiplication with the identity element 1 (i.e. mappend = (*), mempty = 1), list concatenation with the identity element [] (i.e. mappend = (++), mempty = []), logical and with the identity element True (i.e., mappend = (&&), mempty = True), logical or with the identity element False (i.e., mappend = (||), mempty = False), etc.  Unfortunately, sometimes this very generality results in there being multiple equally sensible ways to define a Monoid instance for a type.  For example, for numeric values addition and multiplication work equally well, and for boolean values logical and and logical or work equally well.  In such cases, it is a good idea to define newtype wrappers for these types so that we can make it explicit which operation we are using.  In the case of the Int type, for example, we define the Sum and Product newtypes and make these instances of Monoid using the corresponding mathematical operator; see also Any, All, First, and Last for other examples of this.

Although not strictly necessary, for reasons of performance the Monoid typeclass also includes the method mconcat which combines all the elements in a list, i.e. it is a method which obeys the property

(*) mconcat = foldr mappend mempty

The above is the default definition of mconcat if no other is supplied, but for some times users may wish to override it when it can be performed more efficiently.  Regardless, the minimal complete definition for an instance of the Monoid typeclass is mempty and mappend.

========================================================

>> This additional information unfortunately makes the documentation more verbose,
> 
> One man's "more verbose" is another man's "less cryptic".

Don't get me wrong, I completely agree with you that adding more words for the sake of making a subject less cryptic is a net win.  :-)  There are two dangers that lurk, however.  First, there needs to be lots of that makes it easy for people to skim through and pick out the specific information that they want to find out about, and in particular the information that is most important/most urgently needed needs to be placed first so that it is the first thing that  reader sees. Second, if you take too long to explain a point then you risk having your reader get fatigued so that all that effort you put in to make things clear just ends up getting going in one eye and out the other.  :-)

> I really don't like the emphasis on Num, as if it was a bizarre feature of
> Num that there's more than one Monoid reading for it.  This is a *common*
> property of data types.  For example, Sets can be seen as monoids with
> empty and union; and Sets with a universe can also be seen as monoids with
> universe and intersection.

In the revised version above, added Booleans as another example.

> The more I think about it, the less idea I have _what_ to expect for _any_
> instance of Monoid.

This is an inherent weakness of typeclasses, and why languages like Agda use record systems where instance declarations are records that you can either pass in explicitly or import explicitly to use implicitly within a particular scope.

I think, though, that for many types, though, there really is a sort of "most intuitive"/"most natural" Monoid operation.  For lists and sequences, for example, I think that the most intuitive operation is concatenation, rather than say taking the intersection of the elements of the two arguments.  Likewise when you are accumulating over a bunch of sets of values you are probably more likely to be wanting the union of all the values you have seen so far than the intersection.  Of course, such notions are ill-formed.  :-)

Cheers,
Greg
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20111219/5eb958d6/attachment.htm>