Names for small functions: just say no... Re: Data.List.join

Thu Nov 9 12:06:40 EST 2006

On 2006-11-02 at 13:01+0100 "Josef Svenningsson" wrote:
> On 10/30/06, Jon Fairbairn <jon.fairbairn at cl.cam.ac.uk> wrote:
> > On 2006-10-30 at 14:03+0100 "Josef Svenningsson" wrote:
> > is essentially the is/ought problem in philosophy. If your
> > argument had been that all these uses could have been
> > written more concisely using intercalate, I couldn't have
> > disputed it, but I would still have made the argument I did
> > about not giving names to small functions. Instead, you are
> > trying to argue that they ought to have been written using
> > intercalate because it's the right abstraction, and whether
> > or not that conclusion is true, the data neither supports
> > nor undermines it.
> >
> How do you find new abstractions? A simplified description might be:
> You look at previous work and see if there is a pattern emerging. When
> you spot a pattern you give that a name. And there you have it, a new
> abstraction.

But not all patterns deserve names. To give a
reductio ad absurdum sort of analogy, suppose someone looked
at a lot of C and came to the conclusion that “for (i=0; ...;
i++)” occurred in 90% of programmes.  Were they to
reason that it would be worth #defining a macro FORI(...)
and using that instead, I suspect that the suggestion would
be roundly dismissed.

> Do you disagree with this way of working? We learn from what we have
> done before and find new and better ways of doing the same thing. We
> spot new structures and abstractions and learn how to think about
> things at a higher level.
> I think it is a very sound principle to look at old code and find new
> and better ways of writing it. Indeed, if I wrote the code I was
> probably a lesser programmer at the time I wrote it. But that doesn't
> mean that I cannot learn from the code and improve it, does it?

Of course not, but you have to be careful what you deduce
from the old code, particularly if you are reasoning from
"most code is like this", because that doesn't tell you
whether most code should be like that or most code should
have been written differently. Another silly analogy: if
most pancakes are served with maple syrup, that doesn't mean
that we should always add maple syrup to the mix.  It
certainly suggests it as a worthwhile experiment, but since
I eat my pancakes with fish, I could hardly be expected to
approve.  That exact argument doesn't apply to the specific
case of intercalate, because here it's possible to get the
maple syrup back out of the pancake.

> Or do I misunderstand you again?

I've been trying to conduct several distinct arguments, some
general and some specific here, and I think it's confusing
enough that I'd better list them. The first is the general
one that adding a name for a small function is only
worthwhile if it really is a new abstraction, otherwise it
increases vocabulary without increasing expressiveness.

The second is the general one that one has to be very
careful about what one deduces from "most code is like
this", which applies here in the specific form that, OK,
you've found that most occurrences of intersperse are with
concat, but what can you deduce from that? Certainly, all
those programmes could be very slightly shorter had we put
intercalate in the standard prelude in the first place. But
it doesn't tell us whether we should have done that.

Then there's the specific question of whether intercalate is
actually a better abstraction than intersperse. “concat
. intersperse ", "” is hardly difficult to read, and I find
“intercalate [", "] . map (:[])” (for the reverse case)
slightly harder to read -- though I don't know whether
that's down to habituation, so we're just arguing intuitions
here.
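
To make the two spellings concrete, here is a minimal sketch of
each function written in terms of the other. The primed names are
mine, chosen only to avoid clashing with anything a library might
export; they are illustrations, not a proposal.

```haskell
import Data.List (intersperse)

-- The function under discussion, spelt with the existing vocabulary:
-- put the separator between the pieces, then flatten.
intercalate' :: [a] -> [[a]] -> [a]
intercalate' sep = concat . intersperse sep

-- The reverse case: intersperse recovered from intercalate'.
-- Each element is wrapped in a singleton list, the separator is
-- wrapped likewise, and the join flattens the wrapping away again.
intersperse' :: a -> [a] -> [a]
intersperse' x = intercalate' [x] . map (:[])

-- intercalate' ", " ["a", "b", "c"]  ==  "a, b, c"
-- intersperse' ", " ["a", "b", "c"]  ==  ["a", ", ", "b", ", ", "c"]
```

Either can be had from the other in a line, which is why the
choice between them comes down to which reads better rather than
to what can be expressed.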

Finally, there's another specific one, that whatever the
merits of intercalate, intersperse got there first, which
means that unless there's a really strong argument that
intercalate is the better abstraction, it's better to leave
things alone rather than increase the number of ways of
expressing the same concept.

> I agree that it is not optimal to have both intersperse and
> intercalate in the library. But I'd rather have that than not have
> intercalate since I find it so useful.

People always have the option of defining it for
themselves. I certainly think that in the end, if this
function does get defined lots of times (which, supposing
it's not put in a library, it might, now that we've discussed
it so much) then making sure that they all have the same
name is a Good Thing.  But if it's put in a library and a
significant proportion go on writing “concat . intersperse
"..."”, it won't be.  And once something has gone into a
library, it's awfully hard (read: impossible) to get it back
out again.

 Jón

-- 
Jón Fairbairn                              Jon.Fairbairn at cl.cam.ac.uk