Programming style question

Mark P Jones mpj@cse.ogi.edu
Thu, 10 Jan 2002 23:41:09 -0800


Hi Adrian,

| If I have defined a function like this..
| 	f <args> = <blah> <args>
| it could be re-written..
| 	f = <blah>
| 
| I had always assumed the internal representation of
| these 2 definitions would be identical (and should
| yield identical code), but it appears that isn't so
| (with ghc at least). So..
| 
|  Is there a semantic difference between the two? 
|  Which form is likely to result in faster code?

There are several differences:

- The first will not be subject to the monomorphism
  restriction; the second will require an explicit
  type signature for f to avoid that fate.

- The second will compute a value of <blah> at most
  once, then cache the result for future use.  That
  could make a program run faster, but if the result
  of <blah> takes a lot of space, then it could result
  in a space leak.  The first might end up repeating
  the computation of <blah> each time f is called.

  For example, compare the evaluation of:

    let f = (+) (sum [1..1000]) in (f 1, f 2)

  with:

    let f x = (+) (sum [1..1000]) x in (f 1, f 2)

  (Hint: run it in Hugs on a slow computer with :set +s
  to see the difference, or replace 1000 with a bigger
  number :-)

  Denotationally, the two expressions are the same.
  (In other words, they both produce the same value.)
  But the example above shows an operational difference
  in some implementation.  (As far as I can tell, however,
  nothing in the language definition either guarantees or
  prevents such behavior.)

- There could be other differences in generated code and
  operational behavior, even when <blah> is an expression
  that cannot be further evaluated without an argument.
  In Hugs, for example, the definition:

    f = \x y -> (x,y)

  will be translated into:

    f      = f'    -- note the extra indirection here!
    f' x y = (x,y)

  A compiler with more brains, of course, could do better,
  but the Haskell report doesn't specify which behavior
  you'll get in general.

Personally, I'd tend to let considerations other than
performance affect my choice.

For example, if I'd declared  f :: a -> String -> [(a, String)]
then I might use a definition like:

   f x s = [(x, s)]  -- two parameters in the type, so two
                     -- parameters in the definition

But if the type signature was  f :: a -> Parser a  and if I'd
defined:

   type Parser a = String -> [(a, String)]

then I'd write the definition of f in the form:

   f x  =  \s -> [(x, s)]   -- f is a function of one argument
                            -- that returns a parser as a result.

Just my 2 cents, however, ...

All the best,
Mark