Difference between revisions of "Simple to complex"

Latest revision as of 16:35, 6 February 2009

It is generally a good idea to construct complex functions from simpler ones rather than making simple functions special cases of complex functions. Obviously the latter strategy does not work alone, thus using it means mixing it with the former one. That leads no longer to a clear hierarchy of functions, but to an entangled graph of dependencies.

Functions

The lazy evaluation feature of Haskell tempts you to ignore the principle of building complex functions on simpler ones.

See "Eleven reasons to use Haskell as mathematician" where it is presented as an advantage that lazy evaluation automatically simplifies the computation of a cross product of two 3D vectors if only a single component of the result is requested. However, computing a single component of a cross product means computing the determinant of a $2\times 2$ -matrix, which is certainly useful of its own. So instead of using laziness for reducing the cross product to a determinant, the better concept is clearly to write a function for computing the $2\times 2$ -determinant and invoke this three times in the implementation of the cross product.

Types

Another bad example is the numerical linear algebra package MatLab. Its type hierarchy starts at complex valued matrices from which you can build more complex types. That is, there are no simpler types, no real valued matrices, complex numbers, real numbers, integers nor booleans. They have to be represented by complex valued matrices. This is not very natural since some operations like transcendent powers are not easily ported to matrices. That is many operations must check at run-time, whether the input values have appropriate properties, e.g. being $1\times 1$ -matrices. Actually, some kinds of integers and booleans (logical values) have been added later, but they interact weirdly with MatLab's base type.

The mistake, that the language designers of MatLab made, is the following: They thought MatLab would remain a special purpose language for numerical linear algebra forever, so they decided to implement only the most complex type that would ever be encountered in this domain. As MatLab grew, they extended the program to fit new needs, image import and export, GUI programming and so on, but the initial decision for a universal type didn't scale well.

It's not a good idea to mimic this in Haskell. Start with simple types and build complexer ones out of it. Make sure that you use fancy type constructs not in the core of a library, if at all. Move them as far as possible to leaf modules.

Type class methods

There are many methods in Data.List that can be generalised to other data structures, like map, filter, foldr, (++). Some people wonder why they aren't replaced by their generalized counterparts, namely the methods in the type classes.

First of all, there are didactic reasons. Higher order functions like foldr are hard to grasp for newbies. That's why they often stick to manual encoding of recursion. (See Avoid explicit recursion) Now imagine, that Data.List.foldl is no longer exposed, but only its generalization, namely the method Data.Foldable.foldl where not only the element types and the working function are generalized away, but there is even no longer a particular data structure you are working on. Nothing concrete any longer, only abstract things. The best thing to explain this abstract function, is to use an example. E.g. sum xs = List.foldl (+) 0 xs. Now, this requires to have a function List.foldl.

Close to didactic reasons there are reasons of code comprehensibility. If you know you are working on lists, map and filter tell the reader, that you are working on lists. The reader of a program doesn't need to start human type inference to deduce this. The function return may mean Just, (:[]) or Right depending on the monad. If you are working on data of type IO (Maybe a), the program reader must deduce whether return means the one of IO or the one of Maybe. If you only want to process a specific type, tell it the reader of your program.

Last but not least, there are the reasons of type safety and communication with the compiler. If the compiler knows, that you are programming for a specific type, it will check types stronger and will give more precise error messages.

Imagine you generalize Data.List.filter to

filter :: (MonadPlus m) => (a -> Bool) -> m a -> m a
filter p m = m >>= \a -> if p a then return a else mzero

This is a nice thing, but it should not replace Data.List.filter. The type inference of the compiler will fail, if you write too general code. Say, you are in GHCi, have the above definition of filter and you write

Prelude> filter Char.isUpper (return 'a')

To what monad this shall be specialised? Maybe? IO? List? Rely on type defaulting? You will certainly have to add a type annotation.

These problems are not only of interest for the standard libraries, but should be taken into account, when designing custom type classes. See the notes on Slim instance declarations.

@@ Line 35: / Line 35: @@
 They have to be represented by complex valued matrices.
 This is not very natural since some operations
-like transcendent powers are not easily ported to matrices.
+like [[Power function|transcendent powers]] are not easily ported to matrices.
 That is many operations must check at run-time,
 whether the input values have appropriate properties,
@@ Line 67: / Line 67: @@
 Higher order functions like <hask>foldr</hask> are hard to grasp for newbies.
 That's why they often stick to manual encoding of recursion.
-(See [[Things_to_avoid#Avoid_explicit_recursion]])
+(See [[Haskell programming tips#Avoid_explicit_recursion|Avoid explicit recursion]])
 Now imagine, that <hask>Data.List.foldl</hask> is no longer exposed,
 but only its generalization, namely the method <hask>Data.Foldable.foldl</hask>
@@ Line 82: / Line 82: @@
 It is sometimes not possible to generalize all the functionality that belongs together.
 Example?
-On the other hand: The generaliation can guide your decision, what belongs together,
+On the other hand: The generalisation can guide your decision, what belongs together,
 or vice versa, what you think, belongs together, should be generlized together.
 -->
 Close to didactic reasons there are reasons of code comprehensibility.
-If you know you are working on lists, 'map' and 'filter' tell the reader, that you are working on lists.
+If you know you are working on lists, <hask>map</hask> and <hask>filter</hask> tell the reader, that you are working on lists.
 The reader of a program doesn't need to start human type inference to deduce this.
 The function <hask>return</hask> may mean <hask>Just</hask>, <hask>(:[])</hask> or <hask>Right</hask> depending on the monad.
@@ Line 96: / Line 96: @@
 Last but not least, there are the reasons of type safety and communication with the compiler.
 If the compiler knows, that you are programming for a specific type,
-it will check types stronger and will give more precise error messages.
+it will check types stronger and will give more precise
+[http://www.haskell.org/pipermail/haskell-cafe/2007-September/031415.html error messages].
 Imagine you generalize <hask>Data.List.filter</hask> to
@@ Line 115: / Line 116: @@
 These problems are not only of interest for the standard libraries,
-but should be taken account, when designing custom type classes.
+but should be taken into account, when designing custom type classes.
 See the notes on [[Slim instance declaration]]s.
-See also
+== See also ==
 * Haskell-Cafe: [http://www.haskell.org/pipermail/haskell-cafe/2007-September/031410.html About mplus]

Difference between revisions of "Simple to complex"

Latest revision as of 16:35, 6 February 2009

Contents

Functions

Types

Type class methods

See also

Navigation menu

Search