Haskell programming tips
From HaskellWiki
m (wording improvements) |
(→Partial functions like fromJust and head: link to Avoiding partial functions) |
||
| (25 intermediate revisions not shown.) | |||
| Line 1: | Line 1: | ||
==Preface== | ==Preface== | ||
| - | This page shows several examples of how code can be improved. We try to derive general rules from them, though they cannot be applied | + | This page shows several examples of how code can be improved. We try to derive general rules from them, though they cannot be applied deterministically and are a matter of taste. We all know this, please don't add "this is disputable" to each item! |
Instead, you can now add "this is disputable" on [[/Discussion]] and change this page only when some sort of consensus is reached. | Instead, you can now add "this is disputable" on [[/Discussion]] and change this page only when some sort of consensus is reached. | ||
| Line 9: | Line 9: | ||
===Don't reinvent the wheel=== | ===Don't reinvent the wheel=== | ||
| - | The standard libraries are full of useful | + | The standard libraries are full of useful, well-tuned functions. If you rewrite an existing library function, the reader of your code might spend a minute trying to figure out why you've done that. But if you use a standard function, the reader will either immediately understand what you've done, or can learn something new. |
| - | + | ||
| - | + | ||
===Avoid explicit recursion=== | ===Avoid explicit recursion=== | ||
| Line 36: | Line 34: | ||
and the reader knows that the complete list is processed and that each output element depends only on the corresponding input element. | and the reader knows that the complete list is processed and that each output element depends only on the corresponding input element. | ||
| - | If you don't find appropriate functions in the standard library, extract a general function. This helps you and others understand the program. Haskell | + | If you don't find appropriate functions in the standard library, extract a general function. |
| - | If you find | + | This helps you and others understand the program. |
| + | Thanks to [[higher order function]]s Haskell gives you '''very''' many opportunities to factor out parts of the code. | ||
| + | If you find the function very general, put it in a separate module and re-use it. It may appear in the standard libraries later, or you may later find that it is already there in an even more general way. | ||
Decomposing a problem this way also has the advantage that you can debug more easily. If the last implementation of <hask>raise</hask> does not show the expected behaviour, you can inspect <hask>map</hask> (I hope it is correct :-) ) and the invoked instance of <hask>(+)</hask> separately. | Decomposing a problem this way also has the advantage that you can debug more easily. If the last implementation of <hask>raise</hask> does not show the expected behaviour, you can inspect <hask>map</hask> (I hope it is correct :-) ) and the invoked instance of <hask>(+)</hask> separately. | ||
| - | '' | + | ''This is a special case of the general principle of [[separation of concerns|separating concerns]]. If you can write the loop over a data structure once and debug it, then there's no need to duplicate that code.'' |
| - | Another example: | + | Another example: the function <hask>count</hask> counts the number of elements which fulfill a certain property, i.e. the elements for which the predicate <hask>p</hask> is <hask>True</hask>. |
| - | + | ||
| - | which fulfill a certain property, | + | |
| - | i.e. the elements for which the predicate <hask>p</hask> is <hask>True</hask>. | + | |
I found the following code (but convoluted in a more specific function) in a Haskell program | I found the following code (but convoluted in a more specific function) in a Haskell program | ||
| Line 63: | Line 60: | ||
</haskell> | </haskell> | ||
. | . | ||
| - | |||
===Only introduce identifiers you need=== | ===Only introduce identifiers you need=== | ||
| Line 151: | Line 147: | ||
? It is no longer necessary to compute the length of <hask>l</hask> again and again. The code is easier to read and it covers all special cases, including <hask>tuples (-1) [1,2,3]</hask>! | ? It is no longer necessary to compute the length of <hask>l</hask> again and again. The code is easier to read and it covers all special cases, including <hask>tuples (-1) [1,2,3]</hask>! | ||
| - | ''Eliminating the <hask>length</hask> test can worsen performance dramatically in some cases, like <hask>tuples 24 [1..25]</hask>. We could also use <hask>null (drop (r-1) l)</hask> instead of <hask>length l < r</hask>, which works for infinite lists. See also [[ | + | ''Eliminating the <hask>length</hask> test can worsen performance dramatically in some cases, like <hask>tuples 24 [1..25]</hask>. We could also use <hask>null (drop (r-1) l)</hask> instead of <hask>length l < r</hask>, which works for infinite lists. See also [[#Don't ask for the length of a list, if you don't need it|below]].'' |
You can even save one direction of recursion | You can even save one direction of recursion | ||
| Line 212: | Line 208: | ||
which is both clearer and re-usable. | which is both clearer and re-usable. | ||
Actually, starting with GHC-6.6 you do not need to define <hask>comparing</hask>, since it is already in module <hask>Data.Ord</hask>. | Actually, starting with GHC-6.6 you do not need to define <hask>comparing</hask>, since it is already in module <hask>Data.Ord</hask>. | ||
| - | http:// | + | http://hackage.haskell.org/packages/archive/base/latest/doc/html/Data-Ord.html |
(Just a remark for this special example: | (Just a remark for this special example: | ||
| - | We can avoid multiple evaluations of the conversions. | + | We can avoid multiple evaluations of the conversions |
| + | with a function that is present in GHC.Exts of GHC 6.10: | ||
<haskell> | <haskell> | ||
| - | + | sortWith :: (Ord b) => (a -> b) -> [a] -> [a] | |
| - | + | sortWith f x = map snd (sortBy (comparing fst) (zip (map f x) x)) | |
</haskell> | </haskell> | ||
) | ) | ||
| Line 233: | Line 230: | ||
putStrLn $ shows lineNr $ showString ": " $ show line | putStrLn $ shows lineNr $ showString ": " $ show line | ||
</haskell> | </haskell> | ||
| - | |||
===<hask>Bool</hask> is a regular type=== | ===<hask>Bool</hask> is a regular type=== | ||
| Line 250: | Line 246: | ||
. | . | ||
| + | The definitions | ||
| + | <haskell> | ||
| + | hasSpace (a:as) | ||
| + | | isSpace a = True | ||
| + | | otherwise = hasSpace as | ||
| + | </haskell> | ||
| + | and | ||
| + | <haskell> | ||
| + | hasSpace (a:as) = if isSpace a then True else hasSpace as | ||
| + | </haskell> | ||
| + | can be shortened to | ||
| + | <haskell> | ||
| + | hasSpace (a:as) = isSpace a || hasSpace as | ||
| + | </haskell> | ||
| + | (I just wanted to show the logic transform. | ||
| + | In the particular example you would write <hask>any isSpace</hask>, of course.) | ||
| + | The same way | ||
| + | <haskell> | ||
| + | allPrintable (a:as) | ||
| + | | isSpace a = False | ||
| + | | otherwise = allPrintable as | ||
| + | </haskell> | ||
| + | and | ||
| + | <haskell> | ||
| + | allPrintable (a:as) = if isSpace a then False else allPrintable as | ||
| + | </haskell> | ||
| + | can be shortened to | ||
| + | <haskell> | ||
| + | allPrintable (a:as) = not (isSpace a) && allPrintable as | ||
| + | </haskell> | ||
| + | (but <hask>all (not . isSpace)</hask> is even better in this particular example.) | ||
| Line 384: | Line 411: | ||
which implements a factorial function. This example, like a lot of uses of guards, has a number of problems. | which implements a factorial function. This example, like a lot of uses of guards, has a number of problems. | ||
| - | The first problem is that it's nearly impossible for the compiler to check | + | The first problem is that it's nearly impossible for the compiler to check whether guards like this are exhaustive, as the guard conditions may be arbitrarily complex (GHC will warn you if you use the <code>-Wall</code> option). To avoid this problem and potential bugs through non exhaustive patterns you should use an <hask>otherwise</hask> guard, that will match for all remaining cases: |
<haskell> | <haskell> | ||
| Line 403: | Line 430: | ||
else n * fac (n-1) | else n * fac (n-1) | ||
</haskell> | </haskell> | ||
| - | Note that <hask>if</hask> has its own set of problems, for example in connection with the layout rule or that nested <hask>if</hask>s are difficult to read. See [ | + | Note that <hask>if</hask> has its own set of [[If-then-else|problems]], for example in connection with the layout rule or that nested <hask>if</hask>s are difficult to read. See [[Case]] how to avoid nested <hask>if</hask>s. |
But in this special case, the same can be done even more easily with pattern matching: | But in this special case, the same can be done even more easily with pattern matching: | ||
| Line 421: | Line 448: | ||
This may also be more efficient as <hask>product</hask> might be optimized by the library-writer... In GHC, when compiling with optimizations turned on, this version runs in O(1) stack-space, whereas the previous versions run in O(n) stack-space. | This may also be more efficient as <hask>product</hask> might be optimized by the library-writer... In GHC, when compiling with optimizations turned on, this version runs in O(1) stack-space, whereas the previous versions run in O(n) stack-space. | ||
| - | Note however, that there is a difference between this version and the previous ones: When given a negative number, the previous versions do not terminate (until StackOverflow-time), while the last | + | Note however, that there is a difference between this version and the previous ones: When given a negative number, the previous versions do not terminate (until StackOverflow-time), while the last implementation returns 1. |
| Line 435: | Line 462: | ||
or compare the following example using the advanced [[pattern guard]]s | or compare the following example using the advanced [[pattern guard]]s | ||
| - | |||
<haskell> | <haskell> | ||
parseCmd ln | parseCmd ln | ||
| Line 482: | Line 508: | ||
take _ _ = [] | take _ _ = [] | ||
</haskell> | </haskell> | ||
| - | However, they are often | + | However, they are often criticized for hiding computational complexity and producing ambiguities, see [[/Discussion]] for details. They are subsumed by the more general [[Views]] proposal, which has unfortunately never been implemented despite being around for quite some time now. |
| + | The <hask>n+k</hask> patterns are not in the [[Haskell 2010]] standard. | ||
==Efficiency and infinity== | ==Efficiency and infinity== | ||
| Line 581: | Line 608: | ||
| - | ===Choose the | + | ===Choose the appropriate fold=== |
| - | See [[Stack overflow]] for advice on which fold is appropriate for your situation. | + | See "[[Stack overflow]]" or "[[Foldr Foldl Foldl']]" for advice on which fold is appropriate for your situation. |
| Line 609: | Line 636: | ||
If you really need random access, as in the Fourier Transform, | If you really need random access, as in the Fourier Transform, | ||
| - | you should switch to [ | + | you should switch to [[Arrays]]. |
| Line 627: | Line 654: | ||
====Lists are not finite maps==== | ====Lists are not finite maps==== | ||
| - | Similarly, lists are not finite maps, as mentioned in [[efficiency hints]]. | + | Similarly, lists are not finite maps, as mentioned in [[Efficiency_hints#Data.Sequence_vs._lists | efficiency hints]]. |
| - | + | ||
===Reduce type class constraints=== | ===Reduce type class constraints=== | ||
| Line 691: | Line 717: | ||
| + | ===Avoid redundancy in data types=== | ||
| + | |||
| + | I often see data types with redundant fields, e.g. | ||
| + | <haskell> | ||
| + | data XML = | ||
| + | Element Position Name [Attribute] [XML] | ||
| + | | Comment Position String | ||
| + | | Text Position String | ||
| + | </haskell> | ||
| + | |||
| + | I suggest to factor out the common field <hask>Position</hask> | ||
| + | since this lets you handle the text position the same way for all XML parts. | ||
| + | <haskell> | ||
| + | data XML = XML Position Part | ||
| + | |||
| + | data Part = | ||
| + | Element Name [Attribute] [XML] | ||
| + | | Comment String | ||
| + | | Text String | ||
| + | </haskell> | ||
==Miscellaneous== | ==Miscellaneous== | ||
| Line 702: | Line 748: | ||
that is you don't have to specify an order of execution | that is you don't have to specify an order of execution | ||
and you don't have to worry about what computations are actually necessary. | and you don't have to worry about what computations are actually necessary. | ||
| - | + | Useful techniques are described in [[Avoiding IO]]. | |
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | in | + | |
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
| - | + | ||
===Forget about quot and rem=== | ===Forget about quot and rem=== | ||
| - | They complicate handling of negative dividends. | + | They complicate handling of negative dividends. <hask>div</hask> and <hask>mod</hask> are almost always the better choice. If <hask>b > 0</hask> then it always holds |
| - | <hask>div</hask> and <hask>mod</hask> are almost always the better choice. | + | |
| - | If <hask>b > 0</hask> then it always holds | + | |
<haskell> | <haskell> | ||
a == b * div a b + mod a b | a == b * div a b + mod a b | ||
| Line 758: | Line 759: | ||
</haskell> | </haskell> | ||
The first equation is true also for <hask>quot</hask> and <hask>rem</hask>, | The first equation is true also for <hask>quot</hask> and <hask>rem</hask>, | ||
| - | but the two others are true only for <hask>mod</hask>, but not for <hask>rem</hask>. | + | but the two others are true only for <hask>mod</hask>, but not for <hask>rem</hask>. That is, <hask>mod a b</hask> always wraps <hask>a</hask> to an element from <hask>[0..(b-1)]</hask>, whereas the sign of <hask>rem a b</hask> depends on the sign of <hask>a</hask>. |
| - | That is, <hask>mod a b</hask> always wraps <hask>a</hask> to an element from <hask>[0..(b-1)]</hask>, | + | |
| - | whereas the sign of <hask>rem a b</hask> depends on the sign of <hask>a</hask>. | + | |
| - | This seems to be more an issue of experience rather than one of a superior reason. | + | This seems to be more an issue of experience rather than one of a superior reason. You might argue, that the sign of the dividend is more important for you, than that of the divisor. However, I have never seen such an application, but many uses of <hask>quot</hask> and <hask>rem</hask> where <hask>div</hask> and <hask>mod</hask> were clearly superior. |
| - | You might argue, that the sign of the dividend is more important for you, than that of the divisor. | + | |
| - | However, I have never seen such an application, | + | |
| - | but many uses of <hask>quot</hask> and <hask>rem</hask> where <hask>div</hask> and <hask>mod</hask> were clearly superior. | + | |
Examples: | Examples: | ||
| Line 780: | Line 776: | ||
* Daan Leijen: [http://www.cs.uu.nl/~daan/download/papers/divmodnote-letter.pdf Division and Modulus for Computer Scientists] | * Daan Leijen: [http://www.cs.uu.nl/~daan/download/papers/divmodnote-letter.pdf Division and Modulus for Computer Scientists] | ||
* Haskell-Cafe: [http://www.haskell.org/pipermail/haskell-cafe/2007-August/030394.html default for quotRem in terms of divMod?] | * Haskell-Cafe: [http://www.haskell.org/pipermail/haskell-cafe/2007-August/030394.html default for quotRem in terms of divMod?] | ||
| - | |||
===Partial functions like <hask>fromJust</hask> and <hask>head</hask>=== | ===Partial functions like <hask>fromJust</hask> and <hask>head</hask>=== | ||
| Line 812: | Line 807: | ||
There is also a function which returns an existing first list element in terms of <hask>Maybe</hask>: | There is also a function which returns an existing first list element in terms of <hask>Maybe</hask>: | ||
<hask>maybeToList</hask> | <hask>maybeToList</hask> | ||
| - | (See [http://www.haskell.org/pipermail/haskell-cafe/2007-March/023391.html remark].) | + | (See [http://www.haskell.org/pipermail/haskell-cafe/2007-March/023391.html remark]. |
| + | See also "[[Avoiding partial functions]]".) | ||
== Related Links == | == Related Links == | ||
Current revision
Contents |
1 Preface
This page shows several examples of how code can be improved. We try to derive general rules from them, though they cannot be applied deterministically and are a matter of taste. We all know this, please don't add "this is disputable" to each item!
Instead, you can now add "this is disputable" on /Discussion and change this page only when some sort of consensus is reached.
2 Be concise
2.1 Don't reinvent the wheel
The standard libraries are full of useful, well-tuned functions. If you rewrite an existing library function, the reader of your code might spend a minute trying to figure out why you've done that. But if you use a standard function, the reader will either immediately understand what you've done, or can learn something new.
2.2 Avoid explicit recursion
Explicit recursion is not generally bad, but you should spend some time trying to find a more declarative implementation using higher order functions.
Don't define
raise :: Num a => a -> [a] -> [a] raise _ [] = [] raise x (y:ys) = x+y : raise x ys
because it is hard for the reader to find out how much of the list is processed and on which values the elements of the output list depend. Just write
raise x ys = map (x+) ys
or even
raise x = map (x+)
and the reader knows that the complete list is processed and that each output element depends only on the corresponding input element.
If you don't find appropriate functions in the standard library, extract a general function. This helps you and others understand the program. Thanks to higher order functions Haskell gives you very many opportunities to factor out parts of the code. If you find the function very general, put it in a separate module and re-use it. It may appear in the standard libraries later, or you may later find that it is already there in an even more general way.
Decomposing a problem this way also has the advantage that you can debug more easily. If the last implementation of
This is a special case of the general principle of separating concerns. If you can write the loop over a data structure once and debug it, then there's no need to duplicate that code.
I found the following code (but convoluted in a more specific function) in a Haskell program
count :: (a -> Bool) -> [a] -> Int count _ [] = 0 count p (x:xs) | p x = 1 + count p xs | otherwise = count p xs
which you won't like after you become aware of
count p = length . filter p
.
2.3 Only introduce identifiers you need
Here is some advice that is useful for every language, including scientific prose
(http://www.cs.utexas.edu/users/EWD/transcriptions/EWD09xx/EWD993.html):
Introduce only identifiers you use.
The compiler will check this for you if you pass an option like -Wall to GHC.
In an expression like
[a | i <- [1..m]]
replicate m a
is certainly better here.
2.4 Remember the zero
Don't forget that zero is a natural number. Recursive definitions become more complicated if the recursion anchor is not chosen properly. For example the functiontuples :: Int -> [a] -> [[a]] tuples r l | r == 1 = [[el] | el <- l] | length l == r = [l] | otherwise = (map ([head l] ++) (tuples (r-1) (tail l))) ++ tuples r (tail l)
Do you have an idea what it does?
Let's strip the guards and forget about list comprehension.
tuples :: Int -> [a] -> [[a]] tuples 1 l = map (:[]) l tuples r l = if r == length l then [l] else let t = tail l in map (head l :) (tuples (r-1) t) ++ tuples r t
What about tuples with zero elements? We can add the pattern
tuples 0 _ = [[]]
but then we can also omit the pattern for 1-tuples.
tuples :: Int -> [a] -> [[a]] tuples 0 _ = [[]] tuples r l = if r == length l then [l] else let t = tail l in map (head l :) (tuples (r-1) t) ++ tuples r t
tuples :: Int -> [a] -> [[a]] tuples 0 _ = [[]] tuples r l = if r > length l then [] else let t = tail l in map (head l :) (tuples (r-1) t) ++ tuples r t
tuples :: Int -> [a] -> [[a]] tuples 0 _ = [[]] tuples _ [] = [] tuples r (x:xs) = map (x :) (tuples (r-1) xs) ++ tuples r xs
You can even save one direction of recursion
by explicit computation of the list of all suffixes provided byYou can do this with do notation
tuples :: Int -> [a] -> [[a]] tuples 0 _ = [[]] tuples r xs = do y:ys <- tails xs map (y:) (tuples (r-1) ys)
tuples :: Int -> [a] -> [[a]] tuples 0 _ = [[]] tuples r xs = concatMap (\(y:ys) -> map (y:) (tuples (r-1) ys)) (init (tails xs))
but this ends with a "Prelude.tail: empty list".
More generally, Base cases and identities
2.5 Don't overuse lambdas
Like explicit recursion, using explicit lambdas isn't a universally bad idea, but a better solution often exists. For example, Haskell is quite good at currying. Don't write
zipWith (\x y -> f x y) map (\x -> x + 42)
instead, write
zipWith f map (+42)
also, instead of writing
-- sort a list of strings case insensitively sortBy (\x y -> compare (map toLower x) (map toLower y))
write
comparing p x y = compare (p x) (p y) sortBy (comparing (map toLower))
which is both clearer and re-usable.
Actually, starting with GHC-6.6 you do not need to definehttp://hackage.haskell.org/packages/archive/base/latest/doc/html/Data-Ord.html
(Just a remark for this special example: We can avoid multiple evaluations of the conversions with a function that is present in GHC.Exts of GHC 6.10:
sortWith :: (Ord b) => (a -> b) -> [a] -> [a] sortWith f x = map snd (sortBy (comparing fst) (zip (map f x) x))
)
As a rule of thumb, once your expression becomes too long to easily be point-freed, it probably deserves a name anyway. Lambdas are occasionally appropriate however, e.g. for control structures in monadic code (in this example, a control-structure "foreach2" which most languages don't even support.):
foreach2 xs ys f = zipWithM_ f xs ys linify :: [String] -> IO () linify lines = foreach2 [1..] lines $ \lineNr line -> do unless (null line) $ putStrLn $ shows lineNr $ showString ": " $ show line
2.6 Bool is a regular type
Logic expressions are not restricted to guards and Avoid verbosity like in
isEven n | mod n 2 == 0 = True | otherwise = False
since it is the same as
isEven n = mod n 2 == 0
.
The definitions
hasSpace (a:as) | isSpace a = True | otherwise = hasSpace as
and
hasSpace (a:as) = if isSpace a then True else hasSpace as
can be shortened to
hasSpace (a:as) = isSpace a || hasSpace as
(I just wanted to show the logic transform.
In the particular example you would writeThe same way
allPrintable (a:as) | isSpace a = False | otherwise = allPrintable as
and
allPrintable (a:as) = if isSpace a then False else allPrintable as
can be shortened to
allPrintable (a:as) = not (isSpace a) && allPrintable as
3 Use syntactic sugar wisely
People who employ syntactic sugar extensively argue that it makes their code more readable. The following sections show several examples where less syntactic sugar is more readable.
It is argued that a special notation is often more intuitive than a purely functional expression. But the term "intuitive notation" is always a matter of habit. You can also develop an intuition for analytic expressions that don't match your habits at the first glance. So why not making a habit of less sugar sometimes?
3.1 List comprehension
List comprehension lets you remain in imperative thinking, that is it lets you think in variables rather than transformations. Open your mind, discover the flavour of the pointfree style!
Instead of
[toUpper c | c <- s]
write
map toUpper s.
Consider
[toUpper c | s <- strings, c <- s]
where it takes some time for the reader to discover which value depends on what other value and it is not so clear how many times
the interim valuesIn contrast to that
map toUpper (concat strings)
can't be clearer.
Compare
map (1+) list
and
mapSet (1+) set
.
If there were a standard instance for theyou could use the code
fmap (1+) pool
for both choices.
If you are not used to higher order functions for list processing you may feel you need parallel list comprehension. This is unfortunately supported by GHC now,
but it is arguably superfluous since various flavours of
3.2 do notation
do notation is useful to express the imperative nature (e.g. a hidden state or an order of execution) of a piece of code.
Nevertheless it's sometimes useful to remember that theInstead of
do text <- readFile "foo" writeFile "bar" text
one can write
readFile "foo" >>= writeFile "bar"
.
The code
do text <- readFile "foo" return text
can be simplified to
readFile "foo"
by a law that each Monad must fulfill.
You certainly also agree that
do text <- readFile "foobar" return (lines text)
is more complicated than
liftM lines (readFile "foobar")
.
By the way, the3.3 Guards
Disclaimer: This section is NOT advising you to avoid guards. It is advising you to prefer pattern matching to guards when both are appropriate. -- AndrewBromage
Guards look like
-- Bad implementation: fac :: Integer -> Integer fac n | n == 0 = 1 | n /= 0 = n * fac (n-1)
which implements a factorial function. This example, like a lot of uses of guards, has a number of problems.
The first problem is that it's nearly impossible for the compiler to check whether guards like this are exhaustive, as the guard conditions may be arbitrarily complex (GHC will warn you if you use the-Wall option). To avoid this problem and potential bugs through non exhaustive patterns you should use an -- Slightly improved implementation: fac :: Integer -> Integer fac n | n == 0 = 1 | otherwise = n * fac (n-1)
-- Less sugar (though the verbosity of if-then-else can also be considered as sugar :-) fac :: Integer -> Integer fac n = if n == 0 then 1 else n * fac (n-1)
But in this special case, the same can be done even more easily with pattern matching:
-- Good implementation: fac :: Integer -> Integer fac 0 = 1 fac n = n * fac (n-1)
Actually, in this case there is an even more easier to read version, which (see above) doesn't use Explicit Recursion:
-- Excellent implementation: fac :: Integer -> Integer fac n = product [1..n]
Note however, that there is a difference between this version and the previous ones: When given a negative number, the previous versions do not terminate (until StackOverflow-time), while the last implementation returns 1.
Guards don't always make code clearer.
Compare
foo xs | not (null xs) = bar (head xs)
and
foo (x:_) = bar x
or compare the following example using the advanced pattern guards
parseCmd ln | Left err <- parse cmd "Commands" ln = BadCmd $ unwords $ lines $ show err | Right x <- parse cmd "Commands" ln = x
with this one with no pattern guards:
parseCmd ln = case parse cmd "Commands" ln of Left err -> BadCmd $ unwords $ lines $ show err Right x -> x
parseCmd :: -- add an explicit type signature, as this is now a pattern binding parseCmd = either (BadCmd . unwords . lines . show) id . parse cmd "Commands"
data Foo = Foo deriving (Eq, Show) instance Num Foo where fromInteger = error "forget it" f :: Foo -> Bool f 42 = True f _ = False
*Main> f 42 *** Exception: forget it
Only use guards when you need to. In general, you should stick to pattern matching whenever possible.
3.4 n+k patterns
In order to allow pattern matching against numerical types, Haskell 98 provides so-called n+k patterns, as in
take :: Int -> [a] -> [a] take (n+1) (x:xs) = x: take n xs take _ _ = []
However, they are often criticized for hiding computational complexity and producing ambiguities, see /Discussion for details. They are subsumed by the more general Views proposal, which has unfortunately never been implemented despite being around for quite some time now.
The4 Efficiency and infinity
A rule of thumb is: If a function makes sense for an infinite data structure but the implementation at hand fails for an infinite amount of data, then the implementation is probably also inefficient for finite data.
4.1 Don't ask for the length of a list when you don't need it
Don't write
length x == 0
In contrast
x == []
The best thing to do is
null xlengthatLeast :: Int -> [a] -> Bool atLeast 0 _ = True atLeast _ [] = False atLeast n (_:ys) = atLeast (n-1) ys
atLeast :: Int -> [a] -> Bool atLeast n x = n == length (take n x)
or non-recursive but fairly efficient
atLeast :: Int -> [a] -> Bool atLeast n = if n>0 then not . null . drop (n-1) else const True
or
atLeast :: Int -> [a] -> Bool atLeast 0 = const True atLeast n = not . null . drop (n-1)
The same problem arises if you want to shorten a list to the length of another one by
take (length x) y
So, instead
zipWith const y x
works well.
It should be noted thatwhich allow the usage of Peano numbers.
4.2 Don't ask for the minimum when you don't need it
The functionisLowerLimit :: Ord a => a -> [a] -> Bool isLowerLimit x ys = x <= minimum ys
Compare it with
isLowerLimit x = all (x<=)
4.3 Use sharing
If you want a list of lists with increasing length and constant content, don't write
map (flip replicate x) [0..]
because this needs quadratic space and run-time. If you code
iterate (x:) []
then the lists will share their suffixes and thus need only linear space and run-time for creation.
4.4 Choose the appropriate fold
See "Stack overflow" or "Foldr Foldl Foldl'" for advice on which fold is appropriate for your situation.
5 Choose types properly
5.1 Lists are not good for everything
5.1.1 Lists are not arrays
Lists are not arrays, so don't treat them as such.
Frequent use ofThis is very inefficient.
If you access the elements progressively, as in
[x !! i - i | i <- [0..n]]
you should try to get rid of indexing, as in
zipWith (-) x [0..n]
.
If you really need random access, as in the Fourier Transform, you should switch to Arrays.
5.1.2 Lists are not sets
If you manage data sets where each object can occur only once and the order is irrelevant, if you use list functions like
frequently, you should think about switching to sets. If you need multi-sets, i.e. data sets with irrelevant order but multiple occurrences of objects,
you can use a
5.1.3 Lists are not finite maps
Similarly, lists are not finite maps, as mentioned in efficiency hints.
5.2 Reduce type class constraints
5.2.1 Eq type class
When using functions likeExample:
The following function takes the input listClear what it does? No? The code is probably more understandable
removeEach :: (Eq a) => [a] -> [[a]] removeEach xs = map (flip List.delete xs) xs
but it should be replaced by
removeEach :: [a] -> [[a]] removeEach xs = zipWith (++) (List.inits xs) (tail (List.tails xs))
5.3 Don't use Int when you don't consider integers
Before using integers for each and everything (C style) think of more specialised types.
If only the valuesIf there are more but predefined choices and numeric operations aren't needed try an enumeration.
Instead of
type Weekday = Int
write
data Weekday = Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday deriving (Eq, Ord, Enum)
You cannot accidentally mix up weekdays with numbers and the signature of a function with weekday parameter clearly states what kind of data is expected.
If an enumeration is not appropriate
you can define aE.g. if you want to associate objects with a unique identifier,
you may want to choose the typenewtype Identifier = Identifier Int deriving Eq
5.4 Avoid redundancy in data types
I often see data types with redundant fields, e.g.
data XML = Element Position Name [Attribute] [XML] | Comment Position String | Text Position String
since this lets you handle the text position the same way for all XML parts.
data XML = XML Position Part data Part = Element Name [Attribute] [XML] | Comment String | Text String
6 Miscellaneous
6.1 Separate IO and data processing
It's not good to use the IO Monad everywhere, much of the data processing can be done without IO interaction. You should separate data processing and IO because pure data processing can be done purely functionally, that is you don't have to specify an order of execution and you don't have to worry about what computations are actually necessary. Useful techniques are described in Avoiding IO.
6.2 Forget about quot and rem
They complicate handling of negative dividends.a == b * div a b + mod a b mod a b < b mod a b >= 0
Examples:
- Conversion from a continuously counted tone pitch to the pitch class, like C, D, E etc.: mod p 12
- Pad a list to a multiple ofxsnumber of elements:mxs ++ replicate (mod (- length xs) m) pad
- Conversion from a day counter to a week day: mod n 7
- Pacman runs out of the screen and re-appears at the opposite border: mod x screenWidth
See
- Daan Leijen: Division and Modulus for Computer Scientists
- Haskell-Cafe: default for quotRem in terms of divMod?
6.3 Partial functions like fromJust and head
Avoid functions that fail for certain input values like They raise errors that can only be detected at runtime. Think about how they can be avoided by different program organization or by choosing more specific types.
Instead of
if i == Nothing then deflt else fromJust i
write
fromMaybe deflt i
See also #Reduce type class constraints.
If it is not possible to avoidfromMaybe (error "Function bla: The list does always contains the searched value") (lookup key dict)
(See remark. See also "Avoiding partial functions".)
