[Haskell-beginners] Absolute beginner's questions...

Vlad Skvortsov vss at 73rus.com
Tue Aug 19 15:37:06 EDT 2008


Barry Burd wrote:
> I'm just getting started with Haskell. I've written the following simple program:
>  
> module Med
>     where
>     
> -- lst = [5, 3, 7, 8, 0, 2, 4, 1, 6]
> lst = [7, 3, 1, 2, 0, 9, 8]
> numLessThan x = length([y | y <- lst, y < x])
> numGreaterThan x = length([y | y <- lst, y > x])
>  
> median = [ x | x <- lst, numLessThan x == numGreaterThan x]
>  
> I realize that this code isn't very robust. My questions are as follows:
> - I remember running this code with some values of lst (containing an odd number of unique integer values) and getting the wrong answer. Is that possible? I can't replicate the results because I don't remember what values of lst didn't work.
> - How can I add a line or two to display intermediate results (in order to debug the code if necessary)? I've tried adding putTraceMsg but I keep getting parse errors. I probably haven't fully assimilated the syntax of functional programming yet.
>   

As for the second part of your question, it usually helps to develop 
your programs "bottom up"; also try to avoid using global data (lst in 
your case), pass that as a parameter instead.

For example, start with a function which returns values from a given 
list which are less than the given value (btw, type signatures are 
really helpful too):

lessThan :: Int -> [Int] -> [Int]
lessThan x lst = [y | y <- lst, y < x]

In fact, this function can be redefined as:

lessThan x lst = filter (< x) lst

(once you get more advanced, you'll also figure out it can be 
"eta-reduced" to just 'lessThan x = filter (< x)' )

After you've defined it, it's pretty easy to test the function in an 
interpreter to see if it works as expected. Due to the fact that no 
global data is used, testing becomes real easy:

*Main> let lessThan x lst = filter (< x) lst
*Main> lessThan 5 [1, 4, 6, 2, 7]
[1,4,2]
...

Along the same lines, define a 'greaterThan' function:

greaterThan :: Int -> [Int] -> [Int]
greaterThan x lst = filter (> x) lst


Then let's make another baby step and create functions that return a 
*number* of items in the list which are less or greater than the passed 
value.

numLessThan x lst = length (lessThan x lst)
numGreaterThan x lst = length (greaterThan x lst)

Note that for improved readability we can also write them as:

numLessThan x lst = length $ lessThan x lst
numGreaterThan x lst = length $ greaterThan x lst


Now, again, we can easily test these functions from the interpreter:

*Main> numLessThan 3 [1, 4, 2, 7, 6]
2

Now we can create a function that would return a pair of values; the 
first value in the pair denotes the number of items in the list less 
than given value, the second value is the number of items in the list 
greater than the given value:

numLessGreater :: Int -> [Int] -> (Int, Int)
numLessGreater x lst = (numLessThan x lst, numGreaterThan x lst)

Let's test it as well:

*Main> numLessGreater 3 [1, 4, 2, 7, 6]
(2,3)


The target condition is when the first and the second values are equal 
in the pair returned by 'numLessGreater'. Let's write a function to test 
this condition:

isMedian (x, y) = x == y


Now we can write a function to return *all* medians from the list.

medians :: [Int] -> [Int]
medians lst = [x | x <- lst, isMedian (numLessGreater x lst)]

As you can see by testing this function, it can return any number 
(including 0) of medians, depending on the input:

*Main> medians []
[]
*Main> medians [1, 2, 3]
[2]
*Main> medians [1, 2, 3, 2]
[2,2]


Now back to the main function, 'median'. We have to think about what 
happens if this function receives an empty list. There are a couple of 
options (it may make sense to return a value of type 'Maybe Int'); here 
I chose the function to crash if given invalid input.

All our function has to do is to call 'medians' and pick a single value 
out of the result returned. By using pattern matching we make sure the 
function will crash if given empty list as an input.

median :: [Int] -> Int
median lst =
  case medians lst of
    -- All the values in the result are the same, we just pick the first one
    x:xs -> x


And again, here is how we test it:
*Main> median [1, 2, 3, 2]
2
*Main> median [1, 2, 3]
2
*Main> median []
*** Exception: median.hs:(158,2)-(160,12): Non-exhaustive patterns in case


So this is the kind of approach one (well, me ;-)) usually takes. 
Summarizing:
* build from bottom up, start with simple functions;
* test as you go;
* make your functions pure (not depending on global data), it simplifies 
development and testing;
* strive for clarity first, optimize only after measuring;
* optimize only the bottleneck(s).

Note that for the sake of simplicity I used Int's everywhere, but in 
fact it should be any comparable value.

Hope this helps.

-- 
Vlad Skvortsov, vss at 73rus.com, http://vss.73rus.com



More information about the Beginners mailing list