Tue Aug 19 15:37:06 EDT 2008

```Barry Burd wrote:
> I'm just getting started with Haskell. I've written the following simple program:
>
> module Med
>     where
>
> -- lst = [5, 3, 7, 8, 0, 2, 4, 1, 6]
> lst = [7, 3, 1, 2, 0, 9, 8]
> numLessThan x = length([y | y <- lst, y < x])
> numGreaterThan x = length([y | y <- lst, y > x])
>
> median = [ x | x <- lst, numLessThan x == numGreaterThan x]
>
> I realize that this code isn't very robust. My questions are as follows:
> - I remember running this code with some values of lst (containing an odd number of unique integer values) and getting the wrong answer. Is that possible? I can't replicate the results because I don't remember what values of lst didn't work.
> - How can I add a line or two to display intermediate results (in order to debug the code if necessary)? I've tried adding putTraceMsg but I keep getting parse errors. I probably haven't fully assimilated the syntax of functional programming yet.
>

As for the second part of your question, it usually helps to develop
your programs "bottom up"; also try to avoid using global data (lst in

For example, start with a function which returns values from a given
list which are less than the given value (btw, type signatures are

lessThan :: Int -> [Int] -> [Int]
lessThan x lst = [y | y <- lst, y < x]

In fact, this function can be redefined as:

lessThan x lst = filter (< x) lst

(once you get more advanced, you'll also figure out it can be
"eta-reduced" to just 'lessThan x = filter (< x)' )

After you've defined it, it's pretty easy to test the function in an
interpreter to see if it works as expected. Due to the fact that no
global data is used, testing becomes real easy:

*Main> let lessThan x lst = filter (< x) lst
*Main> lessThan 5 [1, 4, 6, 2, 7]
[1,4,2]
...

Along the same lines, define a 'greaterThan' function:

greaterThan :: Int -> [Int] -> [Int]
greaterThan x lst = filter (> x) lst

Then let's make another baby step and create functions that return a
*number* of items in the list which are less or greater than the passed
value.

numLessThan x lst = length (lessThan x lst)
numGreaterThan x lst = length (greaterThan x lst)

Note that for improved readability we can also write them as:

numLessThan x lst = length \$ lessThan x lst
numGreaterThan x lst = length \$ greaterThan x lst

Now, again, we can easily test these functions from the interpreter:

*Main> numLessThan 3 [1, 4, 2, 7, 6]
2

Now we can create a function that would return a pair of values; the
first value in the pair denotes the number of items in the list less
than given value, the second value is the number of items in the list
greater than the given value:

numLessGreater :: Int -> [Int] -> (Int, Int)
numLessGreater x lst = (numLessThan x lst, numGreaterThan x lst)

Let's test it as well:

*Main> numLessGreater 3 [1, 4, 2, 7, 6]
(2,3)

The target condition is when the first and the second values are equal
in the pair returned by 'numLessGreater'. Let's write a function to test
this condition:

isMedian (x, y) = x == y

Now we can write a function to return *all* medians from the list.

medians :: [Int] -> [Int]
medians lst = [x | x <- lst, isMedian (numLessGreater x lst)]

As you can see by testing this function, it can return any number
(including 0) of medians, depending on the input:

*Main> medians []
[]
*Main> medians [1, 2, 3]
[2]
*Main> medians [1, 2, 3, 2]
[2,2]

Now back to the main function, 'median'. We have to think about what
happens if this function receives an empty list. There are a couple of
options (it may make sense to return a value of type 'Maybe Int'); here
I chose the function to crash if given invalid input.

All our function has to do is to call 'medians' and pick a single value
out of the result returned. By using pattern matching we make sure the
function will crash if given empty list as an input.

median :: [Int] -> Int
median lst =
case medians lst of
-- All the values in the result are the same, we just pick the first one
x:xs -> x

And again, here is how we test it:
*Main> median [1, 2, 3, 2]
2
*Main> median [1, 2, 3]
2
*Main> median []
*** Exception: median.hs:(158,2)-(160,12): Non-exhaustive patterns in case

So this is the kind of approach one (well, me ;-)) usually takes.
Summarizing:
* test as you go;
* make your functions pure (not depending on global data), it simplifies
development and testing;
* strive for clarity first, optimize only after measuring;
* optimize only the bottleneck(s).

Note that for the sake of simplicity I used Int's everywhere, but in
fact it should be any comparable value.

Hope this helps.

--
Vlad Skvortsov, vss at 73rus.com, http://vss.73rus.com

```