Personal tools

Haskell Tutorial for C Programmers

From HaskellWiki

Revision as of 19:42, 28 August 2011 by Etheridge (Talk | contribs)

Jump to: navigation, search

Haskell Tutorial for C Programmers, by Eric Etheridge

Under Construction

version 3.0 - please increment with major updates
original author: Eric Etheridge
last major update by the original author: August 28, 2011
year of original release (on old haskell.org site): 2005

Contents


1 Introduction

1.1 Abstract

Many people are accustomed to imperative languagues, which include C, C++, Java, Python, and Pascal. For computer science students, Haskell is weird and obtuse. This tutorial assumes that the reader is familiar with C/C++, Python, Java, or Pascal. I am writing for you because it seems that no other tutorial was written to help students overcome the difficulty of moving from C/C++, Java, and the like to Haskell.

I write this assuming that you have checked out the Gentle Introduction to Haskell, but still don't understand what's going on.

Haskell is not 'a little different,' and will not 'take a little time.' It is very different and you cannot simply pick it up, although I hope that this tutorial will help.

I am going to put many pauses in this tutorial because learning Haskell hurt a lot, at least for me. I needed breaks, and my brain hurt while I was trying to understand.

Haskell has both more flexibility and more control than most languages. Nothing that I know of beats C's control, but Haskell has everything C does unless you need to control specific bytes in memory. So I call Haskell powerful, rather than just 'good.'

I wrote this tutorial because Haskell was very hard for me to learn, but now I love it. "Haskell is hard!" "You can't write code the way I know how!" "My brain hurts!" "There aren't any good references!" That's what I said when I was in college. There were good references, but they didn't cover the real problem: coders know C.

This abstract was pieced together by Mark Evans, here, from my own work. I have had no contact with Mark Evans, but since he did't contact me when he editted together this abstract from my work and posted it on lambda-the-ultimate, I doubt he'll care that I've taken that edit and used it as my abstract here. If he wishes, he may contact me regarding the legal status of this work. For now, I assume I still hold the copyright on all of it, including the abstract (released in the license for this site).


1.2 Downloads

In the former form of this tutorial, I had a zipped version of the html files available at this point. However, this wiki provides a "printable version" button and the tutorial is now be one long page (which may be a poor design choice, we'll see). In addition, a bundled version might quickly become out of date from minor corrections by site patrons. Therefore, it doesn't seem necessary at the moment to include a zipped bundle of the tutorial text.

Here are the source files and text for all examples for this tutorial, including all those in the sections and the large examples at the end, zipped using bzip2: bzip2 of sources, 28K, and zipped as a zip: zip of sources, 43K.

Sources for the in-text examples in the coming sections are given in the following files:

  • ExampleSectionsCode.hs
  • ExampleSectionsTry.txt

If you don't have bzip2, you can get the latest version at www.bzip.org.


1.3 License

The original form of the tutorial was available under a Creative Commons License, specifically Attribution-Share Alike 3.0. That license still applies to any form downloaded or read while that version was available. This wiki requires submitted work to be subject to a specific license, so this work is now available under this simple permissive license. I would still prefer some attribution if the work was used wholesale.


1.4 This Tutorial's Purpose and Other Online References

Many people are accustomed to imperative languagues, which include C, C++, Java, Python, and Pascal. In fact, most languages in common usage are imperative, other than LISP, Scheme, ML, and OCaml. For computer science students in high school or early college, it is virtually guaranteed that Haskell is weird and obtuse. I first encountered Haskell in the classroom when I was a freshman at UT Austin, and then in another class at UT two years later. I was only familiar with C/C++, Pascal, and QBASIC, and all of the Haskell tutorials and books seemed to assume more of my education. This tutorial assumes that the reader is familiar with C/C++, Python, Java, or Pascal. This tutorial is specifically for students of computer science, people in their first few years of college, or even in high school. I am writing for you because it seems that no other tutorial was written to help students overcome the difficulty of moving from C/C++, Java, and the like to Haskell.

1.4.1 GHC and Hugs

To learn and use Haskell, you should install GHC, and perhaps Hugs. GHC is the "de facto standard" compiler for Haskell, and almost all projects in Haskell use it. The Hugs interpreter is a simpler tool that will let you play around and learn. Start with Hugs if you are having trouble using GHC. GHC also ships with GHCi, "GHC interactive", which is a command line interpreter much like Hugs, but no GUI. Getting these programs is easy. If you use Debian, GHC and Hugs are packages. For everyone else, the homepages are here:

http://www.haskell.org/ghc/

http://www.haskell.org/hugs/

1.4.2 The Gentle Introduction to Haskell

I write this assuming that you have checked out the following tutorial, the Gentle Introduction to Haskell, but found that you still don't understand what's going on:

http://www.haskell.org/tutorial/

1.4.3 Tour of the Haskell Syntax

The Gentle Introduction to Haskell is a good reference for basic syntax. In this tutorial we will skip most syntax details until later. First we will cover defining functions in Haskell and why it is central to the language. For more syntax details, here is another tutorial, the Tour of the Haskell Syntax, which has much more specific information:

http://www.cs.utep.edu/cheon/cs3360/pages/haskell-syntax.html

You should look through the Tour, since it describes the appropriate syntax for most of the things I discuss. The Tour is useful because you can understand it without knowing everything about Haskell. Reading these can help you before, after, or during this tutorial.

1.4.4 The Prelude File

One of the best references is the source code for the Prelude, which is the file "Prelude.hs". This file holds the code for all of the general-purpose functions in the Prelude module. If any function shows up that you don't understand, you can look up its definition in source code and figure out what it's really doing. This is a very good practice for those unfamiliar with general Haskell use.

There are (at least) three ways to get a copy of the Prelude.hs file. If you download and install Hugs, Prelude.hs will be in the libraries directory. I do not think that GHC ships with the uncompiled library sources. You can download a source version of GHC from its download page. You may be able to find a gzipped copy in the ghc "libsrc" Debian package.

1.4.5 GHC Heirarchical Libraries

Another important resource is the GHC Hierarchical Libraries documentation. The data types and functions of every module are defined here, including the Prelude. Whenever you use a library function, you'll want to refer to these to find the module and specific usage. All the standard modules are well documented. If your GHC installation includes the docs, these webpages are also on your local machine.

http://www.haskell.org/ghc/docs/latest/html/libraries/index.html

1.5 Preface and Style Notes

I am not writing a Haskell reference. This is a tutorial designed to take someone having trouble understanding Haskell and help them. This tutorial is for people who, like me, needed to learn enough concepts to understand the code covered in a classroom. Haskell allows things to be done easily and clearly, but it is not easy or clear, and it can be extremely challenging for a novice. You cannot pick up Haskell code and understand it. What I have attempted to write is a tutorial that covers the common aspects of Haskell that are the most obtuse.

As the tutorial progresses, one thing should become clear about Haskell: its real power comes into play when you attack difficult problems. Because of this, I use some difficult problems in this tutorial. Don't worry if you don't understand the solutions after reading the tutorial once. Haskell is not a toy language, and even a moderately sized set of functions will include several of Haskell's complicated tools all working together. This has left educators with a dilemma: do I use ridiculously simple code in order to cover a single topic at once, or do I use something actually useful and try to explain all the pieces and how they all fit together? Many tutorials and lessons have chosen the former, but I prefer the latter. That means that each example requires a lot of explaining. Often concepts must be explained once in extremely simplistic terms, and then explained again later after other related topics have also been briefly discussed. As you read this tutorial, remember this: Haskell's real power is the fact that all of its pieces fit so well together, not just that they are good pieces.

The syntax and variable name conventions I use in this tutorial are those used in the Haskell source code and libraries, and what I learned in college. Haskell programs tend to be short, but wide. I recommend using descriptive variable names, even for indices and so forth.


2 Section I: What the Heck is Going On?

2.1 Part I: Haskell's Oddity

To begin, Haskell has no update operator. If that sentence did not make sense, then please keep reading, because this tutorial was written with you in mind. By 'update operator', I mean that the following does not happen in normal Haskell:

int a
a := 4
print a
a := 5
print a

> 4
> 5

The above programming style, i.e. 'making a variable, putting data in it, using it, then replacing the data in it, using it again' does not happen in normal Haskell. Those of you who have used LISP or Scheme will be familiar with this concept, but I am sure that the rest of you are probably baffled. Here is how Haskell works, again in pseudo-code:

print a

int a
a = 5

> 5

or

int a
a = 5

print a

> 5

The order of these actions does not matter. There is also a reason that the first example used ':=' and the second example used '='. In 'imperative' languages, storing data is an operation, and it happens in a sequence. In 'functional' languages like Haskell, the equal sign means an exact definition. In other words, each variable is equal to its value not only after the assignment statement is reached in sequence, but in fact at all points during execution.

Some of you may be saying, "That's nice, Eric, but what good is a language where everything is hardcoded? Wouldn't I have to define every variable with its correct value as I coded? Isn't 'computing' values the whole point of a 'computer'?" And you would be right; knowing results ahead of time would make computing weird. The 'redeeming' feature of Haskell is that you don't need to store data to return a result.

I am going to put many pauses in this tutorial because learning Haskell hurt a lot, at least for me. I needed breaks, and my brain hurt while I was trying to understand. Let's look at that statement again: you don't need to store data to return a result. I'll illustrate. Here is an example of a function in C:

int foo (int bar) {
	int result;
	result = bar * 10 + 4;
	return result;
}

The important part is the expression in the middle. It can also be written as follows:

int foo (int bar) {
	return bar * 10 + 4;
}

These are the same, but the second is shorter and clearer. With a function like this, you could state the following: "The value of foo(x) is equal to (x * 10 + 4)." Or, more simply, "foo(x) = x * 10 + 4". I know you're saying, "Most functions aren't that simple." That is true, but bear with me. Haskell has much more powerful tools for writing functions than most other languages, and a lot of complicated operations look this simple in Haskell. The key to using those tools will be changing the way you think from 'make data, then alter it' to 'define a function that would return the result, then apply to inputs'.


2.2 Part II: Input and Output

We'll come back to the frame of mind later. There is such a difference between Haskell and C/C++ that many concepts only make sense in conjunction with others. I need to cover the basics of several concepts before I can explain each of them fully.

Moving on, the question on everybody's mind is probably either, "How does I/O work?" or "What are these tools?" IO is one of the complicated parts of Haskell, and later I'll describe how it works. I will also describe a simple frame for programming with GHC. Until then, use GHCi or Hugs to try the examples. They have an interactive prompt where you type in expressions, like a function plus its parameters. Then they print the evaluated result. Also, variable bindings such as 'a = 4' hang around while you're using Hugs and GHCi, so my examples should work just fine. To write your own functions, you need to write a Haskell source file and load it first. Using GHC itself requires knowing some of Haskell's more complicated tools, so we'll put off learning those until we need GHC.

To use Hugs and GHCi with your own functions, you have to write a Haskell source file and load it into the interpreter. Generally, this works as follows:

  1. Open a text editor and write Haskell code.
  2. Save that code as a file with the extension '.hs', for instance, 'test.hs'.
  3. With the source file in the current directory, run Hugs or GHCi.
  4. In Hugs or GHCi, at the prompt type ':l test.hs'. That's a lowercase 'L'.
  5. Source code that needs modules, say Data.Maybe, should include 'import Data.Maybe' at the top.

Note that the module 'Prelude' is automatically imported. The Prelude module covers most basic elements of the language.


2.3 Part III: Very Basic Intro to Types

Moving on again, let's talk about tools. Haskell's greatest strength lies in the power a programmer has to define useful functions easily and clearly. Here is our earlier example from C:

int foo (int bar) {
	return bar * 10 + 4;
}

In Haskell, to write this function named foo, you would write the following:

foo bar = bar * 10 + 4

That's all, except for the type:

foo :: Int -> Int

The type reads, "foo is of type Int to Int", meaning that it takes an Int and returns an Int. Together, you write:

foo :: Int -> Int
foo bar = bar * 10 + 4

Defining functions and types is the majority of the work in Haskell, and usually they require equal amounts of time. Haskell has a variety of ways to define functions. I feel that the tutorials mentioned previously do not adequately introduce these ways, so we will discuss them in detail later once we have some tools under our belt.


2.4 Part IV: Haskell's Lists and List Comprehensions

Those of you familiar with C know that pointers are the primary object in the language. For almost all imperative languages, the most useful structure is the Array, a randomly accessible sequence of values usually stored in order in memory. Haskell has arrays, but the most-used object in Haskell is a List. A list in Haskell is accessible only at the front, and is not stored in order in memory. While this may sound atrocious, Haskell has such weird abilities that it is more natural to use a list than an array, and often faster. Let us start with the C code to compute the fibonacci numbers (starting with zero):

int fib (int n) {
	int a = 0, b = 1, i, temp;
	for (i = 0; i < n; i++) {
		temp = a + b;
		a = b;
		b = temp;
	}
	return a;
}

This is fine for computing a particular value, but things get ugly when you want to create the sequence:

int * fibArray(int n) {
	int * fibs;
	fibs = (int *)malloc((sizeof int) * n);
	for (i = 0; i < n; i++) {
		fibs[i] = a;
		temp = a + b;
		a = b;
		b = temp;
	}
	return fibs;
}

When I say 'get ugly', I mean that something is included in that function which shouldn't be there: the size of the list. The fibonacci sequence is infinite, and the code above does not represent it, only a part of it. This doesn't sound so bad, unless you don't know how many values you need initially.

In Haskell, 'fib', the function to compute a single fibonacci value, can be written as follows:

fib :: Int -> Int
fib n = fibGen 0 1 n

fibGen :: Int -> Int -> Int -> Int
fibGen a b n = case n of
	0 -> a
	n -> fibGen b (a + b) (n - 1)

This is a slight improvement over the C code, but not much. Note that the type of fibGen is "Int to Int to Int to Int", meaning that it takes three Ints and returns an Int. More on that later. Also note that this uses a recursive function. Recursion is everywhere in Haskell. Most of your 'looping' functions will involve recursion instead.

The real improvement over C comes in defining the sequence:

fibs :: [Int]
fibs = 0 : 1 : [ a + b | (a, b) <- zip fibs (tail fibs)]

Don't be scared. Once you understand this function, you will understand at least half of the intracies of Haskell. Let's take it from the top. In Haskell, lists are written as follows:

[ 4, 2, 6, 7, 2 ]
This is the list of 4, then 2, then 6, etc. The ':' operator is used to compose a list by sticking a value on the front (left). For instance:
temp = 1 : [ 4, 2, 5 ]
is the list [ 1, 4, 2, 5 ].

That means that in the above code, fibs is a list of Int, and its first two values are zero and one. That's good so far. At least the first two values of 'fibs' will be right. The next part definitely looks weird. It's a tool found in Haskell called a 'list comprehension'. In Part II, I said that instead of making space and then filling it with the right values, you can define the right values. Here's that sentence, restated for list comprehensions: "You can define the values of the list rather than make space and then fill them in." List comprehensions work like so:

[ func x | x <- list, boolFunc x ]

This expression makes a new list. In the middle, there's a 'list', and it spits out values called x. These are the values of the list in order. If 'boolFunc x' is True, then x will get used in this new list. No boolFunc is in the 'fibs' example, but I include it here because it can also be extremely handy. Assuming 'boolFunc x' was true, 'func x' applies some function to the value of x, and the result is then put next in line in the final result. Here's an example of a list and its use in some list comprehensions, copied from using GHCi:

Prelude> let nums = [ 4, 2, 6, 8, 5, 11 ]
Prelude> [ x + 1 | x <- nums ]
[5,3,7,9,6,12]
Prelude> [ x * x | x <- nums, x < 7 ]
[16,4,36,25]
Prelude> [ 2 * x | x <- 9 : 1 : nums ]
[18,2,8,4,12,16,10,22]
Prelude> [ "String" | x <- nums, x < 5 ]
["String","String"]
Prelude>

Note that the order was preserved in each case. This is very important for our example. Also note that the type of the list comprehension was not necessarily the type of nums, nor did x actually have to be used in the function. Let's return to 'fibs'.


2.5 Part V: Making Sense of 'fibs', and Why Lazy Evaluation is Important

We were working on a definition for the list of Fibonacci numbers. Here is the example again:

fibs :: [Int]
fibs = 0 : 1 : [ a + b | (a, b) <- zip fibs (tail fibs)]

So what the heck is '(a, b)' and 'zip fibs (tail fibs)' and all that? Well, Haskell has a more expressive type system than most other languages. As in Python, '(a, b)' is a tuple, meaning two values stuck together. It's a convienent way to store and pass multiple values, much more so than structs. Just add parentheses and enough commas, and you pass the group of values around as you please. The only trick is that Haskell expects you to be consistent, and that means having the right type. The code adds 'a' and 'b' together to get a number in the Fibonacci sequence, so we know that'a' and 'b' are integers. Clearly, '(a, b)' is of type '(Int, Int)', which is stated as follows:

(a, b) :: (Int, Int)

We get these values labeled '(a, b)' from the list defined by 'zip fibs (tail fibs)'. Therefore 'zip fibs (tail fibs)' is of type '[(Int, Int)]', a list of 2-tuples of an Int and an Int. More clearly:

zip fibs (tail fibs) :: [(Int, Int)]

You can use GHCi and Hugs to print these types. The ':t' command, followed by a variable or function, will print its type. The following is at the GHCi prompt with the example file that includes the fibs function loaded:

*Main> :t zip fibs (tail fibs)
zip fibs (tail fibs) :: [(Int, Int)]
*Main>

So what is 'zip'? Its type and general meaning are given here:

The Prelude, section: Zipping and Unzipping Lists

'zip', as its name somewhat implies, takes two lists and 'zips' them together, returning a list of tuples, with the left member of each tuple being an item from the first (left) list, and likewise for the right.

Prelude> zip [ 1, 3, 6, 2 ] [ "duck", "duck", "duck", "goose" ]
[(1,"duck"),(3,"duck"),(6,"duck"),(2,"goose")]
Prelude>

And what about '(tail fibs)'? 'tail' is a pretty straightforward function: it chops off the first item of a list and returns the remainder. That statement can be slightly misleading. 'fibs' doesn't get altered by using 'tail' on it; as I said before, Haskell doesn't have an update operation. Instead, 'tail' just computes the proper result and returns it, rather than altering 'fibs' itself.

Prelude> tail [ 10, 20, 30, 40, 50 ]
[20,30,40,50]
Prelude>

Well, that makes it seem like 'zip fibs (tail fibs)' probably has the right type, but what is it?

fibs :: [Int]
fibs = 0 : 1 : [ a + b | (a, b) <- zip fibs (tail fibs)]

The first paramater to zip is 'fibs', which is the list defined by the expression! What the heck? Can you do that? Yes, you can. See, 'fibs' is the entire list, including the 0 and 1 at the beginning. So the first two tuples in the list created by the zip function will have a 0 and then a 1 on their left. So what is 'zip fibs (tail fibs)'? Well, the first value is definitely (0, 1). Why? Because the first item in fibs is 0, and the first item in (tail fibs) is 1, the second item in fibs. So what's the second value in zip fibs (tail fibs)? It's (1, 1). Where did the right hand 1 come from? It's the third value in fibs, which we just computed. The first value of zip fibs (tail fibs) is (0, 1), which is '(a, b)' in the list comprehension, and so the first value in that comprehension is 0 + 1, or 1. That is also the third value in fibs, etc.

Did you catch all that? The definition of fibs is evaluating itself while it is computing itself. So why didn't some sort of error happen because of undefined values? The trick to all of this is Haskell's laziness. Also, the evaluation is always one step behind the computation, so evaluation can always proceed exactly as far as needed for the next computation. Finally, this list is infinite. Of course, no computer can hold an infinite amount of data. So how much is really there? The answer is simple: until you read some values from fibs and print them, there's only the 0 and the 1, plus the function to generate more of the list. After you read some, fibs will be evaluated to that point, and no further. Since fibs is defined globally, it will remain defined in memory, making reading further values very quick. Try this with Hugs or GHCi and you see'll what I mean.

fibs !! 2
fibs !! 4
fibs !! 30
fibs !! 30
fibs !! 6
fibs !! 20
fibs !! 30
take 10 fibs

'!!' is the 'index' operator for lists in Haskell. It walks down the list and returns the nth item, zero-indexed like C/C++ and Python. 'take 10 fibs' will return the first 10 values of fibs. Be careful, fibs has infinite length. If you just type 'fibs', the output could go forever.

And why does the list only get evaluated as far as you print it? Haskell is 'lazy', meaning that it doesn't do work that it doesn't have to. C programmers know that the boolean operators '&&' and '||' are 'short-circuit', meaning that the right hand side is not evaluated unless it's needed. This allows for all kinds of neat tricks, like not dereferencing a null pointer. The entire language of Haskell has this short-circuit behavior, including the functions that you write yourself. This sounds strange, and will become even more important when we get to Haskell's tools.

This also brings us to one of the other odd things about Haskell: It is often easier to code the general definition for something than to write a function that generates a specific value. This is one of those things you have to get used to, and you will probably come back to it again. And again.

Well, give yourself a pat on the back. If you got all that, or at least you will after playing around in Hugs, then you understand about half of the common usages of Haskell. Ready for the other half? Maybe take a break, and play around a bit in Hugs or GHCi.


3 Section II: Towards Functions

3.1 Part I: The Order of Operations as a Programmer

A programming note for recursive functions and Haskell:

Concerning the fib / fibGen example here:

fib :: Int -> Int
fib n = fibGen 0 1 n

fibGen :: Int -> Int -> Int -> Int
fibGen a b n = case n of
	0 -> a
	n -> fibGen b (a + b) (n - 1)

When I was writing this example, I wrote the type of fib first, then the type and definition of fibGen, then finally the definition of fib.

For those of you who are not accustomed to writing recursive functions, Haskell programming often requires them. Often these recursive functions need subsidiary functions, which either 'frame' the main operation of recursion, or perform a simple task that keeps the main recursion function clean. In either case, the subsidiary functions can be written later, after the major recursive operation is clearly defined including end conditions.

In general, it is a good idea to concentrate on the most crucial aspect of a piece of code when programming, but Haskell's design greatly reinforces this. The definition of subsidiary functions, such as the 'setup' where fib calls fibGen with parameters '0 1 n', can wait until the function itself has been written. This is true even though the type of fib was obvious from the beginning. Likewise, Haskell makes writing trivial functions like that so quick that they can generally be ignored while thinking about the larger picture. These things are likely to change the way that you code, and probably in a good way.


3.2 Part II: Functions, But Really a Sidetrack to Types

As we move on, the other half of Haskell's general usage looms. This half is about functions.

So what is a function? As this tutorial's alignment indicates, we'll compare C/C++ to Haskell. In C, a function is a sequence of commands that have their own namespace, are called during execution and passed parameters, inherit the namespace of the scope in which they are written, and return a value to their caller. In Haskell, most of that is true, except of course functions in Haskell are not sequences of events, but expressions and definitions. There is a major difference between C and Haskell, however, and it concerns the amount of flexibility that functions have.

In C, functions take parameters and return a single value. We've already seen that Haskell has many ways to group values, like several other languages. The two most common of these are lists and tuples, and these can be the return type from a function. To sum them up, in Haskell lists are variable length and hold values of the same type, and tuples are fixed length and can hold values of different types. Here is an example of a function type that returns a tuple:

splitAt :: Int -> [a] -> ([a], [a])

'splitAt' takes an Int and a list and returns a tuple. The left value in the tuple is the first n values in the list, and the right value is the rest of the list, where n is the first parameter. This function is in the Prelude, and its description can be found here:

The Prelude, section: Sublists

We've already seen lists in a type:

fibs :: [Int]

Since the fibonacci numbers grow rapidly, but 'Int' is 32-bit, it would probably have been better to use 'Integer', Haskell's built-in infinite-precision integer storage.

fibs :: [Integer]

And this is a function type. The function takes zero parameters and returns a list of Integers. This isn't a trick of Haskell's syntax. 'fibs' really is a function that, when evaluated will return numbers comprising the fibonacci sequence. That kind of logic is what lets Haskell's compilers produce code which runs quickly and lets Haskell programmers write code efficiently.


3.3 Part III: More Types, Because Haskell Is 'Polymorphic'

It's time for a brief [not so brief] digression about types. As you've noticed, the trend seems to be to call everything a 'function'. And that's true. Take '4' for example. When you use a number '4' hardcoded into your code, it looks to you like the number 4. But what is it to Haskell? Type ':t 4' into Hugs or GHCi. What do you get? You get some weird junk:

Prelude> :t 4
4 :: (Num t) => t
Prelude>

That looks like it's a function that's taking a parameter. It's not, and the key is the '=>' arrow rather than the '->' arrow. The type given is read as follows: "four is of type 'a', where 'a' is in the class 'Num'." What's the class 'Num'? Well, it's the class that all numbers belong to. The real answer is that Haskell has something C doesn't: true polymorphism.

This is an important term and it needs some illustration. Most C++ programmers are familiar with the term 'overloading', which means that a function is defined for more than one set of parameter types. For instance, addition and multiplication in C/C++ are overloaded, allowing the following combinations to occur:

int a = 4, b = 5;
float x = 2.5, y = 7.0;
	
cout << a + b;	//9
cout << a + y;	//11
cout << y + a;	//11.0
cout << x + y;	//9.5
cout << b * a;	//20
cout << b * x;	//12.5
cout << y * b;	//35.0
cout << x * y;	//17.5

In C/C++, this is accomplished by defining all of the following overloaded definitions for '+' and '*':

operator+ (int, int);
operator+ (int, float);
operator+ (float, int);
operator+ (float, float);
operator* (int, int);
operator* (int, float);
operator* (float, int);
operator* (float, float);

The C compiler picks the appropriate type at compile time. The key distinction between polymorphism and overloading is that in C/C++, which uses overloading, each function above must be written separately. In C/C++, any function besides those above that uses either an int or a float must specify which one it expects, or must itself be overloaded. This bring us to the idea of classes.

For what types is '+' defined in C/C++? It is possible to overload the operator for new types defined by a user, but those new types will not be interchangeable with ints, floats, or other numeric types. What this means is that existing sort functions such as mergeSort and quickSort would need to be rewritten to sort values of the new type. In constrast, here is the type of mergeSort in Haskell:

mergeSort :: Ord a => [a] -> [a]

What is going on? Again, there are two parameters listed, not three. The first thing that appears to be a parameter is actually a class restriction. 'mergeSort', as you would expect, takes a list of objects of some type (type 'a'), and returns a list of objects of the same type. So why is the following type not sufficient?:

mergeSortBadType :: [a] -> [a]

The reason this is insufficient is that at some point in mergeSort the items in the list will need to be compared to each other. This will be done using a comparison operator such as '>', '<', '>=', or '<='. In Haskell, those operators are part of a class definition. The values for which '>' and so on are defined are those which are members of class 'Ord', so named because an 'order' can be determined for them. Many numeric types are of type Ord, as are characters (type 'Char') and strings (type 'String'). mergeSort needs to compare the items in its inputs, so mergeSort must clarify its type. Its parameter must be a list of objects for which '<' and so on are defined. It would also be okay to make the type more specific, but this is unnecessary and generally a bad technique.

And what about '4'? How come four is of type 'a', where 'a' is a member of class 'Num'? Can't it just be a Num? Or an Int? It can be an Int if we specifically say it is, like so:

a = (4 :: Int) + 2

Here '4' is an Int. That is how you specify the type of something inside of an expression. But without that, 4 is of type 'a', where 'a' is in class 'Num', or more simply, 4 is of type 'a' in class 'Num'. And that is important, because '+' is defined for all member types of class Num, meaning that '4' is definitely a legal parameter for this function:

doubleIt :: Num a => a -> a
doubleIt n = n + n

'-' and '*' are also defined for all member types of Num, so 4 is also allowed for this function:

fibPoly :: (Num a, Num b) => a -> b
fibPoly n = fibGenPoly 0 1 n

fibGenPoly :: (Num a, Num b) => b -> b -> a -> b
fibGenPoly a b n = case n of
	0 -> a
	n -> fibGenPoly b (a + b) (n - 1)

That is our first Haskell fib function, but with the types changed. The names have an added 'Poly' so that an error doesn't occur in the example files because of a reused name. The type of 'fibPoly' is read, "fibPoly is of type 'a' to 'b', where 'a' is a member of class Num and 'b' is a member of class Num." There is only one '=>' arrow because there is only ever one section of the type that describes class restrictions. The parentheses are required.

Why would we do this? Shouldn't we pick a single type for b rather than use a class? Here's an example. What if you worked on a group project, and two people need to calculate fibonacci numbers? And for reasons of their own, one needed an Int returned and the other needed an Integer? Or a Double? Would you write the code twice with different types? If you were using C you would. You'd have to. Using general type classes allows code reuse in a way that is impossible in other languages.

Also notice that in the initial call to 'fibGenPoly', the third parameter is 'n', the first parameter of 'fibPoly', and that the types of 'fibPoly' and 'fibGenPoly' seem to make note of this. The reason I wrote 'fibPoly' with a different return type from its parameter is that the following would be common:

fib :: Int -> Integer

We only need Int-sized storage of our counter input, but we may need Integer-sized storage of the result. Using two separate types allows this. Also, carefully check how types flow in 'fibGenPoly'. The math does not mix parameters of type 'a' and 'b', and a parameter of type 'b' is also used as the final return value. The types match not only externally but internally. Following types through code in this manner will be important for debugging.

Continuing onward, in the fib example we used 'tail'. Here is its type:

tail :: [a] -> [a]

In C, tail would have to be reimplemented for every type of list you used. That sounds like a slightly contrived problem, so what about '!!', the index operator? In most other languages, indexing a list is builtin operator, because it has to work for every kind of array. So it's not actually a function. And so on. Everything in C is either overloaded, built in, or works for only one type. There are a few exceptions, generally involving casting to or from '(void *)', but those are far outside the scope of this tutorial.

The point is, you're going to see 'Num a =>' at the beginning of type signatures, as well as 'a' and 'b' inside type signatures. Here, 'a' and 'b' are type variables, used by the compiler solely to determine proper types for compilation. Occasionally you will get messages such as 'can't determine type', or 'type mismatch'. The second means the you've done something wrong, but the first usually means that a type variable can't be pinned down to a single type for a function that you've written. This can happen for the simplest of reasons:

main = putStrLn (show 4)

Previous versions of GHC would not accept this. Here's why: 'putStrLn' takes a string and puts it on the screen. 4 has a 'polymorphic' type, i.e. it is a member of a type class, not defined as a specific type. 'show' takes anything that can be turned into a string (basically), and so it doesn't specify a type for '4' either. This leaves the compiler in a quandry, because no specific type is indicated anywhere, and it will complain. To resolve it, add the type definition like so:

main = putStrLn (show (4 :: Int))

Or Integer, or Double, or whatever. This will be handy when you try to test generalized functions, and you'll need it in a few other weird cases as well.

One last note. You can define the type of multiple functions simultaneously:

addOne, subtractOne :: Int -> Int

This can be handy.


3.4 Part IV: Functions Already

But we were talking about functions. As you may have noticed, it seems like anything can work as a parameter or return value for a function. This is absolutely true, as long as the types match. For instance, let's take a look at the extremely useful 'map' function:

map :: (a -> b) -> [a] -> [b]

By now you can probably read that, strange as it may be. "map is of type function a to b followed by a list of type a and returns a list of type b". It's taking a function as a parameter. Not only that, but a polymorphic function, with no type restrictions. And look at the other two items. The function it takes is from a to b, and then it takes a list of type a and returns a list of type b. With a name like 'map', it's pretty clear what should happen when you use it:

fooList :: [Int]
fooList = [3, 1, 5, 4]
	
bar :: Int -> Int
bar n = n - 2
*Main> map bar fooList
[1,-1,3,2]
*Main> 

Nothing to it. In the past a type for at least 'fooList' or 'bar' would have been required or Hugs and GHC would have complained that the types could not be fully determined. 'map' is in the Prelude, and its description can be found here:

The Prelude, section: List Operations

The example using 'map' shows that you can write functions which take functions as parameters. This can be fun and very useful. Now let's try something stranger:

subEachFromTen :: [Int] -> [Int]
subEachFromTen = map (10 -)

What the heck? First, for this to work there do need to be parentheses around the '-' and the '10'. Second, what does this do? We'll take this one step at a time again. '(10 -)' is a function. It takes a number and returns ten minus that number. Use ':t' in Hugs or GHCi to find out its type:

*Main> :t (10 -)
(10 -) :: (Num t) => t -> t
*Main>

Second, 'map' takes a function as its first parameter. There's a reason that Haskell uses arrows to define types, rather than a parenthesized list. 'map', applied to '(10 -)' has the following type (again, check in Hugs and GHCi):

*Main> :t map (10 -)
map (10 -) :: (Num t) => [t] -> [t]
*Main> 

It takes a list of Num members (all the same type of Num members, mind you) and returns a list of the same type (again, the same type member of Num). This is constrained to [Int] -> [Int] by subEachFromTen's type signature. Applying the 'map' function to less than its full list of arguments like this is called 'partial evaluation'. You take a function, give it some of its parameters, and you've got a function that takes the rest of the parameters. You can even name this 'in-between' state, since it is just a function like anything else. Here is 'subEachFromTen' in action:

*Main> subEachFromTen [4, 6, 7, 11]
[6,4,3,-1]
*Main>

It does what you think it should, given how I named it. Remember that applying subEachFromTen to a list, even a named list, does not change that list, but merely returns the result.

Take some time now to play around with partial evaluation, in addition to the list functions mentioned before and list comprehensions. Remember that functions grab their parameters 'eagerly', so you have to put parentheses around any parameters that are composed of a function with its own parameters.

4 Section III: Now Let's Really Write Functions

4.1 Part I: Did You Take That Break? Here Are Patterns

Hopefully you are now comfortable defining and using functions in Haskell using your choice of Hugs, GHC, or GHCi. Now it's time to talk about all the ways in which functions can be defined. If you are not ready, re-read some of the earlier material and set up one of the programs for using Haskell. The rest of this will only help you if you're trying it for yourself.

Each function has a type. Even functions with no parameters each have a type, i.e. global and local variables. It is not always necessary to write this type, as Haskell compilers can determine it. However, writing it is a good practice and sometimes a necessity. We are about to cover a lot of syntax, so after reading this section, it would be a good idea to read the Tour of the Haskell Syntax listed in the introduction.

There are a few examples we can go through to make that page clearer. First, a simple function that walks down a list and sums its members:

sumAll :: Num a => [a] -> a
sumAll (x:xs) = x + sumAll xs
sumAll [] = 0

This is a recursive function. It takes a list of 'a' in Num and returns an 'a'. However, there seem to be two definitions for sumAll. And there are. This is how pattern matching works. The two definitions have different specifications for their parameters, and each time sumAll is called, whichever pattern matches the parameters will be evaluated. Let's look at each definition. The second definition is the clearest. '[]' is the empty list, and sumAll of an empty list is defined here as zero. The middle line has something odd about it, though. '(x:xs)' is listed as a parameter, as if we were trying to stick something on the front of a list. In essence we are, because this pattern takes apart its input. There are a few patterns which do this, and this feature of Haskell makes lists very easy to use. To restate, when '(x:xs)' is written as a parameter in a function definition, it will only match lists which have an item stuck on the front. In other words, it will match lists with at least one member. The choice of variable names 'x' and 'xs' is completely arbitrary, but since 'x' will be 'bound' to the first member of the list and 'xs' will be 'bound' to the remainder of the list, it is somewhat natural to write one 'x', and then the remaining 'xs' when describing the list.

When the pattern 'sumAll (x:xs)' matches the input, that is, when 'sumAll' is called with a list that has at least one value, the first definition will be evaluated. This will return the result of adding said first value to the result of calling sumAll on the rest of the list. Patterns are checked top-to-bottom, so when 'sumAll (x:xs)' fails to match, 'sumAll []' is checked. Since the only pattern that could fail to match 'sumAll (x:xs)' is an empty list, 'sumAll []' will definitely match, and it returns zero. This is the end condition for the recursion.

This sort of function is very common in Haskell. Pattern matching lets us avoid complicated 'switch' statements or 'if' blocks in favor of simply writing separate definitions for distinct inputs. This allows code to be clear and concise. Patterns can also be more specific. The fib example can be rewritten as follows:

fibPat :: (Num a, Num b) => a -> b
fibPat n = fibGenPat 0 1 n

fibGenPat :: (Num a, Num b) => b -> b -> a -> b
fibGenPat a _ 0 = a
fibGenPat a b n = fibGenPat b (a + b) (n - 1)

Again, the names are changed to avoid conflicts in the examples, adding 'Pat'. Here a literal ('0') is used to match a pattern, and that is fine, too. Also note the underscore ('_') in the first definition. The underscore matches anything, just like a variable name, but does not get 'bound' to its parameter. This can make code clearer when a parameter is not used. In this case, we do not care about the second parameter, since we are only matching against the third and returning the first.


4.2 Part II: After Patterns, Guards

Sometimes functions need more complicated work to choose between definitions. This can be done with guards, shown here:

showTime :: Int -> Int -> String
showTime hours minutes
	| hours == 0	= "12" ++ ":" ++ showMin ++ " am"
	| hours <= 11	= (show hours) ++ ":" ++ showMin ++ " am"
	| hours == 12	= (show hours) ++ ":" ++ showMin ++ " pm"
	| otherwise 	= (show (hours - 12)) ++ ":" ++ showMin ++ " pm"
	where
	showMin
		| minutes < 10	= "0" ++ show minutes
		| otherwise		= show minutes

Ignore the lines after the 'where' for the moment. 'showTime' has only one definition clause, but it is broken up into three guards. Each guard has a boolean expression, and they are checked in order. The first which is found to evaluate to True will have its corresponding expression evaluated and returned. The function will be equal to that expression for the case that that guard is true. 'otherwise' is equal to True, and will therefore always be accepted if it is reached. 'otherwise' is a convenience, not a necessity. '++' is the list concatenation operator. It is used here for the following reason:

String :: [Char]

So all of that makes sense except for the 'where' stuff. Here I threw in something extra, and used a 'where' clause to define a local function, in this case a variable called 'showMin'. 'showMin' is a variable in the traditional sense, but it is also a function here in Haskell, so instead of an 'if' statement or 'case' statement, I used guards again to describe its two definitions.

In all, this function takes an hour (hopefully from 0 to 23) and minutes (hopefully from 0 to 59) and prints the time from them. 'showMin' is a local variable/function, defined in the where clause. Guards are used both to define 'showTime' and 'showMin'.

It is important to note that the variables defined in 'where' clauses and their cousins, 'let' clauses, are only in scope for the pattern in which they exist. So a function defined with multiple patterns can't use a 'where' clause to define a local variable across all of them.


4.3 Part III: 'If'

I mentioned 'if' statements just above, and Haskell has them, but they are always if-then-else. Haskell doesn't have much use for a function that doesn't return some kind of value, so 'if-then' doesn't work. That said, here's a simple example:

showMsg :: Int -> String
showMsg n = if n < 70 then "failing" else "passing"

Not much to it. Since 'showMsg' has a return type of String, the values in both the 'then' clause and the 'else' clause have to also be of that type. 'if' does not need to be the entire definition of a function. For instance:

showLen :: [a] -> String
showLen lst = (show (theLen)) ++ (if theLen == 1 then " item" else " items")
	where
	theLen = length lst


4.4 Part IV: Indention Syntax

You may have noticed that I use indention to indicate lines that are part of a block of source code. This is not simply good style, but it is also part of Haskell's syntax. Indentation denotes structure. Specifically, changing the indentation from one line to the next indicates that a block of code has begun or ended. Also, Haskell will not let you place the definitions for two functions one line after another. Instead it demands a blank line to indicate that the definition of a function has truly finished. This all enforces good style, and it greatly reduces the amount of junk syntax in the language, such as '{', '}', and ';'.


4.5 Part V: And Lambda Functions

Given a programmer's ability to define functions locally in Haskell, you might ask, "Is there an even more brief way of defining functions?" Unlike this tutorial, Haskell does let you finish quickly:

(\x y -> x * 4 + y) :: Num a => a -> a -> a

What is this you ask? It all starts with the '(\' right at the beginning. '\', you see, looks a little like the greek letter lambda, which happens to be the symbol used for Haskell itself:

http://www.haskell.org/

'Lambda calculus' is a branch of mathematics that deals with functions created on the fly. There's a lot more to it, such as its place in computation theory, etc. You can find out more about it other places. The point is that when you write '(\', you have started a 'lambda expression', which generally look a lot like the one above. It has '(\', then some variable names (usually short ones), a '->' arrow, and an expression which uses those variables. And of course a ')'. I'm sure you can figure out what this does. It defines a function and uses it right where it is defined. For example:

Prelude> map (\x -> "the " ++ show x) [1, 2, 3, 4, 5]
["the 1","the 2","the 3","the 4","the 5"]
Prelude>

I mention lambda functions because they come in handy, but also because they are used often in the more complicated examples. Of course, lambda functions cannot be recursive, since they would need a name in order to refer to themselves. So they are good for those cases where a function is needed only once, usually to define another function.


4.6 Part VI: Polymorphic Types and Type Constructors

The simplest Haskell program is the following:

main = return ()

The variable 'main' is a reserved word, and whatever its value is, the program executes. Here is the obligatory "Hello World" program:

main = putStrLn "Hello World"

And the type of 'main':

main :: IO ()

This is another way of saying "main is of type something or other". Well, that's how it looks anyway. What we have here are examples of two strange things at once. That happens a lot when you're learning Haskell, and that's why I'm writing such a long tutorial in the first place. '()' is a type. The only value of type '()' is written '()'. Another word for this type and its singular value is 'null'. So this type can be read "main is of type IO null". But why is it read that way? What does it mean to put one type after another? IO is not a type. The word 'IO' is only part of a type. 'IO a' is a type. 'IO' is a type constructor which takes a type as a parameter. Deep breath.

This is not the same thing as a class. When we were talking about classes, I said that functions were polymorphic, meaning that they could operate on values of any type provided that the proper functions were defined for it. You can create a type and make it a member of Num, as long as it has '+', '-', and '*' defined for it, as well as having equality defined. If you do that correctly, any function which has 'Num a =>' at the beginning of its type will accept your new type and everything will work fine. But 'IO' isn't a class, or a polymorphic function. This is something stranger. It is a polymorphic type.

Did that make sense? A type that takes another type as a parameter? Let's look at an example from the standard libraries:

data Maybe a = Nothing | Just a

This type can be found here:

http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Maybe.html

This is a 'data' type definition. The values on the right are separated by '|', the pipe symbol, which can be read here as "or". This type says, "a value of type 'Maybe a' can be 'Nothing', or can be 'Just a'", that is 'Just' followed by a value of type 'a'. Here's an example using Maybe in pattern matching:

showPet :: Maybe (String, Int, String) -> String
showPet Nothing		= "none"
showPet (Just (name, age, species))	= "a " ++ species ++ " named " ++ name ++ ", aged " ++ (show age)

'showPet' has two patterns. The first matches a value of 'Nothing', the first 'data constructor' for Maybe. There are no variables after 'Nothing', just as there are no types listed after 'Nothing' and before the '|' in the type definition for 'Maybe a'. The second matches a value of 'Just', the second 'data constructor' for Maybe. 'Just' does have a tuple after it, just like in the type definition, and parentheses are used to group these two things in the pattern. The words 'Just' and 'Nothing' are arbitrarily chosen, although good choices. It is important that the first letter is capitalized. As you may have noticed, throughout Haskell, variables have lowercase first letters and types have uppercase first letters. This is a part of Haskell's syntax. Data constructors are not variables, and so the convention is extended to require their first letters to also be capitalized.


4.7 Part VII: The IO Monad

So let's get back to 'main' and its type, 'IO ()'. IO is a polymorphic type. But what is it? Unfortunately, I cannot show you its definition. 'IO a' is a part of the Haskell standard, and most of its implementation is 'under the hood'. Its implementation is also very large, and involves a lot of low-level code. 'IO' is a member of the class Monad, which means (very briefly) that it lets you write sequential-looking code using the 'do' notation which I'll show in minute. 'IO' is also short for Input/Output of course, and that means that the 'under the hood' functions of IO let you read and write the world state. So 'IO' is how Haskell interacts with the real world. Again:

main :: IO ()

Monads are odd things, and a function of type 'IO a' will perform an 'IO' operation and return a value of type 'a'. In this case, main will return a value of type '()', namely '()', or 'null'. So main is always "an IO operation that returns null". Here is short main function, with types. Note that it is never necessary to specify the type of main.

someFunc :: Int -> Int -> [Int]
someFunc .........

main = do
	putStr "prompt 1"
	a <- getLine
	putStr "prompt 2"
	b <- getLine
	putStrLn (show (someFunc (read a) (read b)))

Here's what's going on: The IO Monad binds blah blah blah that's all in the other tutorials and yet you're reading this. Let's try again.

Here's what's going on: All of that 'variable equal at all times' stuff I mentioned in previous sections doesn't work with I/O. After all, you have to call 'getLine' several times, and it doesn't always return the same value. You can't say that the function 'getLine' is "equal" to anything, since its value may be different every time it is referenced. This is in contrast to a value, like '4', which will always be the number 4. So IO is hard to do if you want a language that has true "equality" when you write an assignment. This has actually been the biggest stumbling block for functional language design since the idea of functional languages arose. Haskell has a solution.

Most functional languages break their 'functional model' to handle IO, but of course Haskell does something weirder. There's an obscure branch of mathematics, namely monads, that concerns state transformation functions. The authors of Haskell used it to let you write mathematical functions that denote changes in the world's state without breaking that nice equality stuff. [How's that for simple? See how I made it all make sense by appealing to a higher authority? Haskell works by magic! -Eric]

Briefly, the IO monad takes the state of the world, alters it with some function, and passes it forward to be the input for another IO function. In the example, the functions 'putStr', 'getLine', and 'putStrLn' are members of IO. 'Binding' them to 'main' by using the 'do' notation means that when 'main' is evaluated, they will be evaluated in order, just like you'd expect. Thus the main function of the program above will put a prompt on a screen and read in a string, and then do it again, then print something. The '(read x)' function takes a string and returns a number of whatever type is needed, assuming the string parses. The 'show' function will take the result from 'someFunc' and turn it into a string so that putStrLn can display it. I will return to this code and make it useful later.

The description of IO may sound like 'imperative' behavior, and it is, with a twist. Input and output need to happen in a certain sequence in a program, but a mathematical function's parts can be determined in any order as long as it eventually finishes. Sequence doesn't matter for the evaluation of a mathematical definition, and most Haskell code is written in that manner. So what about input and output? Can they be mathematically described? As it turns out, they can. Monad theory means that a function can be "equal" to the act of transforming a state. The mathematical framework for monads captures the idea of sequenced actions, and that let Haskell's creators give 'IO a' a type that could be used just like any other type. When a sequence of monad functions are evaluated, for instance when 'main' is evaluated, the monad functions are applied in sequence to the initial state. Before that evaluation, Haskell gets to treat functions of type 'IO a' just like anything else, i.e. as expressions waiting to be evaluated. That means that the 'IO a' type doesn't have to be a special, builtin part of Haskell, even though input and output have to be part of the standard libraries and implemented with system calls. It does mean that doing IO in Haskell means using a polymorphic type as well as a mathematical theory that is rather obtuse. That's why IO is so far down in this tutorial.

All Haskell functions carry around this mathematical framework ("equality") and unless otherwise noted, they are lazy. This means that the only thing that forces evaluation is binding an 'IO monad' function to 'main'. Nothing is ever evaluated unless it is going to be the return value of an evaluated 'monad' function, or unless it is needed to compute such a return value. That the 'IO monad' forces evaluation isn't really important, but it will explain some odd behavior if you start doing "out there" coding in Haskell. There are eager, or 'strict' functions in Haskell which, when evaluated, will first fully evaluate their parameters. These are usually marked in the documentation.

It is possible to use the IO monad, illustrated above, to write imperative-style programs in Haskell. That is a direct result of the behavior of the IO monad. Doing so would be inappropriate, because Haskell has much more power than that. I say that because I have not seen any other tutorial do so, and I think it is important.

If you write programs that call outside libraries, you'll deal with the IO monad a lot. Everything that deals with the rest of the computer is part of the IO monad, such as driver calls, network libraries, file access, threading, and system calls. There are other monads in Haskell's libraries, and you can also write your own. Writing your own monad for a major project will probably be the other hard thing you need to do to fully understand Haskell, after understanding 'foldr' and variants. It's pretty hard, but not because it's complicated. Writing your own monad is hard because there's so little to do that you'll have to work to understand why that little bit of code is all you need. This tutorial ends with an extended example that demonstrates how to write a monad from scratch, but it may not be necessary for a general Haskell programmer to learn that.


4.8 Part VIII: Dissecting the IO Example

Let's return to basic IO monad usage. Here's the example main program again:

someFunc :: Int -> Int -> [Int]
someFunc .........

main = do
	putStr "prompt 1"
	a <- getLine
	putStr "prompt 2"
	b <- getLine
	putStrLn (show (someFunc (read a) (read b)))

This is a general framework for learning Haskell. Aside from altering the number of parameters to 'someFunc' (perhaps zero), this is all you will really need for a main function until you feel comfortable with Haskell. It is good enough for most simple tasks, and you can use it to test out ideas in GHC by replacing the definition of 'someFunc' with whatever function you're trying to write. You won't need it for working in Hugs or GHCi, but you will if you compile with GHC. In Hugs and GHCi, you only need to make a source code file that includes someFunc's definition.

What I said earlier about the indentation technique removing extraneous syntax isn't quite true. '{', '}', and ';' do exist in Haskell. They are an alternative to the the whitespace-defining notation used here, and are sometimes but rarely preferable. The whitespace method is very simple, and the example shows most of its syntax. Blocks begin on tabbed or spaced lines and further indention is used for separate blocks. Using 'do' signals to the compiler to expect indentation based on whitespace unless you then use '{', '}', and ';'.

The '(read x)' items use the 'read' function found here:

The Prelude: the 'read' function

'someFunc' is whatever you want to test. Since its return value is a parameter to 'show', 'someFunc' can be defined with a variety of return types, such as 'Int', '[Int]', 'String', or even '(String, [Int])'. The type of 'show' is given here:

The Prelude: the 'show' function

'show' is a class method defined for members of the class 'Show'. This is just like how '+' and '*' are class methods defined for members of 'Num'. You can see those here:

The Prelude, section: the Num class

The types of the IO functions, specifically 'putStr', 'getLine', and 'putStrLn', are given here:

System.IO, section: Special cases for standard input and output

and also here, in the Prelude, which is why they are in scope normally:

The Prelude, section: Simple I/O operations

As you can see from the documentation, when putStrLn is applied to a String, you get a value of type 'IO ()'. The 'null' means that no useful information is returned by the function. It altered the state of the IO monad by putting something on the screen, and no value comes back from it.

The '<-' arrow is used to bind the result of an IO operation to a variable. 'getLine' has a type of 'IO String'. The 'do' notation says that a monad function like 'getLine' can be prefaced by a variable name and the left arrow. That variable name comes into scope at that point in the function and when the monad function is evaluated, and its return value will be assigned to the variable. If a function is not prefaced by a left arrow and a variable, then its return value is ignored, although whatever that function did to the state carried by the monad still happens. For instance, if you do not bind the return value of 'getLine' to a variable, you don't store the line the user typed in, but it is still read in and buffers are messed with, etc., meaning that another getLine won't return it.

There is one exception to this 'ignoring' of return values not bound to variables. It is no accident that the last line in the sequence has the same return type as main. When using monads to compose sequences of operations, the last line in a function must have the same IO type as the function itself. When this is not natural, use 'return' with an appropriate value:

getName1 :: IO String
getName1 = do
	putStr "Please enter your name: "
	name <- getLine
	putStrLn "Thank you.  Please wait."
	return name

'putStrLn' has a type String -> IO (), but IO () doesn't match the type of 'getName', which is IO String. 'return name' is used to end the function with a function of the proper type, and to return the data we wanted to return. Without the reply message, this function can be written much more succintly:

getName2 :: IO String
getName2 = do
	putStr "Please enter your name: "
	getLine

As you can see, calling an IO function which returns a value is the same as binding that value to a variable and then returning that value.

This whole monad issue looks imperative, and in some ways, it is. Once you call 'someFunc', you get away from all that, and Haskell's laziness and equality become the norm. Is all of this necessary? Is it a good idea to have imperative-looking code calling lazy, functional code? In my opinion, it can be. You get to specify the order of execution for those operations that require it (such as IO), and you get powerful functional tools the rest of the time. You also get a hurt head. So take a break. I need one from writing all of this. The next section will be on advanced type declarations.

5 Section IV: Haskell and You

6 Section V: Final Commentary

7 Section VI: Extended Examples