Haskell computations produce a lot of memory garbage, much more than conventional imperative languages. This is because data are immutable, so the only way to store the result of a computation is to create new data. In particular, every iteration of a recursive computation creates new data. But GHC manages garbage collection efficiently, so it is not uncommon to produce 1 GB of data per second (most of which is garbage collected immediately). So you may be interested to learn how GHC does such a good job.
1 Motivating examples
Throughout the article, we will use two motivating examples. The first one just computes a value:
factorial 0 acc = acc
factorial n acc = factorial (n-1) $! (acc*n)
Note that we use an accumulator with strict evaluation (the $! operator) to suppress the default laziness of Haskell computations: this code really computes a new n and a new acc on every recursion step.
Our second example produces a large list:
upto i n | i<=n      = i : upto (i+1) n
         | otherwise = []
So we have two examples: one that produces just one value but leaves a lot of garbage, and another that produces a lot of live data.
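To try both examples, they can be wrapped in a small program. This is only a sketch: the type signatures, the `main` function, and the input sizes are assumptions added for illustration and are not part of the original examples. The list is bound to a name and traversed twice so that it really stays live between the two traversals.

```haskell
module Main (main) where

-- First example: a strict accumulator. Every step creates a new n and a
-- new acc; the previous ones become garbage right away.
factorial :: Integer -> Integer -> Integer
factorial 0 acc = acc
factorial n acc = factorial (n - 1) $! (acc * n)

-- Second example: builds the list of integers from i to n.
upto :: Int -> Int -> [Int]
upto i n
  | i <= n    = i : upto (i + 1) n
  | otherwise = []

main :: IO ()
main = do
  -- Mostly garbage: only the final result is needed.
  print (factorial 20 1)
  -- Mostly live data: xs is used twice, so its cells must be kept
  -- alive until the second traversal has finished.
  let xs = upto 1 1000000
  print (length xs)
  print (sum xs)
```

Assuming the file is saved as Example.hs (a hypothetical name), compiling with `ghc -rtsopts Example.hs` and running `./Example +RTS -s` makes the runtime print a summary of bytes allocated, bytes copied during GC, and the number of collections. The exact figures depend on the GHC version and optimization level.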
2 Garbage collection
Haskell's computation model is very different from that of conventional languages with mutable data. Data immutability forces us to produce a lot of temporary data, but it also helps to collect this garbage rapidly. The trick is that immutable data NEVER point to younger values. Indeed, a younger value does not yet exist at the time an older value is created, so the older value cannot point to it at creation. And since values are never modified, the older value cannot be made to point to it later either. This is the key property of immutable data.
This greatly simplifies garbage collection (GC): at any time we can scan the most recently created values and free those that are not pointed to from within the same set (of course, the real roots of the hierarchy of live values live on the stack). This is how things work: by default, GHC uses a generational GC. New data are allocated in a 512 KB "nursery". Once it is exhausted, a "minor GC" occurs: it scans the nursery and frees unused values. Or, to be exact, it copies live values to the main memory area. The fewer values survive, the less work there is to do. If you have, for example, a recursive algorithm that quickly fills the whole nursery with generations of its induction variables, only the last generation of the variables will survive and be copied to main memory; the rest will not even be touched! So it has counter-intuitive behavior: the larger the percentage of your values that are garbage, the faster it works.
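One way to observe this behavior is to compare how many bytes a program allocated with how many bytes the collector actually had to copy. The sketch below uses the `GHC.Stats` module from `base`. The choice of `factorial 3000` and the output format are assumptions made for illustration, and the numbers printed will vary with the GHC version and RTS settings.

```haskell
module Main (main) where

import Control.Exception (evaluate)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)
import System.Mem (performGC)

-- The garbage-heavy example from this article.
factorial :: Integer -> Integer -> Integer
factorial 0 acc = acc
factorial n acc = factorial (n - 1) $! (acc * n)

main :: IO ()
main = do
  -- Statistics are only collected when the program runs with +RTS -T
  -- (or another statistics option such as -s).
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS statistics are disabled; run with +RTS -T"
    else do
      -- Force the computation so that the allocation really happens.
      _ <- evaluate (factorial 3000 1)
      -- The counters are refreshed at GC time, so trigger a collection
      -- before reading them.
      performGC
      stats <- getRTSStats
      putStrLn ("collections:       " ++ show (gcs stats))
      putStrLn ("major collections: " ++ show (major_gcs stats))
      putStrLn ("bytes allocated:   " ++ show (allocated_bytes stats))
      putStrLn ("bytes copied:      " ++ show (copied_bytes stats))
```

If the program is built with `-rtsopts` and run with `+RTS -T`, the "bytes copied" figure should normally come out far smaller than "bytes allocated" for a computation like this one, since almost every intermediate value is already dead when a collection happens.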