Personal tools

Memory leak

From HaskellWiki

(Difference between revisions)
Jump to: navigation, search
(detection of a memory leak)
m (Grammar)
(8 intermediate revisions by 2 users not shown)
Line 3: Line 3:
 
A garbage collector can reliably prevent [http://en.wikipedia.org/Dangling_pointers dangling pointers],
 
A garbage collector can reliably prevent [http://en.wikipedia.org/Dangling_pointers dangling pointers],
 
but it is easily possible to produce memory leaks, especially in connection with [[lazy evaluation]].
 
but it is easily possible to produce memory leaks, especially in connection with [[lazy evaluation]].
  +
Note that a leak will not only consume more and more memory but it will also slow down the [[garbage collector]] considerably!
  +
Maybe it is even the reason for the widely spread opinion
  +
that garbage collectors are slow or not suited for [[realtime]] applications.
  +
  +
== Types of leaks ==
  +
  +
=== Holding a reference for a too long time ===
   
 
Consider for example:
 
Consider for example:
Line 9: Line 16:
 
in sum xs * product xs
 
in sum xs * product xs
 
</haskell>
 
</haskell>
Since most Haskell compilers expect, that the programmer used <hask>let</hask> in order to share <hask>xs</hask> between the call of <hask>sum</hask> and the call of <hask>product</hask>,
+
Since most Haskell compilers expect that the programmer used <hask>let</hask> in order to share <hask>xs</hask> between the call of <hask>sum</hask> and the call of <hask>product</hask>,
the list <hask>xs</hask> is completely materialized and hold in memory.
+
the list <hask>xs</hask> is completely materialized and held in memory.
However, the list <hask>xs</hask> is very cheap to compute, and thus it would reduce memory usage considerably,
+
However, the list <hask>xs</hask> is very cheap to compute, and thus memory usage can be reduced considerably by computing <hask>xs</hask> in each call.
if <hask>xs</hask> is recomputed for both calls.
+
Since we want to avoid code duplication,
+
To achieve this while avoiding code duplication, we can turn the list definition into a function with a dummy argument.
we like to achieve this by turning the list definition into a function with a dummy argument.
 
 
<haskell>
 
<haskell>
let makeXs _ = [1..1000000::Integer]
+
let makeXs n = [1..n::Integer]
in sum (makeXs False) * product (makeXs True)
+
in sum (makeXs 1000000) * product (makeXs 1000000)
 
</haskell>
 
</haskell>
   
A memory leak will not only consume more and more memory but it will also slow down the [[garbage collector]] considerably!
+
=== Building up unevaluated expressions ===
  +
  +
Another typical cause of memory leaks are unevaluated expressions,
  +
the classical example being to sum up the numbers of a list (known as <hask>sum</hask> function).
  +
<haskell>
  +
foldl (+) 0 [1..1000000::Integer]
  +
</haskell>
  +
The problem is, that the runtime system does not know, whether the intermediate sums are actually needed at a later point,
  +
and thus it leaves them unevaluated.
  +
I.e. it stores something equivalent to <hask>1+2+3+4</hask> instead of just <hask>10</hask>.
  +
You may be lucky that the [[strictness analyzer]] already removes the laziness at compile time,
  +
but in general you cannot rely on it.
  +
The safe way is to use [[seq]] to force evaluation of intermediate sums.
  +
This is done by <hask>foldl'</hask>.
  +
<haskell>
  +
foldl' (+) 0 [1..1000000::Integer]
  +
</haskell>
   
 
== Detection of memory leaks ==
 
== Detection of memory leaks ==
Line 26: Line 33:
 
and then run the compiled program with restricted heap size.
 
and then run the compiled program with restricted heap size.
 
E.g. you can restrict the heap size to 4 MB like in this example:
 
E.g. you can restrict the heap size to 4 MB like in this example:
  +
 
<code>
 
<code>
 
$ ./mytest +RTS -M4m -RTS
 
$ ./mytest +RTS -M4m -RTS
 
</code>
 
</code>
   
  +
== A note on GHCi ==
  +
  +
If you are noticing a space leak while running your code within GHCi, please note that interpreted code behaves differently from compiled code: even when using `seq`.
  +
  +
Consider starting ghci as follows:
  +
  +
<code>
  +
$ ghci -fobject-code
  +
</code>
  +
  +
For this, see [http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/ghci-obj.html Compiling to object code inside GHCi].
  +
  +
== See also ==
  +
  +
* Blog post [http://blog.ezyang.com/2011/05/anatomy-of-a-thunk-leak/ "Anatomy of a thunk leak"]
  +
* Blog post [http://blog.ezyang.com/2011/05/space-leak-zoo/ "Space leak zoo"]
  +
* Haskell Cafe on a space leak caused by the garbage collector that [http://www.haskell.org/pipermail/haskell-cafe/2010-June/079444.html did not recognize a selector-like function call]
  +
* Haskell libraries on [http://www.haskell.org/pipermail/libraries/2010-September/014420.html Make lines stricter to fix a space leak]
   
 
[[Category:Glossary]]
 
[[Category:Glossary]]

Revision as of 07:50, 27 February 2013

A memory leak means that a program allocates more memory than necessary for its execution. Although Haskell implementations use garbage collectors, programmers must still keep memory management in mind. A garbage collector can reliably prevent dangling pointers, but it is easily possible to produce memory leaks, especially in connection with lazy evaluation. Note that a leak will not only consume more and more memory but it will also slow down the garbage collector considerably! Maybe it is even the reason for the widely spread opinion that garbage collectors are slow or not suited for realtime applications.

Contents

1 Types of leaks

1.1 Holding a reference for a too long time

Consider for example:

let xs = [1..1000000::Integer]
in  sum xs * product xs
Since most Haskell compilers expect that the programmer used
let
in order to share
xs
between the call of
sum
and the call of
product
, the list
xs
is completely materialized and held in memory. However, the list
xs
is very cheap to compute, and thus memory usage can be reduced considerably by computing
xs
in each call.

To achieve this while avoiding code duplication, we can turn the list definition into a function with a dummy argument.

let makeXs n = [1..n::Integer]
in  sum (makeXs 1000000) * product (makeXs 1000000)

1.2 Building up unevaluated expressions

Another typical cause of memory leaks are unevaluated expressions,

the classical example being to sum up the numbers of a list (known as
sum
function).
foldl (+) 0 [1..1000000::Integer]

The problem is, that the runtime system does not know, whether the intermediate sums are actually needed at a later point, and thus it leaves them unevaluated.

I.e. it stores something equivalent to
1+2+3+4
instead of just
10
.

You may be lucky that the strictness analyzer already removes the laziness at compile time, but in general you cannot rely on it. The safe way is to use seq to force evaluation of intermediate sums.

This is done by
foldl'
.
foldl' (+) 0 [1..1000000::Integer]

2 Detection of memory leaks

A memory leak can be detected by writing a test that should require only a limitted amount of memory and then run the compiled program with restricted heap size. E.g. you can restrict the heap size to 4 MB like in this example:

$ ./mytest +RTS -M4m -RTS

3 A note on GHCi

If you are noticing a space leak while running your code within GHCi, please note that interpreted code behaves differently from compiled code: even when using `seq`.

Consider starting ghci as follows:

$ ghci -fobject-code

For this, see Compiling to object code inside GHCi.

4 See also