GHC optimisations
From HaskellWiki
(Initial version; still working on this...) |
(More content. Still not finished...) |
||
| Line 58: | Line 58: | ||
(How does GHC determine 'small'? Isn't there a switch that adjusts this?) | (How does GHC determine 'small'? Isn't there a switch that adjusts this?) | ||
| + | |||
| + | == Execution Model == | ||
| + | |||
| + | In order to understand how to write efficient code, and what GHC does with your code to optimise it, it helps to know a bit about what your compiled code looks like and how it works. | ||
| + | |||
| + | === Graph reduction === | ||
| + | |||
| + | To a first approximation, at any moment your program is a 'graph' of objects in memory. ('Graph' in the graph theory sense --- nodes connected by arcs.) Some of the objects are 'data' --- booleans, integers, strings, lists, etc. Some of those objects are functions (because Haskell lets you pass functions around like data). And some of these are ''thunks'' --- unevaluated expressions (because Haskell only evaluates expressions 'as needed'). | ||
| + | |||
| + | The program starts off with a single node representing the unevaluated call to <hask>main</hask>, and proceeds to execute from there. Each time a thunk is executed, the result (whatever it is) overwrites the thunk data. (It's possible that the result of evaluating a thunk is a new thunk of course.) | ||
| + | |||
| + | === About STG === | ||
| + | |||
| + | GHC compiles to the ''spineless tagness G-machine'' (STG). (Finish me!) | ||
Revision as of 10:49, 11 August 2007
Contents |
1 Introduction
This page collects together information about the optimisations that GHC does and does not perform.
- GHC experts: Please check that the info in this page is correct.
- Everybody else: Feel free to add questions!
2 General optimisations
2.1 Common subexpression elimination
First of all, common subexpression elemination (CSE) means that if an expression appears in several places, the code is rearranged so that the value of that expression is computed only once. For example:
foo x = (bar x) * (bar x)
might be transformed into
foo x = let x' = bar x in x' * x'
GHC doesn't actually perform CSE as often as you might expect. The trouble is, performing CSE can affect the strictness/lazyness of the program. So GHC does do CSE, but only in specific circumstances --- see the GHC manual. (Section??)
Long story short: "If you care about CSE, do it by hand."
2.2 Inlining
Inlining is where a function call is replaced by that function's definition. For example, the standardmap :: (a -> b) -> [a] -> [b] map f [] = [] map f (x:xs) = f x : map f xs
Now if you write something like
foo = map bar
foo [] = [] foo (x:xs) = bar x : foo xs
So, that's what inlining is. By default, GHC will inline things if they are 'small enough'. Every time you inline a function, you are in a sense making a (customised) copy of that function. Do too much of this and the compiled program will be enormous. So it's only worth it for 'small' functions.
(How does GHC determine 'small'? Isn't there a switch that adjusts this?)
3 Execution Model
In order to understand how to write efficient code, and what GHC does with your code to optimise it, it helps to know a bit about what your compiled code looks like and how it works.
3.1 Graph reduction
To a first approximation, at any moment your program is a 'graph' of objects in memory. ('Graph' in the graph theory sense --- nodes connected by arcs.) Some of the objects are 'data' --- booleans, integers, strings, lists, etc. Some of those objects are functions (because Haskell lets you pass functions around like data). And some of these are thunks --- unevaluated expressions (because Haskell only evaluates expressions 'as needed').
The program starts off with a single node representing the unevaluated call to3.2 About STG
GHC compiles to the spineless tagness G-machine (STG). (Finish me!)
