Performance/Floating point

From HaskellWiki


Latest revision as of 08:34, 15 June 2007


1 Don't use Float

Floats (probably 32 bits) are almost always a bad idea, unless you Really Know What You Are Doing. Use Doubles. There's rarely a speed disadvantage: modern machines will use the same floating-point unit for both. With Doubles, you are much less likely to hang yourself with numerical errors.

One time when Float might be a good idea is if you have a lot of them, say a giant array of Floats. An unboxed array of Float (see Performance/Arrays) takes up half the space in the heap compared to an unboxed array of Double. However, boxed Floats will only take up less space than boxed Doubles if you are on a 32-bit machine (on a 64-bit machine, a Float still takes up 64 bits).
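As a rough illustration of both points, the sketch below compares element sizes and shows one rounding difference. It assumes that the Storable size reported by sizeOf matches the per-element cost in an unboxed array, which is an assumption about the array implementation rather than something stated above. The names floatBytes, doubleBytes, floatAbsorbsOne and doubleKeepsOne are made up for this example.

```haskell
import Foreign.Storable (sizeOf)

-- Bytes used by one raw (unboxed) value of each type, as reported by Storable.
floatBytes, doubleBytes :: Int
floatBytes  = sizeOf (0 :: Float)
doubleBytes = sizeOf (0 :: Double)

-- 2^24 + 1 cannot be represented in a Float's 24-bit significand,
-- so adding 1 to 2^24 leaves the value unchanged.
floatAbsorbsOne :: Bool
floatAbsorbsOne = (16777216 :: Float) + 1 == 16777216

-- A Double has a 53-bit significand, so the same addition is exact.
doubleKeepsOne :: Bool
doubleKeepsOne = (16777216 :: Double) + 1 == 16777217
```

Loading this in GHCi and evaluating the four names shows the size ratio and the point at which Float starts dropping small increments.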

The speed claims may not hold, because Doubles are not necessarily aligned as the machine prefers. Benchmarks on various platforms are needed to settle this.
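A minimal timing sketch along these lines is given below. It is not a rigorous benchmark: it uses CPU time from System.CPUTime, runs each loop once, and the workload (a partial sum of 1/k^2) is an arbitrary choice made for this example. The names partialSum, timed and compareTypes are hypothetical. Compile with optimisation (for instance -O2) before drawing any conclusions.

```haskell
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Partial sum of 1/k^2 for k = 1..n, forcing the accumulator at each step.
partialSum :: Fractional a => Int -> a
partialSum n = go 1 0
  where
    go k acc
      | k > n     = acc
      | otherwise =
          let acc' = acc + 1 / (fromIntegral k * fromIntegral k)
          in acc' `seq` go (k + 1) acc'

-- Run an action and return its result with the CPU time used, in seconds.
timed :: IO a -> IO (a, Double)
timed action = do
  start  <- getCPUTime                      -- reported in picoseconds
  result <- action
  end    <- getCPUTime
  pure (result, fromIntegral (end - start) / 1e12)

-- Time the same loop at both types and print the results.
compareTypes :: Int -> IO ()
compareTypes n = do
  (f, tf) <- timed (evaluate (partialSum n :: Float))
  (d, td) <- timed (evaluate (partialSum n :: Double))
  putStrLn ("Float:  " ++ show f ++ " in " ++ show tf ++ " s")
  putStrLn ("Double: " ++ show d ++ " in " ++ show td ++ " s")
```

Calling compareTypes 10000000 from GHCi or from a main function gives a first impression. A proper benchmarking library would give more trustworthy numbers.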

2 GHC-specific advice

On x86 (and on other platforms with GHC prior to version 6.4.2), use the -fexcess-precision flag to improve the performance of floating-point-intensive code (speedups of up to 2x have been seen). This keeps more intermediate values in registers instead of memory, at the expense of occasional differences in results due to unpredictable rounding. See the GHC documentation (http://www.haskell.org/ghc/docs/latest/html/users_guide/options-optimise.html#options-f) for more details. Switching on GCC's -ffast-math and -O3 can also help (pass them through GHC as -optc-ffast-math and -optc-O3).

Where available, the -optc-march=pentium4 -optc-mfpmath=sse flags may also help.
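The flags above can be given on the command line or, for -fexcess-precision, per module. The sketch below shows the per-module form, with the command-line form in a comment. The file name Kernel.hs and the function dot are hypothetical. The -optc options are passed to the C compiler, so they presumably only matter when GHC compiles via C, as it did when this advice was written.

```haskell
{-# OPTIONS_GHC -fexcess-precision #-}
-- Per-module form of the flag. A command-line equivalent, including the
-- C-compiler options mentioned in the text, might look like:
--   ghc -O2 -fexcess-precision -optc-O3 -optc-ffast-math Kernel.hs
-- (Kernel.hs is a hypothetical file name.)

import Data.List (foldl')

-- Hypothetical floating-point kernel: dot product with a strict left fold.
dot :: [Double] -> [Double] -> Double
dot xs ys = foldl' (+) 0 (zipWith (*) xs ys)
```

Keeping the pragma in the modules that hold the numeric kernels limits any rounding differences to code where the speed matters.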

Note that the -fexcess-precision flag may make programs behave oddly: for example, after taking an if x < 0 branch you may find that x is no longer less than zero, because it has been written out to memory in the meantime and has lost some precision.
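The effect cannot be reproduced reliably in a small program, since it depends on the platform and on where the compiler spills values. The sketch below only contrasts a branch body that tolerates x having rounded to zero with one that does not. Both functions are hypothetical examples, not taken from any library.

```haskell
-- Tolerant: the branch body gives a sensible answer even if x has
-- rounded to zero since the test was made.
signedSqrt :: Double -> Double
signedSqrt x
  | x < 0     = negate (sqrt (abs x))
  | otherwise = sqrt x

-- Fragile: the branch body assumes x is still strictly negative.
-- If x has rounded to zero, log receives 0 and the result is -Infinity.
logMagnitudeIfNegative :: Double -> Maybe Double
logMagnitudeIfNegative x
  | x < 0     = Just (log (negate x))
  | otherwise = Nothing
```

Under ordinary evaluation both functions behave as expected. The difference only matters if the value seen by the guard and the value seen by the branch body can disagree.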