# Automatic Differentiation


## Revision as of 21:38, 12 May 2011

Automatic Differentiation enables you to compute both the value of a function and its derivative(s) at the same time.

When using Forward Mode this roughly means that a numerical value is equipped with its derivative with respect to one of your inputs, which is updated accordingly on every function application. Let the number $x_0$ be equipped with the derivative $x_1$: $\langle x_0,x_1 \rangle$. For example, the sine function is defined as:

* $\sin\langle x_0,x_1 \rangle = \langle \sin x_0, x_1\cdot\cos x_0\rangle$
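This rule can be sketched in Haskell with a pair type carrying the value and its derivative. The names `Dual` and `diff` below are illustrative, not from any particular library; the `Num`/`Floating` instances apply the usual product and chain rules on every operation, exactly as described above.

```haskell
-- A minimal forward-mode sketch: pair each value x0 with its
-- derivative x1 and update both on every function application.
data Dual = Dual Double Double  -- <value, derivative>
  deriving (Show, Eq)

instance Num Dual where
  Dual x x' + Dual y y' = Dual (x + y) (x' + y')
  Dual x x' - Dual y y' = Dual (x - y) (x' - y')
  Dual x x' * Dual y y' = Dual (x * y) (x' * y + x * y')  -- product rule
  negate (Dual x x')    = Dual (negate x) (negate x')
  abs    (Dual x x')    = Dual (abs x) (x' * signum x)
  signum (Dual x _)     = Dual (signum x) 0
  fromInteger n         = Dual (fromInteger n) 0

instance Fractional Dual where
  recip (Dual x x') = Dual (recip x) (negate x' / (x * x))
  fromRational r    = Dual (fromRational r) 0

instance Floating Dual where
  pi              = Dual pi 0
  exp (Dual x x') = Dual (exp x) (x' * exp x)
  log (Dual x x') = Dual (log x) (x' / x)
  sin (Dual x x') = Dual (sin x) (x' * cos x)  -- the rule from the text
  cos (Dual x x') = Dual (cos x) (negate (x' * sin x))
  -- remaining Floating methods omitted in this sketch

-- derivative of f at x: seed the input's derivative with 1
diff :: (Dual -> Dual) -> Double -> Double
diff f x = let Dual _ x' = f (Dual x 1) in x'
```

For instance, `diff (\x -> sin (x * x)) 2` yields $2 \cdot 2 \cdot \cos 4$ by the chain rule, with no symbolic manipulation and no finite-difference approximation.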

Replacing this single derivative with a lazy list of all higher derivatives lets you compute an entire derivative tower at the same time.
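The lazy tower can be sketched as follows (again with illustrative names, not a library API): each value carries the whole infinite tower of its higher derivatives, and the chain rule is applied lazily, so only the derivatives you actually demand are ever computed.

```haskell
-- Each value carries the infinite tower of its higher derivatives.
data Tower = Double :> Tower  -- value :> tower of the derivative
infixr 5 :>

constT :: Double -> Tower
constT c = c :> constT 0

varT :: Double -> Tower  -- the variable itself: value x, derivative 1
varT x = x :> constT 1

instance Num Tower where
  (x :> x') + (y :> y')     = (x + y) :> (x' + y')
  u@(x :> x') * v@(y :> y') = (x * y) :> (x' * v + u * y')  -- lazy product rule
  negate (x :> x') = negate x :> negate x'
  abs _            = error "abs: not needed in this sketch"
  signum _         = error "signum: not needed in this sketch"
  fromInteger      = constT . fromInteger

instance Fractional Tower where
  recip u@(x :> x') = recip x :> negate x' * recip (u * u)
  fromRational      = constT . fromRational

instance Floating Tower where
  pi = constT pi
  exp u@(x :> x') = exp x :> x' * exp u  -- lazy chain rule
  log u@(x :> x') = log x :> x' / u
  sin u@(x :> x') = sin x :> x' * cos u
  cos u@(x :> x') = cos x :> negate (x' * sin u)
  -- remaining Floating methods omitted in this sketch

-- the first n entries of the tower: f x, f' x, f'' x, ...
takeDerivs :: Int -> Tower -> [Double]
takeDerivs 0 _         = []
takeDerivs n (x :> x') = x : takeDerivs (n - 1) x'
```

For example, `takeDerivs 4 (sin (varT 0))` produces the familiar cycle of derivatives of sine at 0: `[0, 1, 0, -1]`.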

However, it becomes more difficult for vector functions, when computing the derivatives in reverse mode, when computing derivative towers, and when trying to minimize the number of computations needed to obtain all of the k-th partial derivatives of an n-ary function.

Forward mode is suitable when you have fewer inputs than outputs, because it requires one application of the function per input.

Reverse mode is suitable when you have fewer outputs than inputs, because it requires one application of the function per output.
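The cost asymmetry for forward mode can be made concrete with a sketch (illustrative code, not a library API): computing the gradient of an n-ary function with forward mode needs n passes, one pass with the derivative seeded to 1 on each input in turn.

```haskell
-- Minimal forward-mode value paired with one directional derivative.
data D = D Double Double  -- <value, derivative>

instance Num D where
  D x x' + D y y' = D (x + y) (x' + y')
  D x x' - D y y' = D (x - y) (x' - y')
  D x x' * D y y' = D (x * y) (x' * y + x * y')
  negate (D x x') = D (negate x) (negate x')
  abs    (D x x') = D (abs x) (x' * signum x)
  signum (D x _)  = D (signum x) 0
  fromInteger n   = D (fromInteger n) 0

-- gradient via forward mode: one pass per input, n passes in total
gradForward :: ([D] -> D) -> [Double] -> [Double]
gradForward f xs =
  [ t
  | i <- [0 .. length xs - 1]
  , let seeded = [ D x (if j == i then 1 else 0) | (j, x) <- zip [0 ..] xs ]
  , let D _ t = f seeded
  ]
```

For example, `gradForward (\[x, y] -> x * y + x * x) [3, 4]` evaluates the function twice and returns the gradient `[10, 3]`. Reverse mode avoids the repeated passes for many-input, few-output functions, at the cost of recording the computation so it can be traversed backwards.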

Implementations:

* [ad](http://hackage.haskell.org/cgi-bin/hackage-scripts/package/ad) (forward, reverse and other modes)
* [fad](http://hackage.haskell.org/cgi-bin/hackage-scripts/package/fad) (forward mode)
* [rad](http://hackage.haskell.org/cgi-bin/hackage-scripts/package/rad) (reverse mode)
* [[Vector-space]] (forward mode)
* http://comonad.com/haskell/monoids/dist/doc/html/monoids/Data-Ring-Module-AutomaticDifferentiation.html (forward mode)

## Power Series

If you can compute all of the derivatives of a function at a point, you can compute its Taylor series there.
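As a small sketch of the connection (illustrative helper names): the k-th Taylor coefficient is the k-th derivative divided by $k!$, so a list of derivatives at a point, such as one produced by a derivative tower, turns directly into a truncated Taylor polynomial.

```haskell
-- derivatives [f a, f' a, f'' a, ...] to Taylor coefficients c_k = f^(k) a / k!
taylorCoeffs :: [Double] -> [Double]
taylorCoeffs ds = zipWith (/) ds factorials
  where factorials = scanl (*) 1 [1 ..]  -- 0!, 1!, 2!, ...

-- evaluate the truncated Taylor polynomial  sum_k c_k * h^k
evalTaylor :: [Double] -> Double -> Double
evalTaylor cs h = sum (zipWith (*) cs (iterate (* h) 1))
```

For example, the derivatives of sine at 0 cycle through `[0, 1, 0, -1]`, so `evalTaylor (taylorCoeffs (take 12 (cycle [0, 1, 0, -1]))) 0.5` closely approximates `sin 0.5`.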