ANN: hstats 0.1

apfelmus apfelmus at quantentunnel.de
Wed Sep 19 04:29:58 EDT 2007


Marshall Beddoe wrote:
> http://hackage.haskell.org/cgi-bin/hackage-scripts/package/hstats-0.1
> 
> I've just released hstats, a statistical computing module for the Haskell
> language.  Current functionality includes: mean, median, mode, range,
> standard/average deviation, variance, iqr, kurtosis, skew, covariance, and
> correlation. I have plans on adding more rank correlation functions,
> histograms & chi square tests.

Nice, I often prefer Haskell over gnuplot for quick & dirty data analysis :)

An additional feature would be (linear) regression. Oh, and "automatic 
error bounds", i.e. representing a value together with its standard 
deviation, much like automatic differentiation:

   data Value a = V { value :: a, stdderiv :: a }

and arithmetic operations like

   x + y = V (value x + value y) (sqrt $ stdderiv x ^2 + stdderiv y ^2)

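The drawback is that the usual laws for +, - and * are broken here, so 
care has to be taken when choosing formulas. For instance, x + x treats 
its two operands as independent and yields sqrt 2 * stdderiv x, not the 
2 * stdderiv x one would expect for 2 * x.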
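To make the idea concrete, here is a minimal sketch of such a type with 
a Num instance. It is not part of hstats; the rules for - and * are the 
usual first-order error propagation formulas for uncorrelated operands, 
which is an assumption on my part, as is the choice to give exact 
constants an error of 0. The names Value, V, value and stdderiv are the 
ones used above.

```haskell
-- A value paired with its standard deviation (field names as above).
data Value a = V { value :: a, stdderiv :: a }
  deriving (Show, Eq)

-- First-order error propagation, ASSUMING the operands are uncorrelated.
-- That assumption is exactly what breaks the usual algebraic laws.
instance Floating a => Num (Value a) where
  V x sx + V y sy = V (x + y) (sqrt (sx ^ 2 + sy ^ 2))
  V x sx - V y sy = V (x - y) (sqrt (sx ^ 2 + sy ^ 2))
  V x sx * V y sy = V (x * y) (sqrt ((y * sx) ^ 2 + (x * sy) ^ 2))
  negate (V x sx) = V (negate x) sx
  abs (V x sx)    = V (abs x) sx
  signum (V x _)  = V (signum x) 0
  fromInteger n   = V (fromInteger n) 0  -- exact constants carry no error
```

With x = V 10 0.3 :: Value Double, the sum x + x has a standard 
deviation of about 0.424 (sqrt 0.18), while 2 * x has about 0.6, so 
the two expressions no longer agree.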

Also, are the formulas for  mean  etc. numerically stable? I don't 
remember very well, but I think I've read that

   -- needs BangPatterns and foldl' from Data.List
   mean = snd .
     foldl' (\(!n,!s) x -> (n+1, s + (x-s) / fromIntegral (n+1))) (0,0)

is better than the standard  sum xs / length xs  formula. Or at least 
that the latter is not good.
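For comparison, here are both variants written out as a small sketch. 
The names runningMean and naiveMean are made up for illustration and 
are not taken from hstats, and I have fixed the element type to Double 
to keep the signatures simple. Note that runningMean returns 0 for the 
empty list, whereas naiveMean returns NaN.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- Running mean: fold the update  m' = m + (x - m) / n  over the list,
-- so no large intermediate sum is ever built up.
runningMean :: [Double] -> Double
runningMean = snd . foldl' step (0 :: Int, 0)
  where
    step (!n, !m) x = (n + 1, m + (x - m) / fromIntegral (n + 1))

-- Textbook formula: total sum divided by the number of elements.
naiveMean :: [Double] -> Double
naiveMean xs = sum xs / fromIntegral (length xs)
```

One case where the difference is easy to see is overflow: for 
replicate 4 1e308 the intermediate sum in naiveMean exceeds the range of 
Double and the result is Infinity, while runningMean returns 1e308. 
This only illustrates the overflow issue; it says nothing about 
rounding error in general.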

Regards,
apfelmus


