[Haskell-cafe] Testing for statistical properties

Gregory Crosswhite gcross at phys.washington.edu
Fri Jan 8 14:47:15 EST 2010


Thanks!  I had reached the same conclusion, so I am glad to see that you already wrote code to do this for me.  :-)  There is a bug in the version that you posted, though:  you missed one of the terms in the u < 0.755 case, so the c2 constant goes completely unused.  Here is my modified version of your function + bug fix, with some stylistic tweaks:

computeKolmogorovProbability :: Double -> Double
computeKolmogorovProbability z
   | u < 0.2
      = 1
   | u < 0.755
      = 1 - w * (exp(c1/v)+exp(c2/v)+exp(c3/v))/u
   | u < 6.8116
      = 2 * sum [ sign * exp(coef*v)
                | (sign,coef) <- take (1 `max` round (3/u)) coefs
                ]
   | otherwise
      = 0
  where
    u = abs z
    v = u*u
    w = 2.50662827
    c1 = -pi**2/8
    c2 = 9*c1
    c3 = 25*c1
    coefs = [(1,-2),(-1,-8),(1,-18),(-1,-32)]

Cheers,
Greg

On Jan 8, 2010, at 2:59 AM, Tom Nielsen wrote:

> Hi Greg,
> 
> Assuming this is a one-dimensional distribtution, you should use a
> kolmogorov-smirnov test to test this:
> 
> http://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test
> 
> I've implemented to the KS distribution from the CERN code linked in
> the wikipedia article, here:
> 
> http://github.com/glutamate/samfun/blob/master/Math/Probably/KS.hs
> 
> (warning, i wasn't able to verify the numbers coming out against
> anything so just check that it makes sense)
> 
> So all you have to do is to find the maximal distance between your
> samples and the cumulative density function, multiply by the sqrt. of
> of the number of samples, and calculate kprob on that.
> 
> I don't think you can do this in a Bayesian way because you can't
> enumerate all the other distributions your samples could come from?
> 
> Tom
> 
> On Thu, Jan 7, 2010 at 9:31 PM, Gregory Crosswhite
> <gcross at phys.washington.edu> wrote:
>> Hey everyone!  I have some computations that satisfy statistical properties which I would like to test --- that is, the result of the computation is non-deterministic, but I want to check that it is sampling the distribution that it should be sampling.  Is anyone aware of a Haskell library out there that does anything like this?
>> 
>> Cheers,
>> Greg
>> 
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>> 



More information about the Haskell-Cafe mailing list