genBits: small addition to random API + largely new implementation that needs code review
Tyson Whitehead
twhitehead at gmail.com
Tue Feb 7 17:27:23 CET 2012
On June 28, 2011 18:01:52 Thomas DuBuisson wrote:
> Ryan Newton <rrnewton at gmail.com> wrote:
> > I'd also be happy to hear other proposals for API changes and additions.
>
> As I say below, I don't understand why we can't use ByteString. There
> are (or should be) rather fast decodings of all the popular primitive
> types from bytestrings, so this takes care of our polymorphism issue.
>
> I accept that, unlike the CryptoRandomGen, we don't want an explicit
> failure for RandomGen. But how about a common interface for
> instantiating generators ('newGen')? Is there a reason not to have that?
>
> --- CODE ---
> class RandomGen g where
> next :: g -> (Int, g)
> nextBits :: g -> BitLength -> (ByteString, g)
> genBits :: g -> Int
> newGen :: ByteString -> g
> --- END CODE ---
I would think the interface that fits an arbitrary generators best would be
max :: w
next :: g -> (w, g)
where w is the internal word type depedent on g (e.g., Word8, Word32, etc.),
and next returns a uniform draw on [0,max] in g.
You could then wrap these with routines that split and/or accumulate
randomness to fill requests for uniform values on a specified range. A bit
generator is then just a request for one on [0,1].
An alternative would be to push the routines that split and/or accumulate
randomness to fill requests for uniform value on a specified range into the
generator itself. This way you wouldn't have to expose the internal type and
could avoid the depedent type. The interface would then be
next :: Integral w => w -> (w, g)
where the desired range is specified by the first arguement.
The reason I bring this up is in having looked at a few random number
generators, it seems many do not actually operate on a nice 2^n range (zero
and requirements for primes keep getting in the way of this).
As frequently we don't actually want a 2^n uniform random number anyway, you
windup with a double accept/reject procedure. Once to get in the 2^n range
(which has to reject draws that fall in the tail), and once again to get you
in the real range (further rejecting draws that fall in its tail).
Why not just go from whatever the generators natural range is to the desired
range in a single step? Presumably this could be coded so the compiler
eliminates the overhead for the case where these two are the same?
Cheers! -Tyson
More information about the Cvs-ghc
mailing list