I just released version 0.3.3 of the Haskell statistics library, which contains a very fast pseudo-random number generator.
The generator is an implementation of George Marsaglia’s MWC256 multiply-with-carry PRNG, which has a period of 2^8222 (for this reason, it’s sometimes referred to as MWC8222). It produces high-quality uniformly distributed pseudo-random numbers extremely quickly.
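To give a feel for why a multiply-with-carry generator is so cheap per variate, here is a minimal pure sketch of a single step. It is not the code from the statistics library: the `MWC` type and the `step` and `outputs` functions are names made up for illustration, the state is a plain list rather than an unboxed array, and the multiplier is assumed to be the constant from Marsaglia's posted MWC256 code (the library may well use a different one).

```haskell
import Data.Bits (shiftR)
import Data.Word (Word32, Word64)

-- | Hypothetical generator state: the current carry plus a queue of lagged
-- 32-bit words. MWC256 keeps 256 words; this sketch accepts any non-empty
-- queue so that it is easy to experiment with.
data MWC = MWC { carry :: Word32, lagged :: [Word32] }

-- | Multiplier. Assumption: the value from Marsaglia's posted MWC256 code;
-- the statistics library may use a different constant.
multiplier :: Word64
multiplier = 809430660

-- | One step: t = a * x + c, where x is the oldest word in the queue.
-- The low 32 bits of t are the output (and go to the back of the queue);
-- the high 32 bits become the next carry.
step :: MWC -> (Word32, MWC)
step (MWC c (x:xs)) = (out, MWC c' (xs ++ [out]))
  where
    t   = multiplier * fromIntegral x + fromIntegral c :: Word64
    out = fromIntegral t               -- truncates to the low 32 bits
    c'  = fromIntegral (t `shiftR` 32) -- keeps the high 32 bits
step (MWC _ []) = error "step: empty state"

-- | Lazily produce a stream of outputs from a starting state.
outputs :: MWC -> [Word32]
outputs g = let (w, g') = step g in w : outputs g'
```

Each variate costs one 64-bit multiply, one add, and one shift. A fast implementation would keep the lagged words in a mutable unboxed array and advance an index instead of appending to a list.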
Here is a brief performance comparison between the statistics library, the mersenne-random library, the normal Haskell PRNG, and a C implementation of MWC256. The numbers represent millions of variates generated and summed per second, so higher is better. (Measured on a 64-bit Core2 Duo laptop running Fedora 11.)
|C (inlined, unrolled)||186.5||171.7||571.2|
As the numbers indicate, the MWC256 implementation in the statistics library is up to 3 times faster than the mersenne-random library, and often faster than the C code you’d expect to encounter in practice. And you shouldn’t use the standard Haskell PRNG if you care about performance.
(The last two rows in the table above indicate what you could expect from an aggressive C compiler that performs cross-module inlining and loop unrolling, something that almost works reliably in gcc these days. If you were using normal compilation flags and source file structure, you would be unlikely to see those kinds of numbers in practice.)
A nice feature of the PRNG in the statistics library is that it has a cleaner interface than mersenne-random. Due to the cheesy implementation of the underlying C code, the mersenne-random library enforces the use of a single PRNG for the entire program, which makes it onerous to work with. You have to create a single generator when your program starts, and pass it everywhere that you might eventually need random numbers. In a multi-threaded program, only one thread at a time can use the PRNG.
The PRNG in the statistics library runs inside the ST monad, so you can generate random variates from inside pure Haskell code. You can create or use a generator anywhere without worrying about threading issues, and there’s a handy function provided that lets you seed a generator from your system’s random number source.
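The ST-monad point is easiest to see in code. The following is only a sketch of the pattern, not the library's API: `Gen`, `newGen`, `next`, and `sample` are hypothetical names, and the generator inside is a stand-in linear congruential step rather than MWC256. What it shows is that the mutable state never escapes `runST`, so the caller sees an ordinary pure function.

```haskell
import Control.Monad (replicateM)
import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)
import Data.Word (Word32)

-- | Hypothetical generator handle: a mutable cell holding the state.
newtype Gen s = Gen (STRef s Word32)

-- | Create a generator from a seed.
newGen :: Word32 -> ST s (Gen s)
newGen seed = Gen <$> newSTRef seed

-- | Draw the next value. Stand-in algorithm: a 32-bit linear congruential
-- step, NOT the MWC256 generator used by the statistics library.
next :: Gen s -> ST s Word32
next (Gen ref) = do
  x <- readSTRef ref
  let x' = 1664525 * x + 1013904223  -- wraps modulo 2^32
  writeSTRef ref x'
  return x'

-- | A pure function: all mutation is sealed inside runST.
sample :: Word32 -> Int -> [Word32]
sample seed n = runST $ do
  g <- newGen seed
  replicateM n (next g)
```

Because each call to `sample` builds its own generator, two threads can evaluate it at the same time without sharing state, and the same seed always yields the same list.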
As a final tweak, the new PRNG generates floating point numbers in the range (0,1], so the variates it generates are safe to use in statistical computations that are zero-phobic, such as those involving logarithms or division. I’ve also included a generator for normally-distributed floating point numbers that uses Doornik’s modified ziggurat algorithm, so it’s both fast and trustworthy.
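To see why excluding zero matters, here is one simple way to map a 32-bit word into (0,1] and feed it to a logarithm. This is an illustrative sketch under my own assumptions, not necessarily the conversion the library performs; `toUnit` and `exponential` are hypothetical names.

```haskell
import Data.Word (Word32)

-- | Map a 32-bit word to a Double in (0,1].
-- Adding 1 before dividing by 2^32 sends 0 to 2^-32 and maxBound to 1.
toUnit :: Word32 -> Double
toUnit w = (fromIntegral w + 1) / 4294967296

-- | Exponential variate by inversion. The logarithm never sees zero,
-- so the result is always finite.
exponential :: Double -> Word32 -> Double
exponential rate w = negate (log (toUnit w)) / rate
```

With a generator that returned values in [0,1) instead, the same `exponential` would occasionally evaluate `log 0` and hand back infinity.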